path: root/gcc
Age | Commit message | Author | Files | Lines
2023-11-02 | ifcvt/vect: Emit COND_OP for conditional scalar reduction. | Robin Dapp | 9 | -55/+539
As described in PR111401 we currently emit a COND and a PLUS expression for conditional reductions. This makes it difficult to combine both into a masked reduction statement later. This patch improves that by directly emitting a COND_ADD/COND_OP during ifcvt and adjusting some vectorizer code to handle it. It also makes neutral_op_for_reduction return -0 if HONOR_SIGNED_ZEROS is true. gcc/ChangeLog: PR middle-end/111401 * internal-fn.cc (internal_fn_else_index): New function. * internal-fn.h (internal_fn_else_index): Define. * tree-if-conv.cc (convert_scalar_cond_reduction): Emit COND_OP if supported. (predicate_scalar_phi): Add whitespace. * tree-vect-loop.cc (fold_left_reduction_fn): Add IFN_COND_OP. (neutral_op_for_reduction): Return -0 for PLUS. (check_reduction_path): Don't count else operand in COND_OP. (vect_is_simple_reduction): Ditto. (vect_create_epilog_for_reduction): Fix whitespace. (vectorize_fold_left_reduction): Add COND_OP handling. (vectorizable_reduction): Don't count else operand in COND_OP. (vect_transform_reduction): Add COND_OP handling. * tree-vectorizer.h (neutral_op_for_reduction): Add default parameter. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c: New test. * gcc.target/riscv/rvv/autovec/cond/pr111401.c: New test. * gcc.target/riscv/rvv/autovec/reduc/reduc_call-2.c: Adjust. * gcc.target/riscv/rvv/autovec/reduc/reduc_call-4.c: Ditto.
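As a rough illustration of the loops this commit is about, the following sketch shows a conditional scalar reduction and the same reduction written as an unconditional add of "value or neutral element". It also makes visible why -0.0, not +0.0, is the neutral element for addition when signed zeros are honored. The function names are hypothetical and are not taken from the patch or its tests.

```c
#include <math.h>
#include <stddef.h>

/* Conditional scalar reduction: only elements whose predicate is
   nonzero contribute to the sum.  */
double cond_sum(const double *a, const int *pred, size_t n, double init)
{
    double sum = init;
    for (size_t i = 0; i < n; i++)
        if (pred[i])
            sum += a[i];
    return sum;
}

/* The same reduction with inactive elements replaced by a neutral
   value, so that the addition itself is unconditional.  The result
   only matches cond_sum if 'neutral' really leaves 'sum' unchanged.  */
double cond_sum_neutral(const double *a, const int *pred, size_t n,
                        double init, double neutral)
{
    double sum = init;
    for (size_t i = 0; i < n; i++)
        sum += pred[i] ? a[i] : neutral;
    return sum;
}
```

With `init` equal to -0.0 and no active elements, `cond_sum` returns -0.0. Using +0.0 as the neutral value turns that into +0.0 under the default rounding mode, because (-0.0) + (+0.0) is +0.0, whereas (-0.0) + (-0.0) stays -0.0.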
2023-11-02 | tree-optimization/112320 - bogus debug IL after SCCP | Richard Biener | 7 | -39/+41
The following addresses wrong debug IL created by SCCP rewriting stmts to defined overflow. I addressed another inefficiency there but needed to adjust the API of rewrite_to_defined_overflow for this which is now taking a stmt iterator for in-place operation and a stmt for sequence producing because gsi_for_stmt doesn't work for stmts not in the IL. PR tree-optimization/112320 * gimple-fold.h (rewrite_to_defined_overflow): New overload for in-place operation. * gimple-fold.cc (rewrite_to_defined_overflow): Add stmt iterator argument to worker, define separate API for in-place and not in-place operation. * tree-if-conv.cc (predicate_statements): Simplify. * tree-scalar-evolution.cc (final_value_replacement_loop): Likewise. * tree-ssa-ifcombine.cc (pass_tree_ifcombine::execute): Adjust. * tree-ssa-reassoc.cc (update_range_test): Likewise. * gcc.dg/pr112320.c: New testcase.
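For readers unfamiliar with the idea behind rewriting a statement "to defined overflow", here is a minimal source-level sketch: signed arithmetic that may overflow is carried out in the corresponding unsigned type, where wraparound is well defined. This is an analogy in plain C, not the GIMPLE transformation itself, and `add_wrapping` is a hypothetical name.

```c
#include <limits.h>

/* Signed overflow is undefined behaviour in C.  Doing the addition in
   unsigned int makes the wraparound well defined.  */
int add_wrapping(int a, int b)
{
    unsigned int ua = (unsigned int)a;
    unsigned int ub = (unsigned int)b;
    /* Converting an out-of-range value back to int is
       implementation-defined; GCC documents it as reduction
       modulo 2^N, which is assumed here.  */
    return (int)(ua + ub);
}
```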
2023-11-02 | i386: Move stack protector patterns above mov $0 -> xor peephole | Uros Bizjak | 1 | -136/+135
Move stack protector patterns above mov $0,%reg -> xor %reg,%reg so the latter won't interfere with stack protector peephole2s. gcc/ChangeLog: * config/i386/i386.md: Move stack protector patterns above mov $0,%reg -> xor %reg,%reg peephole2 pattern.
2023-11-02 | Make GCN target effective-target 'vect_gather_load_ifn' | Thomas Schwinge | 1 | -0/+1
This fixes: PASS: gcc.dg/vect/vect-gather-2.c (test for excess errors) -FAIL: gcc.dg/vect/vect-gather-2.c scan-tree-dump vect "different gather base" -FAIL: gcc.dg/vect/vect-gather-2.c scan-tree-dump vect "different gather scale" PASS: gcc.dg/vect/vect-gather-2.c scan-tree-dump-not vect "Loop contains only SLP stmts" ..., and enables other test cases. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_gather_load_ifn): True for GCN target.
2023-11-02 | Support cmul{_conj}v4hf3/cmla{_conj}v4hf4 with AVX512FP16 instruction. | liuhongt | 2 | -0/+126
gcc/ChangeLog: * config/i386/mmx.md (cmlav4hf4): New expander. (cmla_conjv4hf4): Ditto. (cmulv4hf3): Ditto. (cmul_conjv4hf3): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/part-vect-complexhf.c: New test.
2023-11-02 | c++: Implement C++26 P2361R6 - Unevaluated strings [PR110342] | Jakub Jelinek | 4 | -11/+258
The following patch implements C++26 unevaluated-string. As it seems to me just extra pedanticity, it is implemented only for -std=c++26 or -std=gnu++26 and later and only if -pedantic/-pedantic-errors. Nothing is done for inline asm, while the spec changes those, it changes it to a balanced token sequence with implementation defined rules on what is and isn't allowed (so pedantically accepting asm ("" : "+m" (x)); was accepts-invalid before C++26, but we didn't diagnose anything). For the other spots mentioned in the paper, static_assert message, linkage specification, deprecated/nodiscard attributes it enforces the requirements (no prefixes, udlit suffixes, no octal/hexadecimal escapes (conditional escape sequences were rejected with pedantic already before). For the deprecated operator "" identifier case I've kept things as is, because everything seems to have been diagnosed already (a lot being implied from the string having to be empty). 2023-11-02 Jakub Jelinek <jakub@redhat.com> PR c++/110342 gcc/cp/ * parser.cc: Implement C++26 P2361R6 - Unevaluated strings. (uneval_string_attr): New enumerator. (cp_parser_string_literal_common): Add UNEVAL argument. If true, pass CPP_UNEVAL_STRING rather than CPP_STRING to cpp_interpret_string_notranslate. (cp_parser_string_literal, cp_parser_userdef_string_literal): Adjust callers of cp_parser_string_literal_common. (cp_parser_unevaluated_string_literal): New function. (cp_parser_parenthesized_expression_list): Handle uneval_string_attr. (cp_parser_linkage_specification): Use cp_parser_unevaluated_string_literal for C++26. (cp_parser_static_assert): Likewise. (cp_parser_std_attribute): Use uneval_string_attr for standard deprecated and nodiscard attributes. gcc/testsuite/ * g++.dg/cpp26/unevalstr1.C: New test. * g++.dg/cpp26/unevalstr2.C: New test. * g++.dg/cpp0x/udlit-error1.C (lol): Expect an error for C++26 about user-defined literal in deprecated attribute. 
libcpp/ * include/cpplib.h (TTYPE_TABLE): Add CPP_UNEVAL_STRING literal entry. Use C++11 instead of C++-0x in comments. * charset.cc (convert_escape): Add UNEVAL argument, if true, pedantically diagnose numeric escape sequences. (cpp_interpret_string_1): Formatting fix. Adjust convert_escape caller. (cpp_interpret_string): Formatting string. (cpp_interpret_string_notranslate): Pass type through to cpp_interpret_string if it is CPP_UNEVAL_STRING.
2023-11-02 | RISC-V: Fix redundant attributes | Juzhe-Zhong | 1 | -2/+2
Noticed that there are some redundant 'vimov' codes in the attributes. Committed as it is obvious. gcc/ChangeLog: * config/riscv/vector.md: Fix redundant codes in attributes.
2023-11-02 | RISC-V: Support vcreate intrinsics for non-tuple types | xuli | 6 | -131/+357
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/288 gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Expand non-tuple intrinsics. * config/riscv/riscv-vector-builtins-functions.def (vcreate): Define non-tuple intrinsics. * config/riscv/riscv-vector-builtins-shapes.cc (struct vcreate_def): Ditto. * config/riscv/riscv-vector-builtins.cc: Add arg types. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/tuple_create.c: Rename to vcreate.c. * gcc.target/riscv/rvv/base/vcreate.c: New test.
2023-11-02 | VECT: Refine the type size restriction of call vectorizer | Pan Li | 1 | -13/+9
Update in v4: * Append the check to vectorizable_internal_function. Update in v3: * Add a function to check whether the type size is legal for a vectorized call. Update in v2: * Fix one ICE of type assertion. * Adjust some test cases for aarch64 sve and riscv vector. Original log: vectorizable_call has one restriction on the size of the data type: DF to DI is allowed but SF to DI isn't. You may see the message below when trying to vectorize a function call like lrintf. void test_lrintf (long *out, float *in, unsigned count) { for (unsigned i = 0; i < count; i++) out[i] = __builtin_lrintf (in[i]); } lrintf.c:5:26: missed: couldn't vectorize loop lrintf.c:5:26: missed: not vectorized: unsupported data-type As a result, a standard name pattern like lrintmn2 cannot work for different data type sizes such as SF => DI. This patch refines the data type size check and unblocks standard names like lrintmn2 under certain conditions. The type size of vectype_out needs to be exactly the same as the type size of vectype_in when the vectype_out size does not participate in the optab selection. There is no such restriction when vectype_out is part of the optab query. The tests below passed for this patch. * The risc-v regression tests. * Ensure the lrintf standard name in risc-v. The tests below are ongoing. * The x86 bootstrap and regression test. * The aarch64 regression test. gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_internal_function): Add type size check for when vectype_out doesn't participate in the optab query. (vectorizable_call): Remove the type size check. Signed-off-by: Pan Li <pan2.li@intel.com>
2023-11-02 | RISC-V: Allow dest operand and accumulator operand overlap of widen reduction instruction [PR112327] | Juzhe-Zhong | 3 | -2/+56
Consider the following intrinsic code: void rvv_dot_prod(int16_t *pSrcA, int16_t *pSrcB, uint32_t n, int64_t *result) { size_t vl; vint16m4_t vSrcA, vSrcB; vint64m1_t vSum = __riscv_vmv_s_x_i64m1(0, 1); while (n > 0) { vl = __riscv_vsetvl_e16m4(n); vSrcA = __riscv_vle16_v_i16m4(pSrcA, vl); vSrcB = __riscv_vle16_v_i16m4(pSrcB, vl); vSum = __riscv_vwredsum_vs_i32m8_i64m1(__riscv_vwmul_vv_i32m8(vSrcA, vSrcB, vl), vSum, vl); pSrcA += vl; pSrcB += vl; n -= vl; } *result = __riscv_vmv_x_s_i64m1_i64(vSum); } https://godbolt.org/z/vWd35W7G6 Before this patch: ... Loop: ... vmv1r.v v2,v1 ... vwredsum.vs v1,v8,v2 ... After this patch: ... Loop: ... vwredsum.vs v1,v8,v1 ... PR target/112327 gcc/ChangeLog: * config/riscv/vector.md: Add '0'. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr112327-1.c: New test. * gcc.target/riscv/rvv/base/pr112327-2.c: New test.
2023-11-02 | Daily bump. | GCC Administrator | 6 | -1/+164
2023-11-01 | PR target/110551: Tweak mulx register allocation using peephole2. | Roger Sayle | 2 | -2/+38
This patch is a follow-up to my previous PR target/110551 patch, this time to address the additional move after mulx, seen on TARGET_BMI2 architectures (such as -march=haswell). The complication here is that the flexible multiple-set mulx instruction is introduced into RTL after reload, by split2, and therefore can't benefit from register preferencing. This results in RTL like the following: (insn 32 31 17 2 (parallel [ (set (reg:DI 4 si [orig:101 r ] [101]) (mult:DI (reg:DI 1 dx [109]) (reg:DI 5 di [109]))) (set (reg:DI 5 di [ r+8 ]) (umul_highpart:DI (reg:DI 1 dx [109]) (reg:DI 5 di [109]))) ]) "pr110551-2.c":8:17 -1 (nil)) (insn 17 32 9 2 (set (reg:DI 0 ax [107]) (reg:DI 5 di [ r+8 ])) "pr110551-2.c":9:40 90 {*movdi_internal} (expr_list:REG_DEAD (reg:DI 5 di [ r+8 ]) (nil))) Here insn 32, the mulx instruction, places its results in si and di, and then immediately after decides to move di to ax, with di now dead. This can be trivially cleaned up by a peephole2. I've added an additional constraint that the two SET_DESTs can't be the same register to avoid confusing the middle-end, but this has well-defined behaviour on x86_64/BMI2, encoding a umul_highpart. For the new test case, compiled on x86_64 with -O2 -march=haswell: Before: mulx64: movabsq $-7046029254386353131, %rdx mulx %rdi, %rsi, %rdi movq %rdi, %rax xorq %rsi, %rax ret After: mulx64: movabsq $-7046029254386353131, %rdx mulx %rdi, %rsi, %rax xorq %rsi, %rax ret 2023-11-01 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR target/110551 * config/i386/i386.md (*bmi2_umul<mode><dwi>3_1): Tidy condition as operands[2] with predicate register_operand must be !MEM_P. (peephole2): Optimize a mulx followed by a register-to-register move, to place result in the correct destination if possible. gcc/testsuite/ChangeLog PR target/110551 * gcc.target/i386/pr110551-2.c: New test case.
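The log does not show the source of the new test case, so the following is only a guess at the kind of function that yields the assembly quoted above: a 64x64->128-bit multiply by a constant whose low and high halves are XORed together. The constant 0x9E3779B97F4A7C15 is the unsigned form of the -7046029254386353131 immediate in the listing; the use of `unsigned __int128` (a GCC extension on 64-bit targets) is an assumption.

```c
#include <stdint.h>

/* Multiply by a 64-bit constant, then fold the 128-bit product by
   XORing its low and high 64-bit halves.  With BMI2, the two halves
   can come from a single mulx instruction.  */
uint64_t mulx64(uint64_t x)
{
    unsigned __int128 r = (unsigned __int128)x * 0x9E3779B97F4A7C15ull;
    return (uint64_t)r ^ (uint64_t)(r >> 64);
}
```

Compiling a function of this shape with -O2 -march=haswell is one way to check whether the extra register-to-register move after mulx is still emitted.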
2023-11-01 | RISC-V: Use riscv_subword_address for atomic_test_and_set | Patrick O'Neill | 1 | -24/+17
Other subword atomic patterns use riscv_subword_address to calculate the aligned address, shift amount, mask and !mask. atomic_test_and_set was implemented before the common function was added. After this patch all subword atomic patterns use riscv_subword_address. gcc/ChangeLog: * config/riscv/sync.md: Use riscv_subword_address function to calculate the address and shift in atomic_test_and_set. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
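The four values mentioned above (aligned address, shift amount, mask and inverted mask) can be sketched in plain C as follows. This is an illustration of the general subword technique on a little-endian machine with 32-bit atomic words, not the RTL that riscv_subword_address generates; the struct and function names are hypothetical.

```c
#include <stdint.h>

struct subword_info {
    uintptr_t aligned_addr; /* address of the containing 32-bit word */
    unsigned shift;         /* bit offset of the subword in that word */
    uint32_t mask;          /* bits occupied by the subword */
    uint32_t not_mask;      /* all other bits */
};

/* size_bytes is 1 or 2; little-endian byte order is assumed.  */
struct subword_info subword_address(uintptr_t addr, unsigned size_bytes)
{
    struct subword_info info;
    uint32_t low_bits = (size_bytes == 1) ? 0xffu : 0xffffu;

    info.aligned_addr = addr & ~(uintptr_t)3;
    info.shift = (unsigned)(addr & 3) * 8;
    info.mask = low_bits << info.shift;
    info.not_mask = ~info.mask;
    return info;
}
```

A subword atomic operation then works on the aligned word, for example inside an LR/SC loop, and uses `mask` and `not_mask` to update only the bits belonging to the byte or halfword.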
2023-11-01 | RISC-V: Enable ztso tests on rv32 | Patrick O'Neill | 29 | -29/+68
This patch transitions the ztso testcases to use the testsuite infrastructure, enabling the tests on both rv64 and rv32 targets. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo-table-ztso-amo-add-1.c: Add Ztso extension to dg-options for dg-do compile. * gcc.target/riscv/amo-table-ztso-amo-add-2.c: Ditto. * gcc.target/riscv/amo-table-ztso-amo-add-3.c: Ditto. * gcc.target/riscv/amo-table-ztso-amo-add-4.c: Ditto. * gcc.target/riscv/amo-table-ztso-amo-add-5.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: Ditto. * gcc.target/riscv/amo-table-ztso-fence-1.c: Ditto. * gcc.target/riscv/amo-table-ztso-fence-2.c: Ditto. * gcc.target/riscv/amo-table-ztso-fence-3.c: Ditto. * gcc.target/riscv/amo-table-ztso-fence-4.c: Ditto. * gcc.target/riscv/amo-table-ztso-fence-5.c: Ditto. * gcc.target/riscv/amo-table-ztso-load-1.c: Ditto. * gcc.target/riscv/amo-table-ztso-load-2.c: Ditto. * gcc.target/riscv/amo-table-ztso-load-3.c: Ditto. * gcc.target/riscv/amo-table-ztso-store-1.c: Ditto. * gcc.target/riscv/amo-table-ztso-store-2.c: Ditto. * gcc.target/riscv/amo-table-ztso-store-3.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: Ditto. * lib/target-supports.exp: Add testing infrastructure to require the Ztso extension or add it to an existing -march. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-11-01 | RISC-V: fix TARGET_PROMOTE_FUNCTION_MODE hook for libcalls | Vineet Gupta | 1 | -2/+3
Fixes: 3496ca4e6566 ("RISC-V: Add runtime invariant support") riscv_promote_function_mode doesn't promote a SI to DI for libcalls case. It intends to do that however the code is broken (regression). The fix is what generic promote_mode () in explow.cc does. I really don't understand why the old code didn't work, but stepping thru the debugger shows old code didn't and fixed does. This showed up when testing Ajit's REE ABI extension series which probes the ABI (using a NULL tree type) and ends up hitting the libcall code path. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_promote_function_mode): Fix mode returned for libcall case. Tested-by: Patrick O'Neill <patrick@rivosinc.com> # pre-commit-CI #526 Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
2023-11-01 | c: Add Walloc-size to warn about insufficient size in allocations [PR71219] | Martin Uecker | 5 | -0/+94
Add option Walloc-size that warns about allocations that have insufficient storage for the target type of the pointer the storage is assigned to. Added to Wextra. PR c/71219 gcc: * doc/invoke.texi: Document -Walloc-size option. gcc/c-family: * c.opt (Walloc-size): New option. gcc/c: * c-typeck.cc (convert_for_assignment): Add warning. gcc/testsuite: * gcc.dg/Walloc-size-1.c: New test. * gcc.dg/Walloc-size-2.c: New test.
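As a hedged illustration of what the description above implies, the first function below requests fewer bytes than the pointed-to type needs and would be expected to draw a -Walloc-size warning, while the second sizes the allocation from the target type. The exact diagnostic text and the set of allocation functions covered are not stated in this log; the struct and function names are made up for the example.

```c
#include <stdlib.h>

struct point { int x, y, z; };

/* Requests sizeof(int) bytes but stores the result in a
   'struct point *': the storage is too small for the target type.  */
struct point *alloc_point_too_small(void)
{
    struct point *p = malloc(sizeof(int));
    return p;
}

/* Sizing from the pointed-to object avoids the mismatch.  */
struct point *alloc_point(void)
{
    struct point *p = malloc(sizeof *p);
    return p;
}
```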
2023-11-01 | Make genautomata.cc output reflect insn-attr.h expectation | Edwin Lu | 1 | -1/+1
genautomata was writing the insn_has_dfa_reservation_p function inside of the CPU_UNITS_QUERY conditional when it shouldn't have. Move insn_has_dfa_reservation_p outside of conditional group. gcc/ChangeLog: * genautomata.cc (write_automata): move endif Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
2023-11-01 | omp: Reorder call for TARGET_SIMD_CLONE_ADJUST | Andre Vieira | 1 | -96/+145
This patch moves the call to TARGET_SIMD_CLONE_ADJUST until after the arguments and return types have been transformed into vector types. It also constructs the adjustments and retval modifications after this call, allowing targets to alter the types of the arguments and return of the clone prior to the modifications to the function definition. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_return_type): Hoist out code to create return array and don't return new type. (simd_clone_adjust_argument_types): Hoist out code that creates ipa_param_body_adjustments and don't return them. (simd_clone_adjust): Call TARGET_SIMD_CLONE_ADJUST after return and argument types have been vectorized, create adjustments and return array after the hook. (expand_simd_clones): Call TARGET_SIMD_CLONE_ADJUST after return and argument types have been vectorized.
2023-11-01 | i386: Fix stack protector peephole2 operand predicate [PR112332] | Uros Bizjak | 1 | -1/+1
PR target/112332 gcc/ChangeLog: * config/i386/i386.md (stack_protect_set_2 peephole2): Use general_gr_operand as operand 4 predicate.
2023-11-01 | i386: Improve stack protector patterns and peephole2s | Uros Bizjak | 1 | -65/+54
Improve stack protector patterns and peephole2s to substitute stack protector scratch register clear with unrelated subsequent register initialization in several ways: a. Explicitly generate scratch register as named pseudo. This allows optimizers to eventually reuse the zero value in the register. b. Allow scratch register in different mode (SWI48) than PTR mode: d000: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax d007: 00 00 d009: 48 89 44 24 08 mov %rax,0x8(%rsp) d00e: 8b 87 e0 01 00 00 mov 0x1e0(%rdi),%eax SImode moves on x86 zero-extend to the whole DImode register, so stack protector paranoia is not compromised. c. Relax peephole2 constraint that stack protector scratch register must match new initialized register. This relaxation substantially improves peephole2 opportunities, and generates sequences like: a310: 65 4c 8b 34 25 28 00 mov %gs:0x28,%r14 a317: 00 00 a319: 4c 89 74 24 08 mov %r14,0x8(%rsp) a31e: 4c 8b b7 98 00 00 00 mov 0x98(%rdi),%r14 We have to ensure the new scratch is dead in front of the sequence. The patch also fixes omission of earlyclobbers for all alternatives of new initialized register in *stack_protect_set_3, avoiding the need for reg_overlap_mentioned_p constraint. Earlyclobbers are per alternative, not per operand. Also, instructions are already valid in peephole2 pass, so we don't have to explicitly re-check their operands for validity. gcc/ChangeLog: * config/i386/i386.md (stack_protect_set): Explicitly generate scratch register in word mode. (@stack_protect_set_1_<mode>): Rename to ... (@stack_protect_set_1_<PTR:mode>_<SWI48:mode>): ... this. Use SWI48 mode iterator to match scratch register. (stack_protect_set_1 peephole2): Use PTR, W and SWI48 mode iterators to match peephole sequence. Use general_operand predicate for operand 4. Allow different operand 2 and operand 3 registers and use peep2_reg_dead_p to ensure new scratch register is dead before peephole sequence. 
Use peep2_reg_dead_p to ensure old scratch register is dead after peephole sequence. (*stack_protect_set_2_<mode>): Rename to ... (*stack_protect_set_2_<mode>_si): ... this. (*stack_protect_set_3): Rename to ... (*stack_protect_set_2_<mode>_di): ... this. Use PTR mode iterator to match stack protector memory move. Use earlyclobber for all alternatives of operand 1. (stack_protect_set_2 peephole2): Use PTR, W and SWI48 mode iterators to match peephole sequence. Use general_operand predicate for operand 4. Allow different operand 2 and operand 3 registers and use peep2_reg_dead_p to ensure new scratch register is dead before peephole sequence. Use peep2_reg_dead_p to ensure old scratch register is dead after peephole sequence.
2023-11-01 | PR modula2/102989: reimplement overflow detection in ztype through WIDE_INT_MAX_PRECISION | Gaius Mulley | 9 | -46/+93
The ZTYPE in iso modula2 is used to denote intermediate ordinal type const expressions and these are always converted into the appropriate language or user ordinal type prior to code generation. The increase of bits supported by _BitInt causes the modula2 largeconst.mod regression failure tests to pass. The largeconst.mod test has been increased to fail, however the char at a time overflow check is now too slow to detect failure. The overflow detection for the ZTYPE has been rewritten to check against exceeding WIDE_INT_MAX_PRECISION (many orders of magnitude faster). gcc/m2/ChangeLog: PR modula2/102989 * gm2-compiler/SymbolTable.mod (OverflowZType): Import from m2expr. (ConstantStringExceedsZType): Remove import. (GetConstLitType): Replace ConstantStringExceedsZType with OverflowZType. * gm2-gcc/m2decl.cc (m2decl_ConstantStringExceedsZType): Remove. (m2decl_BuildConstLiteralNumber): Re-write. * gm2-gcc/m2decl.def (ConstantStringExceedsZType): Remove. * gm2-gcc/m2decl.h (m2decl_ConstantStringExceedsZType): Remove. * gm2-gcc/m2expr.cc (m2expr_StrToWideInt): Rewrite to check overflow. (m2expr_OverflowZType): New function. (ToWideInt): New function. * gm2-gcc/m2expr.def (OverflowZType): New procedure function declaration. * gm2-gcc/m2expr.h (m2expr_OverflowZType): New prototype. gcc/testsuite/ChangeLog: PR modula2/102989 * gm2/pim/fail/largeconst.mod: Updated foo to an outrageous value. * gm2/pim/fail/largeconst2.mod: Duplicate test removed. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-11-01 | RISC-V: Support vundefine intrinsics for tuple types | xuli | 4 | -0/+89
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/288 gcc/ChangeLog: * config/riscv/riscv-vector-builtins-functions.def (vundefined): Add vundefine intrinsics for tuple types. * config/riscv/riscv-vector-builtins.cc: Ditto. * config/riscv/vector.md (@vundefined<mode>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/tuple_vundefined.c: New test.
2023-11-01 | NFC: Fix whitespace | Juzhe-Zhong | 1 | -1/+1
Noticed there is a whitespace issue in the previous commit: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f66b2fc122b8a17591afbb881d580b32e8ddb708 Sorry for missing this whitespace fix. Committed as it is obvious. gcc/ChangeLog: * tree-vect-slp.cc (vect_build_slp_tree_1): Fix whitespace.
2023-11-01 | Daily bump. | GCC Administrator | 7 | -1/+622
2023-10-31 | analyzer: move class record_layout to its own .h/.cc | David Malcolm | 4 | -131/+218
No functional change intended. gcc/ChangeLog: * Makefile.in (ANALYZER_OBJS): Add analyzer/record-layout.o. gcc/analyzer/ChangeLog: * record-layout.cc: New file, based on material in region-model.cc. * record-layout.h: Likewise. * region-model.cc: Include "analyzer/record-layout.h". (class record_layout): Move to record-layout.cc and .h Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-10-31 | libcpp: eliminate MACRO_MAP_EXPANSION_POINT_LOCATION | David Malcolm | 4 | -7/+7
This patch eliminates the function "MACRO_MAP_EXPANSION_POINT_LOCATION" (which hasn't been a macro since r6-739-g0501dbd932a7e9) in favor of a new line_map_macro::get_expansion_point_location accessor. No functional change intended. gcc/c-family/ChangeLog: * c-warn.cc (warn_for_multistatement_macros): Update for removal of MACRO_MAP_EXPANSION_POINT_LOCATION. gcc/cp/ChangeLog: * module.cc (ordinary_loc_of): Update for removal of MACRO_MAP_EXPANSION_POINT_LOCATION. (module_state::note_location): Update for renaming of field. (module_state::write_macro_maps): Likewise. gcc/ChangeLog: * input.cc (dump_location_info): Update for removal of MACRO_MAP_EXPANSION_POINT_LOCATION. * tree-diagnostic.cc (maybe_unwind_expanded_macro_loc): Likewise. libcpp/ChangeLog: * include/line-map.h (line_map_macro::get_expansion_point_location): New accessor. (line_map_macro::expansion): Rename field to... (line_map_macro::mexpansion): Rename field to... (MACRO_MAP_EXPANSION_POINT_LOCATION): Delete this function. * line-map.cc (linemap_enter_macro): Update for renaming of field. (linemap_macro_map_loc_to_exp_point): Update for removal of MACRO_MAP_EXPANSION_POINT_LOCATION. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-10-31 | opts.cc: fix comment about DOCUMENTATION_ROOT_URL | David Malcolm | 1 | -3/+3
gcc/ChangeLog: * opts.cc (get_option_url): Update comment; the requirement to pass DOCUMENTATION_ROOT_URL's value via -D was removed in r10-8065-ge33a1eae25b8a8. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-10-31 | pretty-print: gracefully handle null URLs | David Malcolm | 2 | -2/+59
gcc/ChangeLog: * pretty-print.cc (pretty_printer::pretty_printer): Initialize m_skipping_null_url. (pp_begin_url): Handle URL being null. (pp_end_url): Likewise. (selftest::test_null_urls): New. (selftest::pretty_print_cc_tests): Call it. * pretty-print.h (pretty_printer::m_skipping_null_url): New. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-10-31 | VECT: Support SLP MASK_LEN_GATHER_LOAD with conditional mask | Juzhe-Zhong | 2 | -2/+21
This patch leverages the current MASK_GATHER_LOAD to support SLP MASK_LEN_GATHER_LOAD with conditional mask. Unconditional MASK_LEN_GATHER_LOAD (base, offset, scale, zero, -1) SLP is not included in this patch since it seems that we can't support it in the middle-end: FAIL: gcc.dg/tree-ssa/pr44306.c (internal compiler error: in vectorizable_load, at tree-vect-stmts.cc:9885) Maybe we should support GATHER_LOAD explicitly in the RISC-V backend to work around this issue. I am going to support GATHER_LOAD explicitly as a workaround in the RISC-V backend. This patch also adds a conditional gather load test since there is no conditional gather load test. Ok for trunk ? gcc/ChangeLog: * tree-vect-slp.cc (vect_get_operand_map): Add MASK_LEN_GATHER_LOAD. (vect_build_slp_tree_1): Ditto. (vect_build_slp_tree_2): Ditto. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-gather-6.c: New test.
2023-10-31 | bpf: Improvements in CO-RE builtins implementation. | Cupertino Miranda | 17 | -341/+1069
This patch moves the processing of attribute preserve_access_index to its own independent pass in a gimple lowering pass. This approach is more consistent with the implementation of the CO-RE builtins when used explicitly in the code. The attributed type accesses are now converted early to the __builtin_core_reloc builtin instead of being kept as an expression in code throughout the middle-end. This prevents the compiler from optimizing out or manipulating the expression based on the locally defined type; instead it assumes nothing is known about the expression, as should be the case for all CO-RE relocations. In the process, __builtin_preserve_access_index has also been improved to generate code for more complex expressions that would require more than one CO-RE relocation. This turned out to be a requirement, since bpf-next selftests would rely on loop unrolling in order to convert an undefined index array access into a defined one. It seemed extreme to expect the unroll to happen, and for that reason GCC still generates correct code in such scenarios, even when index access is never predictable or unrolling does not occur. gcc/ChangeLog: * config/bpf/bpf-passes.def (pass_lower_bpf_core): Added pass. * config/bpf/bpf-protos.h: Added prototype for new pass. * config/bpf/bpf.cc (bpf_delegitimize_address): New function. * config/bpf/bpf.md (mov_reloc_core<MM:mode>): Prefixed name with '*'. * config/bpf/core-builtins.cc (cr_builtins): Added access_node to struct. (is_attr_preserve_access): Improved check. (core_field_info): Make use of root_for_core_field_info function. (process_field_expr): Adapted to new functions. (pack_type): Small improvement. (bpf_handle_plugin_finish_type): Adapted to GTY(()). (bpf_init_core_builtins): Changed to new function names. (construct_builtin_core_reloc): Improved implementation. (bpf_resolve_overloaded_core_builtin): Changed how __builtin_preserve_access_index is converted. (compute_field_expr): Corrected implementation. 
Added access_node argument. (bpf_core_get_index): Added valid argument. (root_for_core_field_info, pack_field_expr) (core_expr_with_field_expr_plus_base, make_core_safe_access_index) (replace_core_access_index_comp_expr, maybe_get_base_for_field_expr) (core_access_clean, core_is_access_index, core_mark_as_access_index) (make_gimple_core_safe_access_index, execute_lower_bpf_core) (make_pass_lower_bpf_core): Added functions. (pass_data_lower_bpf_core): New pass struct. (pass_lower_bpf_core): New gimple_opt_pass class. (pack_field_expr_for_preserve_field) (bpf_replace_core_move_operands): Removed function. (bpf_enum_value_kind): Added GTY(()). * config/bpf/core-builtins.h (bpf_field_info_kind, bpf_type_id_kind) (bpf_type_info_kind, bpf_enum_value_kind): New enum. * config/bpf/t-bpf: Added pass bpf-passes.def to PASSES_EXTRA. gcc/testsuite/ChangeLog: * gcc.target/bpf/core-attr-5.c: New test. * gcc.target/bpf/core-attr-6.c: New test. * gcc.target/bpf/core-builtin-1.c: Corrected * gcc.target/bpf/core-builtin-enumvalue-opt.c: Corrected regular expression. * gcc.target/bpf/core-builtin-enumvalue.c: Corrected regular expression. * gcc.target/bpf/core-builtin-exprlist-1.c: New test. * gcc.target/bpf/core-builtin-exprlist-2.c: New test. * gcc.target/bpf/core-builtin-exprlist-3.c: New test. * gcc.target/bpf/core-builtin-exprlist-4.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-offset-1.c: Extra tests
2023-10-31 | gcc: config: microblaze: fix cpu version check | Neal Frager | 20 | -20/+20
The MICROBLAZE_VERSION_COMPARE was incorrectly using strcasecmp instead of strverscmp to check the mcpu version against feature options. By simply changing the define to use strverscmp, the new version 10.0 is treated correctly as a higher version than previous versions. gcc/ChangeLog: * config/microblaze/microblaze.cc: Fix mcpu version check. gcc/testsuite/ChangeLog: * gcc.target/microblaze/isa/bshift.c: Bump to mcpu=v10.0. * gcc.target/microblaze/isa/div.c: Ditto. * gcc.target/microblaze/isa/fcmp1.c: Ditto. * gcc.target/microblaze/isa/fcmp2.c: Ditto. * gcc.target/microblaze/isa/fcmp3.c: Ditto. * gcc.target/microblaze/isa/fcmp4.c: Ditto. * gcc.target/microblaze/isa/fcvt.c: Ditto. * gcc.target/microblaze/isa/float.c: Ditto. * gcc.target/microblaze/isa/fsqrt.c: Ditto. * gcc.target/microblaze/isa/mul-bshift-pcmp.c: Ditto. * gcc.target/microblaze/isa/mul-bshift.c: Ditto. * gcc.target/microblaze/isa/mul.c: Ditto. * gcc.target/microblaze/isa/mulh-bshift-pcmp.c: Ditto. * gcc.target/microblaze/isa/mulh.c: Ditto. * gcc.target/microblaze/isa/nofcmp.c: Ditto. * gcc.target/microblaze/isa/nofloat.c: Ditto. * gcc.target/microblaze/isa/pcmp.c: Ditto. * gcc.target/microblaze/isa/vanilla.c: Ditto. * gcc.target/microblaze/microblaze.exp: Ditto. Signed-off-by: Neal Frager <neal.frager@amd.com> Signed-off-by: Michael J. Eager <eager@eagercon.com>
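The difference between the two comparison functions is easy to reproduce outside GCC. The sketch below compares version strings both ways; the helper names are hypothetical and `strverscmp` is a GNU extension, so a glibc-style C library is assumed.

```c
#define _GNU_SOURCE
#include <string.h>
#include <strings.h>

/* Character-by-character comparison: "v10.0" sorts before "v9.0"
   because '1' is less than '9'.  */
int version_at_least_strcasecmp(const char *have, const char *want)
{
    return strcasecmp(have, want) >= 0;
}

/* Version-aware comparison: digit runs are compared as numbers, so
   "v10.0" sorts after "v9.0".  */
int version_at_least_strverscmp(const char *have, const char *want)
{
    return strverscmp(have, want) >= 0;
}
```

With the character-wise comparison, a feature gated on "at least v9.0" would wrongly be treated as unavailable for v10.0, which is the behaviour the commit describes fixing.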
2023-10-31 | RISC-V: Require a extension for testcases with atomic insns | Patrick O'Neill | 25 | -7/+48
Add testsuite infrastructure for the A extension and use it to require the A extension for dg-do run and add the A extension for non-A dg-do compile. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo-table-a-6-amo-add-1.c: Add A extension to dg-options for dg-do compile. * gcc.target/riscv/amo-table-a-6-amo-add-2.c: Ditto. * gcc.target/riscv/amo-table-a-6-amo-add-3.c: Ditto. * gcc.target/riscv/amo-table-a-6-amo-add-4.c: Ditto. * gcc.target/riscv/amo-table-a-6-amo-add-5.c: Ditto. * gcc.target/riscv/amo-table-a-6-compare-exchange-1.c: Ditto. * gcc.target/riscv/amo-table-a-6-compare-exchange-2.c: Ditto. * gcc.target/riscv/amo-table-a-6-compare-exchange-3.c: Ditto. * gcc.target/riscv/amo-table-a-6-compare-exchange-4.c: Ditto. * gcc.target/riscv/amo-table-a-6-compare-exchange-5.c: Ditto. * gcc.target/riscv/amo-table-a-6-compare-exchange-6.c: Ditto. * gcc.target/riscv/amo-table-a-6-compare-exchange-7.c: Ditto. * gcc.target/riscv/amo-table-a-6-subword-amo-add-1.c: Ditto. * gcc.target/riscv/amo-table-a-6-subword-amo-add-2.c: Ditto. * gcc.target/riscv/amo-table-a-6-subword-amo-add-3.c: Ditto. * gcc.target/riscv/amo-table-a-6-subword-amo-add-4.c: Ditto. * gcc.target/riscv/amo-table-a-6-subword-amo-add-5.c: Ditto. * gcc.target/riscv/inline-atomics-2.c: Ditto. * gcc.target/riscv/inline-atomics-3.c: Require A extension for dg-do run. * gcc.target/riscv/inline-atomics-4.c: Ditto. * gcc.target/riscv/inline-atomics-5.c: Ditto. * gcc.target/riscv/inline-atomics-6.c: Ditto. * gcc.target/riscv/inline-atomics-7.c: Ditto. * gcc.target/riscv/inline-atomics-8.c: Ditto. * lib/target-supports.exp: Add testing infrastructure to require the A extension or add it to an existing -march. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-10-31RISC-V: Let non-atomic targets use optimized amo loads/storesPatrick O'Neill3-6/+6
Non-atomic targets are currently prevented from using the optimized fencing for seq_cst load/seq_cst store. This patch removes that constraint. gcc/ChangeLog: * config/riscv/sync-rvwmo.md (atomic_load_rvwmo<mode>): Remove TARGET_ATOMIC constraint (atomic_store_rvwmo<mode>): Ditto. * config/riscv/sync-ztso.md (atomic_load_ztso<mode>): Ditto. (atomic_store_ztso<mode>): Ditto. * config/riscv/sync.md (atomic_load<mode>): Ditto. (atomic_store<mode>): Ditto. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-10-31riscv: thead: Add support for the XTheadFMemIdx ISA extensionChristoph Müllner12-5/+404
The XTheadFMemIdx ISA extension provides additional load and store instructions for floating-point registers with new addressing modes. The following memory access types are supported: * load/store: [w,d] (single-precision FP, double-precision FP) The following addressing modes are supported: * register offset with additional immediate offset (4 instructions): flr<type>, fsr<type> * zero-extended register offset with additional immediate offset (4 instructions): flur<type>, fsur<type> These addressing modes are also part of the similar XTheadMemIdx ISA extension support, whose code is reused and extended to support floating-point registers. One challenge that this patch needs to solve is GP registers in FP mode (e.g. "(reg:DF a2)"), which cannot be handled by the XTheadFMemIdx instructions. Such registers are the result of independent optimizations, which can happen after register allocation. This patch uses a simple but efficient method to address this: add a dependency for XTheadMemIdx to XTheadFMemIdx optimizations. This allows the instructions from XTheadMemIdx to be used for such registers. The added tests ensure that this feature won't regress without notice. Testing: GCC regression test suite and SPEC CPU 2017 intrate (base&peak). Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> gcc/ChangeLog: * config/riscv/riscv.cc (riscv_index_reg_class): Return GR_REGS for XTheadFMemIdx. (riscv_regno_ok_for_index_p): Add support for XTheadFMemIdx. * config/riscv/riscv.h (HARDFP_REG_P): New macro. * config/riscv/thead.cc (is_fmemidx_mode): New function. (th_memidx_classify_address_index): Add support for XTheadFMemIdx. (th_fmemidx_output_index): New function. (th_output_move): Add support for XTheadFMemIdx. * config/riscv/thead.md (TH_M_ANYF): New mode iterator. (TH_M_NOEXTF): Likewise. (*th_fmemidx_movsf_hardfloat): New INSN. (*th_fmemidx_movdf_hardfloat_rv64): Likewise. (*th_fmemidx_I_a): Likewise. (*th_fmemidx_I_c): Likewise. 
(*th_fmemidx_US_a): Likewise. (*th_fmemidx_US_c): Likewise. (*th_fmemidx_UZ_a): Likewise. (*th_fmemidx_UZ_c): Likewise. gcc/testsuite/ChangeLog: * gcc.target/riscv/xtheadfmemidx-index-update.c: New test. * gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: New test. * gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: New test. * gcc.target/riscv/xtheadfmemidx-index.c: New test. * gcc.target/riscv/xtheadfmemidx-uindex-update.c: New test. * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: New test. * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: New test. * gcc.target/riscv/xtheadfmemidx-uindex.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2023-10-31riscv: thead: Add support for the XTheadMemIdx ISA extensionChristoph Müllner18-14/+1548
The XTheadMemIdx ISA extension provides additional load and store instructions with new addressing modes. The following memory access types are supported: * load: b,bu,h,hu,w,wu,d * store: b,h,w,d The following addressing modes are supported: * immediate offset with PRE_MODIFY or POST_MODIFY (22 instructions): l<ltype>.ia, l<ltype>.ib, s<stype>.ia, s<stype>.ib * register offset with additional immediate offset (11 instructions): lr<ltype>, sr<stype> * zero-extended register offset with additional immediate offset (11 instructions): lur<ltype>, sur<stype> The RISC-V base ISA does not support index registers, so the changes are kept separate from the RISC-V standard support as much as possible. To combine the shift/multiply instructions into the memory access instructions, this patch comes with a few insn_and_split optimizations that allow the combiner to do this task. Handling the different cases of extensions results in a couple of INSNs that look redundant at first glance, but they are just the equivalent of what we already have for Zbb. The only difference is that we have many more load instructions. We already have a constraint with the name 'th_f_fmv', therefore, the new constraints follow this pattern and have the same length as required ('th_m_mia', 'th_m_mib', 'th_m_mir', 'th_m_miu'). The added tests ensure that this feature won't regress without notice. Testing: GCC regression test suite, GCC bootstrap build, and SPEC CPU 2017 intrate (base&peak) on C920. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> gcc/ChangeLog: * config/riscv/constraints.md (th_m_mia): New constraint. (th_m_mib): Likewise. (th_m_mir): Likewise. (th_m_miu): Likewise. * config/riscv/riscv-protos.h (enum riscv_address_type): Add new address types ADDRESS_REG_REG, ADDRESS_REG_UREG, and ADDRESS_REG_WB and their documentation. (struct riscv_address_info): Add new field 'shift' and document the field usage for the new address types. 
(riscv_valid_base_register_p): New prototype. (th_memidx_legitimate_modify_p): Likewise. (th_memidx_legitimate_index_p): Likewise. (th_classify_address): Likewise. (th_output_move): Likewise. (th_print_operand_address): Likewise. * config/riscv/riscv.cc (riscv_index_reg_class): Return GR_REGS for XTheadMemIdx. (riscv_regno_ok_for_index_p): Add support for XTheadMemIdx. (riscv_classify_address): Call th_classify_address() on top. (riscv_output_move): Call th_output_move() on top. (riscv_print_operand_address): Call th_print_operand_address() on top. * config/riscv/riscv.h (HAVE_POST_MODIFY_DISP): New macro. (HAVE_PRE_MODIFY_DISP): Likewise. * config/riscv/riscv.md (zero_extendqi<SUPERQI:mode>2): Disable for XTheadMemIdx. (*zero_extendqi<SUPERQI:mode>2_internal): Convert to expand, create INSN with same name and disable it for XTheadMemIdx. (extendsidi2): Likewise. (*extendsidi2_internal): Disable for XTheadMemIdx. * config/riscv/thead.cc (valid_signed_immediate): New helper function. (th_memidx_classify_address_modify): New function. (th_memidx_legitimate_modify_p): Likewise. (th_memidx_output_modify): Likewise. (is_memidx_mode): Likewise. (th_memidx_classify_address_index): Likewise. (th_memidx_legitimate_index_p): Likewise. (th_memidx_output_index): Likewise. (th_classify_address): Likewise. (th_output_move): Likewise. (th_print_operand_address): Likewise. * config/riscv/thead.md (*th_memidx_operand): New splitter. (*th_memidx_zero_extendqi<SUPERQI:mode>2): New INSN. (*th_memidx_extendsidi2): Likewise. (*th_memidx_zero_extendsidi2): Likewise. (*th_memidx_zero_extendhi<GPR:mode>2): Likewise. (*th_memidx_extend<SHORT:mode><SUPERQI:mode>2): Likewise. (*th_memidx_bb_zero_extendsidi2): Likewise. (*th_memidx_bb_zero_extendhi<GPR:mode>2): Likewise. (*th_memidx_bb_extendhi<GPR:mode>2): Likewise. (*th_memidx_bb_extendqi<SUPERQI:mode>2): Likewise. (TH_M_ANYI): New mode iterator. (TH_M_NOEXTI): Likewise. (*th_memidx_I_a): New combiner optimization. 
(*th_memidx_I_b): Likewise. (*th_memidx_I_c): Likewise. (*th_memidx_US_a): Likewise. (*th_memidx_US_b): Likewise. (*th_memidx_US_c): Likewise. (*th_memidx_UZ_a): Likewise. (*th_memidx_UZ_b): Likewise. (*th_memidx_UZ_c): Likewise. gcc/testsuite/ChangeLog: * gcc.target/riscv/xtheadmemidx-helpers.h: New test. * gcc.target/riscv/xtheadmemidx-index-update.c: New test. * gcc.target/riscv/xtheadmemidx-index-xtheadbb-update.c: New test. * gcc.target/riscv/xtheadmemidx-index-xtheadbb.c: New test. * gcc.target/riscv/xtheadmemidx-index.c: New test. * gcc.target/riscv/xtheadmemidx-modify-xtheadbb.c: New test. * gcc.target/riscv/xtheadmemidx-modify.c: New test. * gcc.target/riscv/xtheadmemidx-uindex-update.c: New test. * gcc.target/riscv/xtheadmemidx-uindex-xtheadbb-update.c: New test. * gcc.target/riscv/xtheadmemidx-uindex-xtheadbb.c: New test. * gcc.target/riscv/xtheadmemidx-uindex.c: New test.
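The register-offset and zero-extended register-offset addressing modes described for XTheadMemIdx can be pictured with a small scalar model. This is only a sketch under a stated assumption: it treats the small immediate as a left-shift amount applied to the index register, which is what the new 'shift' field in riscv_address_info suggests. The function names and the limit of 3 on the shift are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Register-offset form (lr<ltype>/sr<stype>): base plus the index register
   scaled by a small shift.  The "shift <= 3" bound is an assumption made
   for this sketch.  */
uint64_t th_index_address(uint64_t base, uint64_t index, unsigned shift)
{
    assert(shift <= 3);
    return base + (index << shift);
}

/* Zero-extended form (lur<ltype>/sur<stype>): only the low 32 bits of the
   index register take part, zero-extended to 64 bits before scaling.  */
uint64_t th_uindex_address(uint64_t base, uint64_t index, unsigned shift)
{
    assert(shift <= 3);
    return base + ((uint64_t)(uint32_t)index << shift);
}
```

With an index register whose upper half holds garbage, the two forms give different addresses, which is why the patch needs separate patterns for the sign/zero-extension cases.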
2023-10-31rs6000, Add missing overloaded bcd builtin tests, documentationCarl Love2-1/+25
Currently we have the documentation for __builtin_vec_bcdsub_{eq,gt,lt} but not for __builtin_bcdsub_{gl}e; this patch adds the missing descriptions. Although they are mainly for __builtin_bcdcmp{ge,le}, we already have some testing coverage for __builtin_vec_bcdsub_{eq,gt,lt}, so this patch adds the corresponding explicit test cases as well. gcc/ChangeLog: * doc/extend.texi (__builtin_bcdsub_le, __builtin_bcdsub_ge): Add documentation for the built-ins. gcc/testsuite/ChangeLog: * gcc.target/powerpc/bcd-3.c (do_sub_ge, do_suble): Add functions to test builtins __builtin_bcdsub_ge and __builtin_bcdsub_le.
2023-10-31gcc: config: microblaze: fix cpu version checkNeal Frager2-0/+26
The MICROBLAZE_VERSION_COMPARE was incorrectly using strcasecmp instead of strverscmp to check the mcpu version against feature options. By simply changing the define to use strverscmp, the new version 10.0 is treated correctly as a higher version than previous versions. Fix incorrect warning with -mcpu=10.0: warning: '-mxl-multiply-high' can be used only with '-mcpu=v6.00.a' or greater Signed-off-by: Neal Frager <neal.frager@amd.com> Signed-off-by: Michael J. Eager <eager@eagercon.com>
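The difference between the two comparison functions can be seen in a few lines. The following is a minimal sketch, not the microblaze.cc code: the helper names are hypothetical, and it uses glibc's strverscmp (a GNU extension), whereas GCC itself may rely on its own copy of the function.

```c
#define _GNU_SOURCE            /* strverscmp is a GNU extension in glibc */
#include <assert.h>
#include <string.h>
#include <strings.h>

/* Declared explicitly in case _GNU_SOURCE was not in effect when
   <string.h> was first included.  */
extern int strverscmp(const char *s1, const char *s2);

/* Version-aware check: digit runs are compared as numbers.  */
int version_at_least(const char *have, const char *want)
{
    assert(have && want);
    return strverscmp(have, want) >= 0;
}

/* Plain case-insensitive lexical check, the style used before the fix.  */
int lexical_at_least(const char *have, const char *want)
{
    assert(have && want);
    return strcasecmp(have, want) >= 0;
}
```

For the strings "v10.0" and "v6.00.a", the lexical check compares '1' with '6' and concludes that v10.0 is older, which is what triggered the spurious '-mxl-multiply-high' warning; the version-aware check compares 10 with 6.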
2023-10-31[RA]: Fixing LRA cycling for multi-reg variable containing a fixed regVladimir N. Makarov2-3/+16
PR111971 test case uses a multi-reg variable containing a fixed reg. LRA rejects such multi-reg because of this when matching the constraint for an asm insn. The rejection results in LRA cycling. The patch fixes this issue. gcc/ChangeLog: PR rtl-optimization/111971 * lra-constraints.cc: (process_alt_operands): Don't check start hard regs for regs originated from register variables. gcc/testsuite/ChangeLog: PR rtl-optimization/111971 * gcc.target/powerpc/pr111971.c: New test.
2023-10-31RISC-V: Add vector fmin/fmax expanders.Robin Dapp48-25/+790
This patch adds expanders for fmin and fmax. As per RISC-V V Spec 1.0 vfmin/vfmax are IEEE 754-2019 compliant which differs from IEEE 754-2008 that fmin/fmax require (particularly in the signaling-NaN handling). Therefore the pattern conditions include a !HONOR_SNANS. gcc/ChangeLog: * config/riscv/autovec.md (<ieee_fmaxmin_op><mode>3): fmax/fmin expanders. (cond_<ieee_fmaxmin_op><mode>): Ditto. (cond_len_<ieee_fmaxmin_op><mode>): Ditto. (reduc_fmax_scal_<mode>): Ditto. (reduc_fmin_scal_<mode>): Ditto. * config/riscv/riscv-v.cc (needs_fp_rounding): Add fmin/fmax. * config/riscv/vector-iterators.md (fmin): New UNSPEC. (UNSPEC_VFMIN): Ditto. * config/riscv/vector.md (@pred_<ieee_fmaxmin_op><mode>): Add UNSPEC insn patterns. (@pred_<ieee_fmaxmin_op><mode>_scalar): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Remove -ffast-math. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/fmax-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/fmax_run-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/fmax_zvfh-1.c: New test. 
* gcc.target/riscv/rvv/autovec/binop/fmax_zvfh_run-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/fmin-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/fmin_run-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/fmin_zvfh-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/fmin_zvfh_run-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-3.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-4.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-3.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh_run-4.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-3.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-4.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-3.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh_run-4.c: New test. * gcc.target/riscv/rvv/autovec/reduc/reduc-10.c: New test. * gcc.target/riscv/rvv/autovec/reduc/reduc_run-10.c: New test. * gcc.target/riscv/rvv/autovec/reduc/reduc_zvfh-10.c: New test. * gcc.target/riscv/rvv/autovec/reduc/reduc_zvfh_run-10.c: New test.
2023-10-31genemit: Split insn-emit.cc into several partitions.Robin Dapp8-255/+422
On riscv, insn-emit.cc has grown to over 1.2 million lines of code and compiling it takes considerable time. Therefore, this patch adjusts genemit to create several partitions (insn-emit-1.cc to insn-emit-n.cc). The available patterns are written to the given files in a sequential fashion. Similar to match.pd, a configure option --with-insnemit-partitions=num is introduced that makes the number of partitions configurable. gcc/ChangeLog: PR bootstrap/84402 PR target/111600 * Makefile.in: Handle split insn-emit.cc. * configure: Regenerate. * configure.ac: Add --with-insnemit-partitions. * genemit.cc (output_peephole2_scratches): Print to file instead of stdout. (print_code): Ditto. (gen_rtx_scratch): Ditto. (gen_exp): Ditto. (gen_emit_seq): Ditto. (emit_c_code): Ditto. (gen_insn): Ditto. (gen_expand): Ditto. (gen_split): Ditto. (output_add_clobbers): Ditto. (output_added_clobbers_hard_reg_p): Ditto. (print_overload_arguments): Ditto. (print_overload_test): Ditto. (handle_overloaded_code_for): Ditto. (handle_overloaded_gen): Ditto. (print_header): New function. (handle_arg): New function. (main): Split output into 10 files. * gensupport.cc (count_patterns): New function. * gensupport.h (count_patterns): Define. * read-md.cc (md_reader::print_md_ptr_loc): Add file argument. * read-md.h (class md_reader): Change definition.
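One way to picture "written to the given files in a sequential fashion" is a helper that maps a pattern number to an output file. This is a hypothetical sketch, not the genemit.cc logic: the function name and the equal-sized-chunk policy are assumptions chosen for illustration.

```c
#include <assert.h>

/* Hypothetical helper: returns the 0-based output file that receives
   pattern number IDX when TOTAL patterns are spread sequentially over
   NPARTS files of (nearly) equal size.  */
int partition_for_pattern(int idx, int total, int nparts)
{
    assert(nparts > 0 && total > 0 && idx >= 0 && idx < total);
    int per_file = (total + nparts - 1) / nparts;   /* ceiling division */
    return idx / per_file;
}
```

Filling the files in order keeps neighbouring patterns together, so the content of each generated file is predictable from the pattern count alone.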
2023-10-31hardcfr: support checking at abnormal edges [PR111943]Alexandre Oliva2-3/+108
Control flow redundancy may choose abnormal edges for early checking, but that breaks because we can't insert checks on such edges. Introduce conditional checking on the dest block of abnormal edges, and leave it for the optimizer to drop the conditional. for gcc/ChangeLog PR tree-optimization/111943 * gimple-harden-control-flow.cc: Adjust copyright year. (rt_bb_visited): Add vfalse and vtrue data members. Zero-initialize them in the ctor. (rt_bb_visited::insert_exit_check_on_edge): Upon encountering abnormal edges, insert initializers for vfalse and vtrue on entry, and insert the check sequence guarded by a conditional in the dest block. for libgcc/ChangeLog * hardcfr.c: Adjust copyright year. for gcc/testsuite/ChangeLog PR tree-optimization/111943 * gcc.dg/harden-cfr-pr111943.c: New.
2023-10-31tree-optimization/112305 - SCEV cprop and conditional undefined overflowRichard Biener4-26/+56
The following adjusts final value replacement to also rewrite the replacement to defined overflow behavior if there are conditionally evaluated stmts (with possibly undefined overflow), not only when we "folded casts". The patch hooks into expression_expensive for this. PR tree-optimization/112305 * tree-scalar-evolution.h (expression_expensive): Adjust. * tree-scalar-evolution.cc (expression_expensive): Record when we see a COND_EXPR. (final_value_replacement_loop): When the replacement contains a COND_EXPR, rewrite it to defined overflow. * tree-ssa-loop-ivopts.cc (may_eliminate_iv): Adjust. * gcc.dg/torture/pr112305.c: New testcase.
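At the source level, "rewriting to defined overflow" corresponds roughly to doing signed arithmetic in the matching unsigned type. The sketch below illustrates only that idea; it is not what rewrite_to_defined_overflow emits, and the function name is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Signed addition carried out in unsigned arithmetic, which wraps modulo
   2^N and therefore has no undefined overflow.  The conversion back to int
   is implementation-defined in ISO C; GCC documents it as reduction
   modulo 2^N.  */
int add_defined_overflow(int a, int b)
{
    return (int)((unsigned int)a + (unsigned int)b);
}
```

This matters for final value replacement because a statement that was only evaluated under a condition inside the loop may end up evaluated unconditionally after the loop, where a signed overflow would otherwise become undefined behaviour.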
2023-10-31d: Clean-up unused variable assignments after interface changeIain Buclaw1-4/+2
The lowering done for invoking `new' on a single dimension array was moved from the code generator to the front-end semantic pass in r14-4996. This removes the detritus left behind in the code generator from that deletion. gcc/d/ChangeLog: * expr.cc (ExprVisitor::visit (NewExp *)): Remove unused assignments.
2023-10-31LoongArch: Define HAVE_AS_TLS to 0 if it's undefined [PR112299]Xi Ruoyao1-0/+4
Now loongarch.md uses HAVE_AS_TLS, we need this to fix the failure building a cross compiler if the cross assembler is not installed yet. gcc/ChangeLog: PR target/112299 * config/loongarch/loongarch-opts.h (HAVE_AS_TLS): Define to 0 if not defined yet.
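The fix follows a common configuration-macro idiom, sketched here outside of GCC. The function name is hypothetical; only the HAVE_AS_TLS macro comes from the commit.

```c
#include <assert.h>

/* Normally provided by the generated configuration header; it may be
   absent when the (cross) assembler could not be probed.  */
#ifndef HAVE_AS_TLS
#define HAVE_AS_TLS 0
#endif

/* Because the macro now always has a value, it can appear in ordinary C
   expressions, not only in #ifdef tests.  */
int assembler_supports_tls(void)
{
    return HAVE_AS_TLS != 0;
}
```

Without the fallback definition, any expression that mentions HAVE_AS_TLS fails to compile when the macro is missing, which is presumably the build failure the commit describes.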
2023-10-31RISC-V: Add assert of the number of vmerge in autovec cond testcasesLehua Ding87-40/+330
This patch adds more asserts about the vmerge insns which is intended to ensure better performance for cond autovec. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/cond/cond_arith-1.c: Add vmerge assert. * gcc.target/riscv/rvv/autovec/cond/cond_arith-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_arith-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_arith-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_arith-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_arith-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_arith-7.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_arith-8.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv64gcv.c: Ditto. 
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-7.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-8.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-2.c: Ditto. 
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_shift-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_shift-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_shift-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_shift-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_shift-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_shift-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_shift-7.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_shift-8.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_shift-9.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_sqrt-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_sqrt-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-7.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_unary-8.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_arith-10.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_arith-11.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_arith_run-10.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_arith_run-11.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_fmul_run-5.c: New test.
2023-10-31match.pd: Support combine cond_len_op + vec_cond similar to cond_opLehua Ding2-0/+111
This patch combines cond_len_op and vec_cond into a single cond_len_op, as is already done for cond_op. Consider this code (RISC-V target): void foo (uint8_t *__restrict x, uint8_t *__restrict y, uint8_t *__restrict z, uint8_t *__restrict pred, uint8_t *__restrict merged, int n) { for (int i = 0; i < n; ++i) x[i] = pred[i] != 1 ? y[i] / z[i] : merged[i]; } Before this patch: ... vect_iftmp.18_71 = .COND_LEN_DIV (mask__31.11_61, vect__5.14_65, vect__7.17_69, { 0, ... }, _86, 0); vect_iftmp.23_78 = .VCOND_MASK (mask__31.11_61, vect_iftmp.18_71, vect_iftmp.22_77); ... After this patch: ... _30 = .COND_LEN_DIV (mask__31.16_61, vect__5.19_65, vect__7.22_69, vect_iftmp.27_77, _85, 0); ... gcc/ChangeLog: * gimple-match.h (gimple_match_op::gimple_match_op): Add interfaces for more arguments. (gimple_match_op::set_op): Add interfaces for more arguments. * match.pd: Add support for combining cond_len_op + vec_cond.
2023-10-31Fix incorrect option mask and avx512cd target pushHaochen Jiang3-281/+281
gcc/ChangeLog: * config/i386/avx512cdintrin.h (target): Push evex512 for avx512cd. * config/i386/avx512vlintrin.h (target): Split avx512cdvl part out from avx512vl. * config/i386/i386-builtin.def (BDESC): Do not check evex512 for builtins not needed.
2023-10-31RISC-V: Add the missed combine of [u]int64 -> _Float16 and vcondLehua Ding5-7/+10
This patch splits the INT64 to FP16 conversion into two smaller conversions (INT64 -> FP32 and FP32 -> FP16) at expand time instead of delaying the split to the split1 pass. This change makes it possible to combine the FP32 to FP16 and vcond patterns, so we don't need to add a combine pattern for INT64 to FP16 and vcond patterns. Consider this code: void foo (_Float16 *__restrict r, int64_t *__restrict a, _Float16 *__restrict b, int64_t *__restrict pred, int n) { for (int i = 0; i < n; i += 1) { r[i] = pred[i] ? (_Float16) a[i] : b[i]; } } Before this patch: ... vfncvt.f.f.w v2,v2 vmerge.vvm v1,v1,v2,v0 vse16.v v1,0(a0) ... After this patch: ... vfncvt.f.f.w v1,v2,v0.t vse16.v v1,0(a0) ... gcc/ChangeLog: * config/riscv/autovec.md (<float_cvt><mode><vnnconvert>2): Change to define_expand. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c: Add vfncvt.f.f.w assert. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c: Ditto.
2023-10-31Fix wrong code due to incorrect define_splitliuhongt3-79/+70
-(define_split - [(set (match_operand:V2HI 0 "register_operand") - (eq:V2HI - (eq:V2HI - (us_minus:V2HI - (match_operand:V2HI 1 "register_operand") - (match_operand:V2HI 2 "register_operand")) - (match_operand:V2HI 3 "const0_operand")) - (match_operand:V2HI 4 "const0_operand")))] - "TARGET_SSE4_1" - [(set (match_dup 0) - (umin:V2HI (match_dup 1) (match_dup 2))) - (set (match_dup 0) - (eq:V2HI (match_dup 0) (match_dup 2)))]) The splitter is wrong when op1 == op2 (the original pattern returns 0; after the split, it returns 1), so remove the splitter. Also extend another define_split to define_insn_and_split to handle the pattern below: (set (reg:V4QI 112) (unspec:V4QI [ (subreg:V4QI (reg:V2HF 111 [ bf ]) 0) (subreg:V4QI (reg:V2HF 110 [ af ]) 0) (subreg:V4QI (eq:V2HI (eq:V2HI (reg:V2HI 105) (const_vector:V2HI [ (const_int 0 [0]) repeated x2 ])) (const_vector:V2HI [ (const_int 0 [0]) repeated x2 ])) 0) ] UNSPEC_BLENDV)) define_split doesn't work since pass_combine assumes it produces at most 2 insns after the split, but here it produces 3 since we need to move const0_rtx (V2HImode) to a reg. The move insn can be eliminated later. gcc/ChangeLog: PR target/112276 * config/i386/mmx.md (*mmx_pblendvb_v8qi_1): Change define_split to define_insn_and_split to handle immediate_operand for comparison. (*mmx_pblendvb_v8qi_2): Ditto. (*mmx_pblendvb_<mode>_1): Ditto. (*mmx_pblendvb_v4qi_2): Ditto. (<code><mode>3): Remove define_split after it. (<code>v8qi3): Ditto. (<code><mode>3): Ditto. (<code>v2hi3): Ditto. gcc/testsuite/ChangeLog: * g++.target/i386/part-vect-vcondhf.C: Adjust testcase. * gcc.target/i386/pr112276.c: New test.
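The op1 == op2 failure can be reproduced with a scalar model of a single 16-bit lane. This is an illustrative sketch, not code from the patch: the function names are hypothetical, and comparisons yield all-ones or zero as vector comparisons do.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of the original RTL: (eq (eq (us_minus a b) 0) 0).  */
uint16_t lane_original(uint16_t a, uint16_t b)
{
    uint16_t sat_sub = a > b ? (uint16_t)(a - b) : 0;   /* us_minus */
    uint16_t is_zero = sat_sub == 0 ? 0xFFFF : 0;       /* inner eq */
    return is_zero == 0 ? 0xFFFF : 0;                   /* outer eq */
}

/* One lane of the removed split: umin, then a compare against op2.  */
uint16_t lane_split(uint16_t a, uint16_t b)
{
    uint16_t m = a < b ? a : b;                         /* umin */
    return m == b ? 0xFFFF : 0;                         /* eq with op2 */
}
```

The original pattern computes "a > b" while the split computes "a >= b", so the two agree everywhere except for equal inputs, where the original yields 0 and the split yields all-ones.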
2023-10-30MATCH: Add some more value_replacement simplifications to matchAndrew Pinski2-0/+54
This moves a few more value_replacement simplifications to match. /* a == 1 ? b : a * b -> a * b */ /* a == 1 ? b : b / a -> b / a */ /* a == -1 ? b : a & b -> a & b */ Also adds a testcase to show that we can catch these where value_replacement would not (but other passes would). Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * match.pd (`a == 1 ? b : a OP b`): New pattern. (`a == -1 ? b : a & b`): New pattern. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/phi-opt-value-4.c: New test.
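The three identities can be written out as before/after pairs in C. The function names are hypothetical and serve only to show the source shapes the new match.pd patterns are meant to simplify.

```c
#include <assert.h>

/* a == 1 ? b : a * b  ->  a * b  (1 is the identity for multiplication) */
int mul_before(int a, int b) { return a == 1 ? b : a * b; }
int mul_after(int a, int b)  { return a * b; }

/* a == 1 ? b : b / a  ->  b / a  (a must be nonzero in both forms) */
int div_before(int a, int b) { return a == 1 ? b : b / a; }
int div_after(int a, int b)  { return b / a; }

/* a == -1 ? b : a & b  ->  a & b  (-1 is all-ones, the identity for AND) */
int and_before(int a, int b) { return a == -1 ? b : (a & b); }
int and_after(int a, int b)  { return a & b; }
```

In each pair the special-cased value is the identity element of the operation, so the conditional arm computes the same result as the unconditional expression and the branch can be dropped.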