path: root/gcc
Age | Commit message | Author | Files | Lines
2023-08-30 | analyzer: implement reference count checking for CPython plugin [PR107646] | Eric Feng | 10 | -44/+550
This patch introduces initial support for reference count checking of PyObjects in relation to the Python/C API for the CPython plugin. Additionally, the core analyzer underwent several modifications to accommodate this feature. These include: - Introducing support for callbacks at the end of region_model::pop_frame. This is our current point of validation for the reference count of PyObjects. - An added optional custom stmt_finder parameter to region_model_context::warn. This aids in emitting a diagnostic concerning the reference count, especially when the stmt_finder is NULL, which is currently the case during region_model::pop_frame. The current diagnostic we emit relating to the reference count appears as follows: rc3.c:23:10: warning: expected ‘item’ to have reference count: ‘1’ but ob_refcnt field is: ‘2’ 23 | return list; | ^~~~ ‘create_py_object’: events 1-4 | | 4 | PyObject* item = PyLong_FromLong(3); | | ^~~~~~~~~~~~~~~~~~ | | | | | (1) when ‘PyLong_FromLong’ succeeds | 5 | PyObject* list = PyList_New(1); | | ~~~~~~~~~~~~~ | | | | | (2) when ‘PyList_New’ succeeds |...... | 14 | PyList_Append(list, item); | | ~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (3) when ‘PyList_Append’ succeeds, moving buffer |...... | 23 | return list; | | ~~~~ | | | | | (4) here | This is a WIP in several ways: - Currently, functions returning PyObject * are assumed to always produce a new reference. - The validation of reference count is only for PyObjects created within a function body. Verifying reference counts for PyObjects passed as parameters is not supported in this patch. gcc/analyzer/ChangeLog: PR analyzer/107646 * engine.cc (impl_region_model_context::warn): New optional parameter. * exploded-graph.h (class impl_region_model_context): Likewise. * region-model.cc (region_model::pop_frame): New callback feature for region_model::pop_frame. * region-model.h (struct append_regions_cb_data): Likewise. (class region_model): Likewise. 
(class region_model_context): New optional parameter. (class region_model_context_decorator): Likewise. gcc/testsuite/ChangeLog: PR analyzer/107646 * gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count checking for PyObjects. * gcc.dg/plugin/cpython-plugin-test-2.c: Moved to... * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and added more tests). * gcc.dg/plugin/cpython-plugin-test-1.c: Moved to... * gcc.dg/plugin/cpython-plugin-test-no-Python-h.c: ...here (and added more tests). * gcc.dg/plugin/plugin.exp: New tests. * gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test. * gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test. Signed-off-by: Eric Feng <ef2648@columbia.edu>
2023-08-30 | Analyzer: include algorithm header | Francois-Xavier Coudert | 1 | -0/+1
gcc/analyzer/ChangeLog: * region-model.cc: Define INCLUDE_ALGORITHM.
2023-08-30 | pru: Add cstore expansion patterns | Dimitar Dimitrov | 9 | -0/+126
Add cstore patterns for the two specific operations which can be efficiently expanded using the UMIN instruction: X != 0 X == 0 The rest of the operations are rejected, and left to be expanded by the common expansion code. PR target/106562 gcc/ChangeLog: * config/pru/predicates.md (const_0_operand): New predicate. (pru_cstore_comparison_operator): Ditto. * config/pru/pru.md (cstore<mode>4): New pattern. (cstoredi4): Ditto. gcc/testsuite/ChangeLog: * gcc.target/pru/pr106562-10.c: New test. * gcc.target/pru/pr106562-11.c: New test. * gcc.target/pru/pr106562-5.c: New test. * gcc.target/pru/pr106562-6.c: New test. * gcc.target/pru/pr106562-7.c: New test. * gcc.target/pru/pr106562-8.c: New test. * gcc.target/pru/pr106562-9.c: New test. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
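A minimal sketch in C of why an unsigned minimum is enough for these two comparisons. The helper names `umin_u32`, `cstore_ne0` and `cstore_eq0` are hypothetical, and the XOR used for the `X == 0` case is an assumption about one possible lowering, not something taken from the patch.

```c
#include <stdint.h>

/* Unsigned minimum, standing in for the PRU UMIN instruction. */
static uint32_t umin_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* X != 0: clamping X to at most 1 yields 0 for zero and 1 otherwise. */
static uint32_t cstore_ne0(uint32_t x)
{
    return umin_u32(x, 1u);
}

/* X == 0: invert the low bit of the clamped value (assumed lowering). */
static uint32_t cstore_eq0(uint32_t x)
{
    return umin_u32(x, 1u) ^ 1u;
}
```

Other comparisons do not reduce to a single clamp like this, which fits the statement that they are left to the common expansion code.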
2023-08-30 | c++: CWG 2359, wrong copy-init with designated init [PR91319] | Marek Polacek | 2 | -0/+31
This CWG issue clarifies that designated initializers support direct-initialization. Just be careful what Note 2 in [dcl.init.aggr]/4.2 says: "If the initialization is by designated-initializer-clause, its form determines whether copy-initialization or direct-initialization is performed." Hence this patch sets CONSTRUCTOR_IS_DIRECT_INIT only when we are dealing with ".x{}", but not ".x = {}". PR c++/91319 gcc/cp/ChangeLog: * parser.cc (cp_parser_initializer_list): Set CONSTRUCTOR_IS_DIRECT_INIT when the designated initializer is of the .x{} form. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/desig30.C: New test.
2023-08-30 | c++: disallow constinit on functions [PR111173] | Marek Polacek | 2 | -0/+8
[dcl.constinit]/1: The constinit specifier shall be applied only to a declaration of a variable with static or thread storage duration. and while we detect constinit int fn(); we weren't detecting using F = int(); constinit F f; PR c++/111173 gcc/cp/ChangeLog: * decl.cc (grokdeclarator): Disallow constinit on functions. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/constinit19.C: New test.
2023-08-30 | tree-optimization/111228 - fix testcase | Richard Biener | 1 | -1/+1
* gcc.dg/tree-ssa/forwprop-42.c: Use __UINT64_TYPE__ instead of unsigned long.
2023-08-30 | test: Add xfail into slp-reduc-7.c for RVV VLA vectorization | Juzhe-Zhong | 1 | -1/+1
Like ARM SVE, add RVV variable length xfail. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-reduc-7.c: Add RVV.
2023-08-30 | test: Adapt slp-26.c check for RVV | Juzhe-Zhong | 1 | -4/+4
Fix FAILs: FAIL: gcc.dg/vect/slp-26.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 0 loops" 1 FAIL: gcc.dg/vect/slp-26.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 0 FAIL: gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorized 0 loops" 1 FAIL: gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorizing stmts using SLP" 0 Since RVV is able to vectorize it with VLS modes like amdgcn. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-26.c: Adapt for RVV.
2023-08-30 | fortran: Restore interface to its previous state on error [PR48776] | Mikael Morin | 4 | -6/+113
Keep memory of the content of the current interface body being parsed and restore it to its previous state if it has been modified at the time a parse attempt fails. This fixes memory errors and random segmentation faults caused by dangling symbol pointers kept in interfaces' linked lists of symbols. If a parsing attempt fails and symbols are freed, they should also be removed from the current interface linked list. As the list of symbol is a linked list, and parsing only adds new symbols to the head of the list, all that is needed to track the previous content of the list is a pointer to its previous head. This adds such a pointer, and the restoration of the list of symbols to that pointer on error. PR fortran/48776 gcc/fortran/ChangeLog: * gfortran.h (gfc_drop_interface_elements_before): New prototype. (gfc_current_interface_head): Return a reference to the pointer. * interface.cc (gfc_current_interface_head): Ditto. (free_interface_elements_until): New function, generalizing gfc_free_interface. (gfc_free_interface): Use free_interface_elements_until. (gfc_drop_interface_elements_before): New function. * parse.cc (current_interface_ptr, previous_interface_head): New static variables. (current_interface_valid_p, get_current_interface_ptr): New functions. (decode_statement): Initialize previous_interface_head. (reject_statement): Restore current interface pointer to point to previous_interface_head. gcc/testsuite/ChangeLog: * gfortran.dg/interface_procedure_1.f90: New test.
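The head-pointer technique described above can be sketched in C. This is an illustration only: `struct node`, `push`, `restore_head` and `length` are hypothetical names, and unlike the Fortran front end this sketch frees the dropped nodes itself.

```c
#include <stdlib.h>

/* Hypothetical singly linked list node; new entries go at the head. */
struct node {
    int value;
    struct node *next;
};

/* Push a value at the head; returns 0 on success, -1 if malloc fails. */
static int push(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Drop every node added after 'saved_head' so it becomes the head again.
   Because insertion only happens at the head, the saved pointer is all
   the state needed to undo a failed attempt. */
static void restore_head(struct node **head, struct node *saved_head)
{
    while (*head != NULL && *head != saved_head) {
        struct node *dead = *head;
        *head = dead->next;
        free(dead);
    }
}

/* Number of nodes currently in the list. */
static int length(const struct node *head)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

A caller would record the head before a parse attempt and call `restore_head` with that pointer if the attempt is rejected, so no entry added during the failed attempt stays reachable.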
2023-08-30 | tree-optimization/111228 - combine two VEC_PERM_EXPRs | Richard Biener | 2 | -3/+155
The following adds simplification of two VEC_PERM_EXPRs where the later one replaces all elements from either the first or the second input of the earlier permute. This allows a three input permute to be simplified to a two input one. I'm following the existing two input simplification case and only allow non-VLA permutes. The now existing three cases and the single case in tree-ssa-forwprop.cc somehow ask for merging, I'm not doing this as part of this change though. PR tree-optimization/111228 * match.pd ((vec_perm (vec_perm ..) @5 ..) -> (vec_perm @x @5 ..)): New simplifications. * gcc.dg/tree-ssa/forwprop-42.c: New testcase.
2023-08-30 | RISC-V: Remove movmisalign pattern for VLA modes | Juzhe-Zhong | 1 | -11/+0
This patch fixed this bunch of failures in "vect" testsuite: FAIL: gcc.dg/vect/pr63341-1.c -flto -ffat-lto-objects execution test FAIL: gcc.dg/vect/pr63341-1.c execution test FAIL: gcc.dg/vect/pr63341-2.c -flto -ffat-lto-objects execution test FAIL: gcc.dg/vect/pr63341-2.c execution test FAIL: gcc.dg/vect/pr94994.c -flto -ffat-lto-objects execution test FAIL: gcc.dg/vect/pr94994.c execution test FAIL: gcc.dg/vect/vect-align-1.c -flto -ffat-lto-objects execution test FAIL: gcc.dg/vect/vect-align-1.c execution test FAIL: gcc.dg/vect/vect-align-2.c -flto -ffat-lto-objects execution test FAIL: gcc.dg/vect/vect-align-2.c execution test Spike report: z 0000000000000000 ra 00000000000100f4 sp 0000003ffffffb30 gp 0000000000012cc8 tp 0000000000000000 t0 00000000000102d4 t1 000000000000000f t2 0000000000000000 s0 0000000000000000 s1 0000000000000000 a0 00000000000101a6 a1 0000000000000008 a2 0000000000000010 a3 0000000000012401 a4 0000000000012480 a5 0000000000000020 a6 000000000000001f a7 00000000000000d6 s2 0000000000000000 s3 0000000000000000 s4 0000000000000000 s5 0000000000000000 s6 0000000000000000 s7 0000000000000000 s8 0000000000000000 s9 0000000000000000 sA 0000000000000000 sB 0000000000000000 t3 0000000000000000 t4 0000000000000000 t5 0000000000000000 t6 0000000000000000 pc 00000000000101ec va/inst 000000000206dc07 sr 8000000200006620 Load access fault! (spike) core 0: 0x0000000000010204 (0x02065087) vle16.v v1, (a2) core 0: exception trap_load_address_misaligned, epc 0x0000000000010204 core 0: tval 0x0000000000012c81 (spike) reg 0 a2 0x0000000000012c81 According to RVV ISA, we couldn't use "vle16.v" if the address is byte align. Such issue is caused by this GIMPLE IR: vect__1.15_17 = .MASK_LEN_LOAD (vectp_t.13_15, 8B, { -1, ... }, _24, 0); For partial vectorization, the alignment is "8B" byte align here is incorrect here. 
After this patch, the vectorization fails and scalar code is generated: sll a5,a4,0x1 add a5,a5,a1 lhu a3,64(a5) lbu a5,66(a5) addw a4,a4,1 srl a3,a3,0x8 sll a5,a5,0x8 or a5,a5,a3 sh a5,0(a2) add a2,a2,2 bne a4,a0,101f8 <foo+0x14> I will enable auto-vectorization with another approach in a following patch. gcc/ChangeLog: * config/riscv/autovec.md (movmisalign<mode>): Delete.
2023-08-30 | test: Fix XPASS of RVV | Juzhe-Zhong | 6 | -6/+6
XPASS: gcc.dg/vect/vect-outer-4e.c -flto -ffat-lto-objects scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 XPASS: gcc.dg/vect/vect-outer-4e.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 XPASS: gcc.dg/vect/vect-outer-4f.c -flto -ffat-lto-objects scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 XPASS: gcc.dg/vect/vect-outer-4f.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 XPASS: gcc.dg/vect/vect-outer-4g.c -flto -ffat-lto-objects scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 XPASS: gcc.dg/vect/vect-outer-4g.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 XPASS: gcc.dg/vect/vect-outer-4k.c -flto -ffat-lto-objects scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 XPASS: gcc.dg/vect/vect-outer-4k.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 XPASS: gcc.dg/vect/vect-outer-4l.c -flto -ffat-lto-objects scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 XPASS: gcc.dg/vect/vect-outer-4l.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1 Like ARM SVE, Fix these XPASS for RVV. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-double-reduc-5.c: Add riscv. * gcc.dg/vect/vect-outer-4e.c: Ditto. * gcc.dg/vect/vect-outer-4f.c: Ditto. * gcc.dg/vect/vect-outer-4g.c: Ditto. * gcc.dg/vect/vect-outer-4k.c: Ditto. * gcc.dg/vect/vect-outer-4l.c: Ditto.
2023-08-30 | test: Add xfail for riscv_vector | Juzhe-Zhong | 3 | -3/+3
Like ARM SVE, when we enable scalable vectorization for RVV, these cannot be constant folded yet. Ok for trunk? gcc/testsuite/ChangeLog: * gcc.dg/vect/pr88598-1.c: Add riscv_vector. * gcc.dg/vect/pr88598-2.c: Ditto. * gcc.dg/vect/pr88598-3.c: Ditto.
2023-08-30 | RISC-V: support cm.mva01s cm.mvsa01 in zcmp | Die Li | 5 | -0/+85
Signed-off-by: Die Li <lidie@eswincomputing.com> Co-Authored-By: Fei Gao <gaofei@eswincomputing.com> gcc/ChangeLog: * config/riscv/peephole.md: New pattern. * config/riscv/predicates.md (a0a1_reg_operand): New predicate. (zcmp_mv_sreg_operand): New predicate. * config/riscv/riscv.md: New predicate. * config/riscv/zc.md (*mva01s<X:mode>): New pattern. (*mvsa01<X:mode>): New pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/cm_mv_rv32.c: New test.
2023-08-30 | RISC-V: support cm.popretz in zcmp | Fei Gao | 5 | -25/+509
Generate cm.popretz instead of cm.popret if return value is 0. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_zcmp_can_use_popretz): true if popretz can be used (riscv_gen_multi_pop_insn): interface to generate cm.pop[ret][z] (riscv_expand_epilogue): expand cm.pop[ret][z] in epilogue * config/riscv/riscv.md: define A0_REGNUM * config/riscv/zc.md (@gpr_multi_popretz_up_to_ra_<mode>): md for popretz ra (@gpr_multi_popretz_up_to_s0_<mode>): md for popretz ra, s0 (@gpr_multi_popretz_up_to_s1_<mode>): likewise (@gpr_multi_popretz_up_to_s2_<mode>): likewise (@gpr_multi_popretz_up_to_s3_<mode>): likewise (@gpr_multi_popretz_up_to_s4_<mode>): likewise (@gpr_multi_popretz_up_to_s5_<mode>): likewise (@gpr_multi_popretz_up_to_s6_<mode>): likewise (@gpr_multi_popretz_up_to_s7_<mode>): likewise (@gpr_multi_popretz_up_to_s8_<mode>): likewise (@gpr_multi_popretz_up_to_s9_<mode>): likewise (@gpr_multi_popretz_up_to_s11_<mode>): likewise gcc/testsuite/ChangeLog: * gcc.target/riscv/rv32e_zcmp.c: add testcase for cm.popretz in rv32e * gcc.target/riscv/rv32i_zcmp.c: add testcase for cm.popretz in rv32i
2023-08-30 | RISC-V: support cm.push cm.pop cm.popret in zcmp | Fei Gao | 11 | -52/+2137
Zcmp can share the same logic as save-restore in stack allocation: pre-allocation by cm.push, step 1 and step 2. Pre-allocation not only saves callee saved GPRs, but also saves callee saved FPRs and local variables if any. Please be noted cm.push pushes ra, s0-s11 in reverse order than what save-restore does. So adaption has been done in .cfi directives in my patch. gcc/ChangeLog: * config/riscv/iterators.md (slot0_offset): slot 0 offset in stack GPRs area in bytes (slot1_offset): slot 1 offset in stack GPRs area in bytes (slot2_offset): likewise (slot3_offset): likewise (slot4_offset): likewise (slot5_offset): likewise (slot6_offset): likewise (slot7_offset): likewise (slot8_offset): likewise (slot9_offset): likewise (slot10_offset): likewise (slot11_offset): likewise (slot12_offset): likewise * config/riscv/predicates.md (stack_push_up_to_ra_operand): predicates of stack adjust pushing ra (stack_push_up_to_s0_operand): predicates of stack adjust pushing ra, s0 (stack_push_up_to_s1_operand): likewise (stack_push_up_to_s2_operand): likewise (stack_push_up_to_s3_operand): likewise (stack_push_up_to_s4_operand): likewise (stack_push_up_to_s5_operand): likewise (stack_push_up_to_s6_operand): likewise (stack_push_up_to_s7_operand): likewise (stack_push_up_to_s8_operand): likewise (stack_push_up_to_s9_operand): likewise (stack_push_up_to_s11_operand): likewise (stack_pop_up_to_ra_operand): predicates of stack adjust poping ra (stack_pop_up_to_s0_operand): predicates of stack adjust poping ra, s0 (stack_pop_up_to_s1_operand): likewise (stack_pop_up_to_s2_operand): likewise (stack_pop_up_to_s3_operand): likewise (stack_pop_up_to_s4_operand): likewise (stack_pop_up_to_s5_operand): likewise (stack_pop_up_to_s6_operand): likewise (stack_pop_up_to_s7_operand): likewise (stack_pop_up_to_s8_operand): likewise (stack_pop_up_to_s9_operand): likewise (stack_pop_up_to_s11_operand): likewise * config/riscv/riscv-protos.h (riscv_zcmp_valid_stack_adj_bytes_p):declaration * 
config/riscv/riscv.cc (struct riscv_frame_info): comment change (riscv_avoid_multi_push): helper function of riscv_use_multi_push (riscv_use_multi_push): true if multi push is used (riscv_multi_push_sregs_count): num of sregs in multi-push (riscv_multi_push_regs_count): num of regs in multi-push (riscv_16bytes_align): align to 16 bytes (riscv_stack_align): moved to a better place (riscv_save_libcall_count): no functional change (riscv_compute_frame_info): add zcmp frame info (riscv_for_each_saved_reg): save or restore fprs in specified slot for zcmp (riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push (riscv_gen_multi_push_pop_insn): gen function for multi push and pop (get_multi_push_fpr_mask): get mask for the fprs pushed by cm.push (riscv_expand_prologue): allocate stack by cm.push (riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret] (riscv_expand_epilogue): allocate stack by cm.pop[ret] (zcmp_base_adj): calculate stack adjustment base size (zcmp_additional_adj): calculate stack adjustment additional size (riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment valid * config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra (S0_MASK): likewise (S1_MASK): likewise (S2_MASK): likewise (S3_MASK): likewise (S4_MASK): likewise (S5_MASK): likewise (S6_MASK): likewise (S7_MASK): likewise (S8_MASK): likewise (S9_MASK): likewise (S10_MASK): likewise (S11_MASK): likewise (MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most (ZCMP_MAX_SPIMM): max spimm value (ZCMP_SP_INC_STEP): zcmp sp increment step (ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10 (ZCMP_S0S11_SREGS_COUNTS): num of s0-s11 (ZCMP_MAX_GRP_SLOTS): max slots of pushing and poping in zcmp (CALLEE_SAVED_FREG_NUMBER): get x of fsx(fs0 ~ fs11) * config/riscv/riscv.md: include zc.md * config/riscv/zc.md: New file. machine description for zcmp gcc/testsuite/ChangeLog: * gcc.target/riscv/rv32e_zcmp.c: New test. * gcc.target/riscv/rv32i_zcmp.c: New test. 
* gcc.target/riscv/zcmp_push_fpr.c: New test. * gcc.target/riscv/zcmp_stack_alignment.c: New test.
2023-08-30 | tree-ssa-strlen: Fix up handling of conditionally zero memcpy [PR110914] | Jakub Jelinek | 2 | -1/+24
The following testcase is miscompiled since r279392 aka r10-5451-gef29b12cfbb4979. The strlen pass has adjust_last_stmt function, which performs mainly strcat or strcat-like optimizations (say strcpy (x, "abcd"); strcat (x, p); or equivalent memcpy (x, "abcd", strlen ("abcd") + 1); char *q = strchr (x, 0); memcpy (x, p, strlen (p)); etc.), where the first stmt stores a '\0' character at the end but the next one immediately overwrites it, so the first memcpy can be adjusted to store 1 fewer byte. handle_builtin_memcpy called this function in two spots, the first one guarded like: if (olddsi != NULL && tree_fits_uhwi_p (len) && !integer_zerop (len)) adjust_last_stmt (olddsi, stmt, false); i.e. only for constant non-zero length. The other spot can call it even for non-constant length but in that case we punt before that if that length isn't length of some string + 1, so again non-zero. The r279392 change I assume wanted to add some warning stuff and changed it like if (olddsi != NULL - && tree_fits_uhwi_p (len) && !integer_zerop (len)) - adjust_last_stmt (olddsi, stmt, false); + { + maybe_warn_overflow (stmt, len, rvals, olddsi, false, true); + adjust_last_stmt (olddsi, stmt, false); + } While maybe_warn_overflow possibly handles non-constant length fine, adjust_last_stmt really relies on length to be non-zero, which !integer_zerop (len) alone doesn't guarantee. While we could for len being SSA_NAME ask the ranger or tree_expr_nonzero_p, I think adjust_last_stmt will not benefit from it much, so the following patch just restores the above condition/previous behavior for the adjust_last_stmt call only. 2023-08-30 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/110914 * tree-ssa-strlen.cc (strlen_pass::handle_builtin_memcpy): Don't call adjust_last_stmt unless len is known constant. * gcc.c-torture/execute/pr110914.c: New test.
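A small C sketch of the pattern at issue, showing why the second copy must have non-zero length before the first one may be shortened. The function name `build` is hypothetical and this is not the testcase from the PR; the second copy targets the terminator found by `strchr`, which is an assumption about the intended pattern.

```c
#include <string.h>

/* Copy "abcd" (including its '\0') into buf, then copy n bytes of p
   starting at the terminator.  buf must hold at least 5 + n bytes.
   When n > 0, the second memcpy overwrites the '\0' stored by the
   first, so the first copy could store one byte fewer.  When n == 0,
   nothing overwrites it, and shortening the first copy would lose
   the terminator. */
static size_t build(char *buf, const char *p, size_t n)
{
    memcpy(buf, "abcd", strlen("abcd") + 1);
    char *q = strchr(buf, 0);
    memcpy(q, p, n);
    return (size_t)(q - buf) + n;   /* bytes written from the start of buf */
}
```

With `n == 0` the buffer must still read back as "abcd", which is the case the restored `tree_fits_uhwi_p (len) && !integer_zerop (len)` guard protects.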
2023-08-30 | store-merging: Fix up >= 64 bit insertion [PR111015] | Jakub Jelinek | 2 | -4/+33
The following testcase shows that we mishandle bit insertion for info->bitsize >= 64. The problem is in using unsigned HOST_WIDE_INT shift + subtraction + build_int_cst to compute mask, the shift invokes UB at compile time for info->bitsize 64 and larger and e.g. on the testcase with info->bitsize happens to compute mask of 0x3f rather than 0x3f'ffffffff'ffffffff. The patch fixes that by using wide_int wi::mask + wide_int_to_tree, so it handles masks in any precision (up to WIDE_INT_MAX_PRECISION ;) ). 2023-08-30 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/111015 * gimple-ssa-store-merging.cc (imm_store_chain_info::output_merged_store): Use wi::mask and wide_int_to_tree instead of unsigned HOST_WIDE_INT shift and build_int_cst to build BIT_AND_EXPR mask. * gcc.dg/pr111015.c: New test.
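The shift hazard described above can be illustrated with a fixed-width C analogue. This is only a sketch: `low_mask` is a hypothetical helper limited to 64 bits, whereas the patch itself switches to `wi::mask`, which works at any precision.

```c
#include <stdint.h>

/* Mask with the low 'bits' bits set, for 0 <= bits <= 64.
   A plain (1 << bits) - 1 is undefined when bits equals the width of
   the type, so that case is handled separately. */
static uint64_t low_mask(unsigned bits)
{
    if (bits >= 64)
        return UINT64_MAX;
    return (UINT64_C(1) << bits) - 1;
}
```

The explicit branch is what a hand-written fixed-width version needs; a wide-integer mask routine avoids the special case altogether.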
2023-08-30 | middle-end: Apply MASK_LEN_LOAD_LANES/MASK_LEN_STORE_LANES to ivopts/alias | Juzhe-Zhong | 2 | -0/+7
Like MASK_LOAD_LANES/MASK_STORE_LANES, add MASK_LEN_ variant. Bootstrap and Regression on X86 passed. Ok for trunk? gcc/ChangeLog: * tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Add MASK_LEN_ variant. (call_may_clobber_ref_p_1): Ditto. * tree-ssa-loop-ivopts.cc (get_mem_type_for_internal_fn): Ditto. (get_alias_ptr_type_for_ptr_address): Ditto.
2023-08-30 | RISC-V: Make arch-24.c to test "success" case | Tsukasa OI | 1 | -3/+1
arch-24.c and arch-25.c are exactly the same and redundant. The author suspects that the original author intended to test two base ISAs (RV32I and RV64I) so this commit changes arch-24.c to test that RV32I+Zcf does not cause any errors. gcc/testsuite/ChangeLog: * gcc.target/riscv/arch-24.c: Test RV32I+Zcf instead.
2023-08-30 | RISC-V: Make sure we get VL REG operand for VLMAX vsetvl | Juzhe-Zhong | 1 | -4/+12
Fix ICE in "vect" testsuite: FAIL: gcc.dg/vect/pr64495.c (internal compiler error: in df_uses_record, at df-scan.cc:2958) FAIL: gcc.dg/vect/pr64495.c (test for excess errors) After this patch, all currently known VSETVL PASS related bugs in "vect" are fixed. gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (vector_insn_info::get_avl_or_vl_reg): Fix bug.
2023-08-30 | RISC-V: Enable movmisalign for VLS modes | Juzhe-Zhong | 3 | -5/+57
The previous patch (which removed the movmisalign pattern for VLA modes) fixed a run-time bug, but it also disabled vectorization of misaligned data movement. After checking the LLVM code, LLVM supports misalignment for VLS modes. Before this patch: sll a5,a4,0x1 add a5,a5,a1 lhu a3,64(a5) lbu a5,66(a5) addw a4,a4,1 srl a3,a3,0x8 sll a5,a5,0x8 or a5,a5,a3 sh a5,0(a2) add a2,a2,2 bne a4,a0,101f8 <foo+0x14> After this patch: foo: lui a0,%hi(.LANCHOR0) addi a0,a0,%lo(.LANCHOR0) addi sp,sp,-16 addi a1,a0,1 li a2,64 sd ra,8(sp) vsetvli zero,a2,e8,m4,ta,ma addi a0,a0,128 vle8.v v4,0(a1) vse8.v v4,0(a0) call memcmp bne a0,zero,.L6 ld ra,8(sp) addi sp,sp,16 jr ra .L6: call abort Note this patch has passed all testcases in "vect" which are related to alignment. gcc/ChangeLog: * config/riscv/autovec-vls.md (movmisalign<mode>): New pattern. * config/riscv/riscv.cc (riscv_support_vector_misalignment): Support VLS misalign. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/misalign-1.c: New test.
2023-08-30 | Daily bump. | GCC Administrator | 6 | -1/+292
2023-08-29 | RISC-V: Use splitter to generate zicond in another case | Philipp Tomsich | 2 | -0/+45
So in analyzing Ventana's internal tree against the trunk it became apparent that the current zicond code is missing a case that helps coremark's bitwise CRC implementation. Here's a minimized testcase: long xor1(long crc, long poly) { if (crc & 1) crc ^= poly; return crc; } ie, it's just a conditional xor. We generate this: andi a5,a0,1 neg a5,a5 and a5,a5,a1 xor a0,a5,a0 ret But we should instead generate: andi a5,a0,1 czero.eqz a5,a1,a5 xor a0,a5,a0 ret Combine wants to generate: Trying 7, 8 -> 9: 7: r140:DI=r137:DI&0x1 8: r141:DI=-r140:DI REG_DEAD r140:DI 9: r142:DI=r141:DI&r144:DI REG_DEAD r144:DI REG_DEAD r141:DI Failed to match this instruction: (set (reg:DI 142) (and:DI (sign_extract:DI (reg/v:DI 137 [ crc ]) (const_int 1 [0x1]) (const_int 0 [0])) (reg:DI 144))) A splitter can rewrite the above into a suitable if-then-else construct and squeeze an instruction out of that pesky CRC loop. Sadly it doesn't really help anything else. The patch includes two variants. One that uses ZBS, the other uses an ANDI logical to produce the input condition. gcc/ * config/riscv/zicond.md: New splitters to rewrite single bit sign extension as the condition to a czero in the desired form. gcc/testsuite * gcc.target/riscv/zicond-xor-01.c: New test. Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
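The equivalence between the branchy source, the mask-based sequence and the conditional-zero form can be checked in plain C. `xor1` is the testcase from the message; `xor1_mask` and `xor1_select` are hypothetical names that model the two instruction sequences and are not compiler output.

```c
/* Branchy form from the commit message. */
static long xor1(long crc, long poly)
{
    if (crc & 1)
        crc ^= poly;
    return crc;
}

/* Models andi/neg/and/xor: -(crc & 1) is all ones when the low bit
   is set and zero otherwise. */
static long xor1_mask(long crc, long poly)
{
    long mask = -(crc & 1);
    return crc ^ (mask & poly);
}

/* Models andi/czero.eqz/xor: poly passes through unless the condition
   is zero, in which case zero is used instead. */
static long xor1_select(long crc, long poly)
{
    long cond = crc & 1;
    long sel = cond != 0 ? poly : 0;
    return crc ^ sel;
}
```

The select form needs one operation fewer than the mask form, which matches the instruction saved in the sequences quoted above.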
2023-08-29 | analyzer: new warning: -Wanalyzer-overlapping-buffers [PR99860] | David Malcolm | 11 | -2/+722
gcc/ChangeLog: PR analyzer/99860 * Makefile.in (ANALYZER_OBJS): Add analyzer/ranges.o. gcc/analyzer/ChangeLog: PR analyzer/99860 * analyzer-selftests.cc (selftest::run_analyzer_selftests): Call selftest::analyzer_ranges_cc_tests. * analyzer-selftests.h (selftest::run_analyzer_selftests): New decl. * analyzer.opt (Wanalyzer-overlapping-buffers): New option. * call-details.cc: Include "analyzer/ranges.h" and "make-unique.h". (class overlapping_buffers): New. (call_details::complain_about_overlap): New. * call-details.h (call_details::complain_about_overlap): New decl. * kf.cc (kf_memcpy_memmove::impl_call_pre): Call cd.complain_about_overlap for memcpy and memcpy_chk. (kf_strcat::impl_call_pre): Call cd.complain_about_overlap. (kf_strcpy::impl_call_pre): Likewise. * ranges.cc: New file. * ranges.h: New file. gcc/ChangeLog: PR analyzer/99860 * doc/invoke.texi: Add -Wanalyzer-overlapping-buffers. gcc/testsuite/ChangeLog: PR analyzer/99860 * c-c++-common/analyzer/overlapping-buffers.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
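For orientation, the property this warning is concerned with can be stated as a runtime check in C. This is a sketch with a hypothetical helper `buffers_overlap`; the analyzer reasons about byte ranges symbolically at compile time, and nothing here reflects its implementation in ranges.cc.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the n-byte ranges starting at a and b share at least
   one byte, 0 otherwise. */
static int buffers_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    if (n == 0)
        return 0;
    return pa < pb ? pb - pa < n : pa - pb < n;
}
```

A call such as `memcpy (buf + 4, buf, 8)` has overlapping source and destination ranges, which is undefined behavior for memcpy; `memmove` is the function that permits it.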
2023-08-29 | c++: tweaks for explicit conversion fns diagnostic | Marek Polacek | 4 | -4/+77
1) When saying that a conversion is erroneous because it would use an explicit constructor, it might be nice to show where exactly the explicit constructor is located. For example, with this patch: [...] explicit.C:4:12: note: 'S::S(int)' declared here 4 | explicit S(int) { } | ^ 2) When a conversion doesn't work out merely because the conversion function necessary to do the conversion couldn't be used because it was marked explicit, it would be useful to the user to say so, rather than just saying "cannot convert". For example, with this patch: explicit.C:13:12: error: cannot convert 'S' to 'bool' in initialization 13 | bool b = S{1}; | ^~~~ | | | S explicit.C:5:12: note: explicit conversion function was not considered 5 | explicit operator bool() const { return true; } | ^~~~~~~~ gcc/cp/ChangeLog: * call.cc (convert_like_internal): Show where the conversion function was declared. (maybe_show_nonconverting_candidate): New. * cp-tree.h (maybe_show_nonconverting_candidate): Declare. * typeck.cc (convert_for_assignment): Call it. gcc/testsuite/ChangeLog: * g++.dg/diagnostic/explicit.C: New test.
2023-08-29 | RISC-V: Added zvfh support for zfa extensions. | Jin Ma | 4 | -5/+104
This is a follow-up for the zfa extension, added according to the recommendations for zvfh and the patch from Tsukasa OI <research_trasio@irq.a4lg.com>. The new test zfa-fli-5.c is also based on that patch. Ref: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627284.html https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628492.html gcc/ChangeLog: * config/riscv/riscv.cc (riscv_float_const_rtx_index_for_fli): zvfh can generate zfa extended instruction fli.h, just like zfh. gcc/testsuite/ChangeLog: * gcc.target/riscv/zfa-fli-7.c: Change fa0 to fa\[0-9\] to avoid assigning register numbers that are non-zero. * gcc.target/riscv/zfa-fli-8.c: Ditto. * gcc.target/riscv/zfa-fli-5.c: New test.
2023-08-29 | RISC-V: generate builtin macro for compilation with strict alignment | Edwin Lu | 12 | -0/+144
Distinguish between explicit -mstrict-align and cpu tune param for slow_unaligned_access=true/false. Tested for regressions using rv32/64 multilib with newlib/linux gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Generate __riscv_unaligned_avoid with value 1 or __riscv_unaligned_slow with value 1 or __riscv_unaligned_fast with value 1 * config/riscv/riscv.cc (riscv_option_override): Define riscv_user_wants_strict_align. Set riscv_user_wants_strict_align to TARGET_STRICT_ALIGN * config/riscv/riscv.h: Declare riscv_user_wants_strict_align gcc/testsuite/ChangeLog: * gcc.target/riscv/attribute-1.c: Check for __riscv_unaligned_slow or __riscv_unaligned_fast * gcc.target/riscv/attribute-4.c: Check for __riscv_unaligned_avoid * gcc.target/riscv/attribute-5.c: Check for __riscv_unaligned_slow or __riscv_unaligned_fast * gcc.target/riscv/predef-align-1.c: New test. * gcc.target/riscv/predef-align-2.c: New test. * gcc.target/riscv/predef-align-3.c: New test. * gcc.target/riscv/predef-align-4.c: New test. * gcc.target/riscv/predef-align-5.c: New test. * gcc.target/riscv/predef-align-6.c: New test. Reviewed-by: Jeff Law <jlaw@ventanamicro.com> Signed-off-by: Edwin Lu <ewlu@rivosinc.com> Co-authored-by: Vineet Gupta <vineetg@rivosinc.com>
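A short C sketch of how user code might test the three predefined macros named in the ChangeLog. The function `unaligned_policy` and the strings it returns are hypothetical; on targets that define none of the macros it falls through to "unknown".

```c
/* Report which alignment macro, if any, the compiler predefined.
   The macro names come from the ChangeLog entry; the returned strings
   are made up for this illustration. */
static const char *unaligned_policy(void)
{
#if defined(__riscv_unaligned_avoid)
    return "avoid";     /* -mstrict-align was requested explicitly */
#elif defined(__riscv_unaligned_slow)
    return "slow";      /* unaligned access works but is slow */
#elif defined(__riscv_unaligned_fast)
    return "fast";      /* unaligned access is cheap on this tuning */
#else
    return "unknown";   /* not a RISC-V target, or an older compiler */
#endif
}
```

Code with both a byte-wise and a word-wise implementation of a routine could use the same conditionals to choose between them at compile time.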
2023-08-29 | libgccjit: add support for `restrict` attribute on function parameters | Guillaume Gomez | 12 | -2/+216
gcc/jit/Changelog: * jit-playback.cc: Remove trailing whitespace characters. * jit-playback.h: Add get_restrict method. * jit-recording.cc: Add get_restrict methods. * jit-recording.h: Add get_restrict methods. * libgccjit++.h: Add get_restrict methods. * libgccjit.cc: Add gcc_jit_type_get_restrict. * libgccjit.h: Declare gcc_jit_type_get_restrict. * libgccjit.map: Declare gcc_jit_type_get_restrict. gcc/testsuite/ChangeLog: * jit.dg/test-restrict.c: Add test for __restrict__ attribute. * jit.dg/all-non-failing-tests.h: Add test-restrict.c to the list. gcc/jit/ChangeLog: * docs/topics/compatibility.rst: Add documentation for LIBGCCJIT_ABI_25. * docs/topics/types.rst: Add documentation for gcc_jit_type_get_restrict. Signed-off-by: Guillaume Gomez <guillaume1.gomez@gmail.com>
2023-08-29 | RISC-V: Add Types to Un-Typed Vector Instructions | Edwin Lu | 3 | -9/+26
Updates vector instructions to ensure that no instruction is left without a type attribute. Create a placeholder type "vector" for instructions where a type isn't clear. Tested for regressions using rv32/rv64 gc/gcv multilib with newlib/linux. gcc/Changelog: * config/riscv/autovec-vls.md: Update types * config/riscv/riscv.md: Add vector placeholder type * config/riscv/vector.md: Update types Reviewed-by: Jeff Law <jlaw@ventanamicro.com> Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
2023-08-29 | rs6000, add overloaded DFP quantize support | Carl Love | 5 | -1/+266
Add decimal floating point (DFP) quantize built-ins for both 64-bit DFP and 128-bit DFP operands. In each case, there is an immediate version and a variable version of the built-in. The RM value is a 2-bit constant int which specifies the rounding mode to use. For the immediate versions of the built-in, the TE field is a 5-bit constant that specifies the value of the ideal exponent for the result. The built-in specifications are: _Decimal64 __builtin_dfp_quantize (_Decimal64, _Decimal64, const int RM) _Decimal64 __builtin_dfp_quantize (const int TE, _Decimal64, const int RM) _Decimal128 __builtin_dfp_quantize (_Decimal128, _Decimal128, const int RM) _Decimal128 __builtin_dfp_quantize (const int TE, _Decimal128, const int RM) A testcase is added for the new built-in definitions. gcc/ChangeLog: * config/rs6000/dfp.md (UNSPEC_DQUAN): New unspec. (dfp_dqua_<mode>, dfp_dquai_<mode>): New define_insn. * config/rs6000/rs6000-builtins.def (__builtin_dfp_dqua, __builtin_dfp_dquai, __builtin_dfp_dquaq, __builtin_dfp_dquaqi): New built-in definitions. * config/rs6000/rs6000-overload.def (__builtin_dfp_quantize): New overloaded definition. * doc/extend.texi: Add documentation for __builtin_dfp_quantize. gcc/testsuite/ * gcc.target/powerpc/pr93448.c: New test case. PR target/93448
2023-08-29 | analyzer: improve strdup handling [PR105899] | David Malcolm | 3 | -9/+48
gcc/analyzer/ChangeLog: PR analyzer/105899 * kf.cc (kf_strdup::impl_call_pre): Set size of dynamically-allocated buffer. Simulate copying the string from the source region to the new buffer. gcc/testsuite/ChangeLog: PR analyzer/105899 * c-c++-common/analyzer/pr99193-2.c: Add -Wno-analyzer-too-complex. * gcc.dg/analyzer/strdup-1.c: Include "analyzer-decls.h". (test_concrete_strlen): New. (test_symbolic_strlen): New. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
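To illustrate the kind of code this change concerns, here is a minimal sketch of a function that duplicates a string and measures the copy. The name `dup_and_measure` and the self-check are hypothetical and are not taken from strdup-1.c. The assumption is that, with the buffer size set and the copy simulated, `-fanalyzer` can relate the length of the copy to the length of the source.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() under strict ISO C modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate SRC, measure the copy, release it.
   Returns (size_t) -1 if the allocation fails.  */
size_t
dup_and_measure (const char *src)
{
  char *copy = strdup (src);
  if (copy == NULL)
    return (size_t) -1;
  size_t len = strlen (copy);
  free (copy);
  return len;
}

/* Illustrative self-check; not part of the GCC testsuite.  */
void
check_dup_and_measure (void)
{
  assert (dup_and_measure ("hello") == 5);
  assert (dup_and_measure ("") == 0);
}
```

Compiling this with `-fanalyzer` is one way to observe how the analyzer treats the strdup'ed buffer; the exact diagnostics depend on the GCC version.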
2023-08-29RISC-V: Fix one ICE for vect test vect-multitypes-5Pan Li1-0/+23
There will be one ICE when building vect-multitypes-5.c, similar to the one below: riscv64-unknown-elf-gcc -O3 \ -march=rv64imafdcv -mabi=lp64d -mcmodel=medlow \ -fdiagnostics-plain-output -flto -ffat-lto-objects \ --param riscv-autovec-preference=scalable -Wno-psabi \ -ftree-vectorize -fno-tree-loop-distribute-patterns \ -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-details \ gcc/testsuite/gcc.dg/vect/vect-multitypes-5.c -o test.elf -lm The RTL below is not handled well in riscv_legitimize_const_move and falls through to the default handling. The default force_const_mem then returns NULL_RTX, and an ICE occurs when operating on the NULL_RTX. (const:DI (plus:DI (symbol_ref:DI ("ic") [flags 0x2] <var_decl 0x7fe57740be10 ic>) (const_poly_int:DI [16, 16]))) This patch takes care of this RTL in riscv_legitimize_const_move. Signed-off-by: Pan Li <pan2.li@intel.com> Co-Authored-By: Ju-Zhe Zhong <juzhe.zhong@rivai.ai> gcc/ChangeLog: * config/riscv/riscv.cc (riscv_legitimize_poly_move): New declaration. (riscv_legitimize_const_move): Handle ref plus const poly.
2023-08-29RISC-V: Add stub support for existing extensions (unprivileged)Tsukasa OI2-0/+32
After commit c283c4774d1c ("RISC-V: Throw compilation error for unknown extensions") changed how we handle unknown extensions, we have no guarantee that we can share the same architectural string with Binutils (specifically, the assembler). To avoid compilation errors on shared Assembler-C/C++ projects or programs with inline assembler, GCC should support almost all extensions that Binutils supports, even if GCC itself does not touch a thing. This commit adds stub support for standard unprivileged extensions to riscv_ext_version_table and their implications to riscv_implied_info (all information is copied from Binutils' bfd/elfxx-riscv.c except not yet merged 'Zce', 'Zcmp' and 'Zcmt' support). gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_implied_info): Add implications from unprivileged extensions. (riscv_ext_version_table): Add stub support for all unprivileged extensions supported by Binutils as well as 'Zce', 'Zcmp', 'Zcmt'. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-31.c: New test for a stub unprivileged extension 'Zcb' with some implications.
2023-08-29RISC-V: Add stub support for existing extensions (vendor)Tsukasa OI2-0/+29
After commit c283c4774d1c ("RISC-V: Throw compilation error for unknown extensions") changed how we handle unknown extensions, we have no guarantee that we can share the same architectural string with Binutils (specifically, the assembler). To avoid compilation errors on shared Assembler-C/C++ projects or programs with inline assembler, GCC should support almost all extensions that Binutils supports, even if GCC itself does not touch a thing. This commit adds stub support for vendor extensions to riscv_ext_version_table (no riscv_implied_info entries to add; all information is copied from Binutils' bfd/elfxx-riscv.c). gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_ext_version_table): Add stub support for all vendor extensions supported by Binutils. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-30.c: New test for a stub vendor extension 'XVentanaCondOps'.
2023-08-29RISC-V: Add stub support for existing extensions (privileged)Tsukasa OI2-0/+53
After commit c283c4774d1c ("RISC-V: Throw compilation error for unknown extensions") changed how we handle unknown extensions, we have no guarantee that we can share the same architectural string with Binutils (specifically, the assembler). To avoid compilation errors on shared Assembler-C/C++ projects or programs with inline assembler, GCC should support almost all extensions that Binutils supports, even if GCC itself does not touch a thing. As a start, this commit adds stub support for *privileged* extensions to riscv_ext_version_table and their implications to riscv_implied_info (all information is copied from Binutils' bfd/elfxx-riscv.c). gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_implied_info): Add implications from privileged extensions. (riscv_ext_version_table): Add stub support for all privileged extensions supported by Binutils. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-29.c: New test for a stub privileged extension 'Smstateen' with some implications.
2023-08-29RISC-V: Make PR 102957 tests more comprehensiveTsukasa OI1-0/+5
Commit c283c4774d1c ("RISC-V: Throw compilation error for unknown extensions") changed how we handle unknown extensions and commit 6f709f79c915a ("[committed] [RISC-V] Fix expected diagnostic messages in testsuite") "fixed" test failures caused by that change (on pr102957.c, by testing the error message after the first change). However, the latter change partially breaks the original intent of the PR 102957 test case because we wanted to make sure that we can parse a valid two-letter extension name. Fortunately, there is a valid two-letter extension name, 'Zk' (standard scalar cryptography extension superset with NIST algorithm suite). This commit adds pr102957-2.c to make sure that there will be no errors if we parse a valid two-letter extension name. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr102957-2.c: New test case using the 'Zk' extension to continue testing whether we can use valid two-letter extensions.
2023-08-29RISC-V: Refactor and clean expand_cond_len_{unop,binop,ternop}Lehua Ding3-126/+58
This patch refactors the code of expand_cond_len_{unop,binop,ternop} and introduces a new unified function, expand_cond_len_op, to do the main work. The expand_cond_len_{unop,binop,ternop} functions only care about how to pass the operands to the intrinsic patterns. gcc/ChangeLog: * config/riscv/autovec.md: Adjust * config/riscv/riscv-protos.h (RVV_VUNDEF): Clean. (get_vlmax_rtx): Exported. * config/riscv/riscv-v.cc (emit_nonvlmax_fp_ternary_tu_insn): Deleted. (emit_vlmax_masked_gather_mu_insn): Adjust. (get_vlmax_rtx): New func. (expand_load_store): Adjust. (expand_cond_len_unop): Call expand_cond_len_op. (expand_cond_len_op): New subroutine. (expand_cond_len_binop): Call expand_cond_len_op. (expand_cond_len_ternop): Call expand_cond_len_op. (expand_lanes_load_store): Adjust.
2023-08-29tree-ssa-math-opts: Improve uaddc/usubc pattern matching [PR111209]Jakub Jelinek2-1/+196
The usual uaddc/usubc matching is of the .{ADD,SUB}_OVERFLOW pair in the middle, which adds/subtracts carry-in (from lower limbs) and computes carry-out (to higher limbs). Before optimizations (unless the user writes it intentionally that way already), all the steps look the same, but optimizations simplify the handling of the least significant limb (one which adds/subtracts 0 carry-in) to just a single .{ADD,SUB}_OVERFLOW and the handling of the most significant limb if the computed carry-out is ignored to normal addition/subtraction of multiple operands. Now, match_uaddc_usubc has code to turn that least significant .{ADD,SUB}_OVERFLOW call into .U{ADD,SUB}C call with 0 carry-in if a more significant limb above it is matched into .U{ADD,SUB}C; this isn't necessary for functionality, as .ADD_OVERFLOW (x, y) is functionally equal to .UADDC (x, y, 0) (provided the types of operands are the same and result is complex type with that type element), and it also has code to match the most significant limb with ignored carry-out (in that case one pattern match turns both the penultimate limb pair of .{ADD,SUB}_OVERFLOW into .U{ADD,SUB}C and the addition/subtraction of the 4 values (2 carries) into another .U{ADD,SUB}C). As the following patch shows, what we weren't handling is the case when one uses either the __builtin_{add,sub}c builtins or hand written forms thereof (either __builtin_*_overflow or even that written by hand) for just 2 limbs, where the least significant has 0 carry-in and the most significant ignores carry-out. The following patch matches that, e.g. _16 = .ADD_OVERFLOW (_1, _2); _17 = REALPART_EXPR <_16>; _18 = IMAGPART_EXPR <_16>; _15 = _3 + _4; _12 = _15 + _18; into _16 = .UADDC (_1, _2, 0); _17 = REALPART_EXPR <_16>; _18 = IMAGPART_EXPR <_16>; _19 = .UADDC (_3, _4, _18); _12 = REALPART_EXPR <_19>; so that we can emit better code. 
As the 2 later comments show, we must do that carefully, because the pass walks the IL from first to last stmt in a bb and we must avoid pattern matching this way something that should be matched on a later instruction differently. 2023-08-29 Jakub Jelinek <jakub@redhat.com> PR middle-end/79173 PR middle-end/111209 * tree-ssa-math-opts.cc (match_uaddc_usubc): Match also just 2 limb uaddc/usubc with 0 carry-in on lower limb and ignored carry-out on higher limb. Don't match it though if it could be matched later on 4 argument addition/subtraction. * gcc.target/i386/pr79173-12.c: New test.
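As a rough illustration of the source-level shape this commit is about, the following sketch adds two 2-limb values using `__builtin_add_overflow`: the low limb has no carry-in, and the carry-out of the high limb is ignored. The `struct u128` type and the function names are hypothetical; whether a given GCC version actually turns this into a pair of `.UADDC` calls depends on the target and optimization level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2-limb unsigned integer, least significant limb first.  */
struct u128
{
  uint64_t lo;
  uint64_t hi;
};

/* Add two 2-limb values, wrapping modulo 2^128.  */
struct u128
add_2limb (struct u128 a, struct u128 b)
{
  struct u128 r;
  /* Least significant limb: 0 carry-in, carry-out is kept.  */
  uint64_t carry = __builtin_add_overflow (a.lo, b.lo, &r.lo);
  /* Most significant limb: carry-in is consumed, carry-out is ignored.  */
  r.hi = a.hi + b.hi + carry;
  return r;
}

/* Illustrative self-check; not part of the GCC testsuite.  */
void
check_add_2limb (void)
{
  struct u128 r = add_2limb ((struct u128) { UINT64_MAX, 1 },
                             (struct u128) { 1, 2 });
  assert (r.lo == 0 && r.hi == 4);
}
```

The statement computing `r.hi` corresponds to the `_15 = _3 + _4; _12 = _15 + _18;` sequence quoted in the commit message, with `carry` playing the role of `_18`.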
2023-08-29MATCH: Move `(x | y) & (~x ^ y)` over to use bitwise_inverted_equal_pAndrew Pinski2-2/+52
This moves the match pattern `(x | y) & (~x ^ y)` over to use bitwise_inverted_equal_p. This now also allows optimizing comparisons and also catches the missed `(~x | y) & (x ^ y)` transformation into `~x & y`. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/111147 * match.pd (`(x | y) & (~x ^ y)`): Use bitwise_inverted_equal_p instead of matching bit_not. gcc/testsuite/ChangeLog: PR tree-optimization/111147 * gcc.dg/tree-ssa/cmpbit-4.c: New test.
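The two bitwise identities involved can be checked exhaustively on 8-bit values. The sketch below is only a sanity check of the algebra, not of match.pd itself. The commit message states the second result (`~x & y`); that the first pattern simplifies to `x & y` is not spelled out there and is taken here from the algebra. The function name is hypothetical.

```c
#include <assert.h>

/* Verify both identities for every pair of 8-bit values and return
   the number of pairs checked.  */
unsigned
check_bit_identities (void)
{
  unsigned checked = 0;
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        unsigned char a = (unsigned char) x;
        unsigned char b = (unsigned char) y;
        /* (a | b) & (~a ^ b) keeps bits set in both operands.  */
        assert ((unsigned char) ((a | b) & (~a ^ b))
                == (unsigned char) (a & b));
        /* (~a | b) & (a ^ b) keeps bits clear in a and set in b.  */
        assert ((unsigned char) ((~a | b) & (a ^ b))
                == (unsigned char) (~a & b));
        checked++;
      }
  return checked;
}
```

Because both sides are purely bitwise, checking one byte width is enough to cover every bit position independently.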
2023-08-29vect test: Remove xfail for riscvJuzhe-Zhong1-1/+1
We are planning to enable the "vect" testsuite with scalable vector auto-vectorization. This case XPASSes: XPASS: gcc.dg/vect/no-scevccp-outer-12.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1 like ARM SVE. gcc/testsuite/ChangeLog: * gcc.dg/vect/no-scevccp-outer-12.c: Add riscv xfail.
2023-08-29arm: Fix bootstrap / add missing initializer in MVE type_suffixesChristophe Lyon1-1/+1
My recent patch r14-3519-g9bae37ec8dc320 (arm: [MVE intrinsics] add support for p8 and p16 polynomial types) added a new member to type_suffix_info, but I forgot to add the corresponding initializer to type_suffixes. Committed as obvious. 2023-08-29 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins.cc (type_suffixes): Add missing initializer.
2023-08-29RISC-V: Fix ASM check of vlmax_switch_vtype-16.cJuzhe-Zhong1-1/+1
Notice there is a failure: FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-16.c -O2 scan-assembler-times vsetvli\\s+zero,\\s*zero 2 Change "2" to "3"; the assembly is correct and better. Committed. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-16.c: Fix ASM check.
2023-08-29RISC-V: Fix AVL/VL get ICE[VSETVL PASS]Juzhe-Zhong2-16/+31
Fix a bunch of ICEs in the "vect" testsuite: FAIL: gcc.dg/vect/vect-alias-check-16.c (internal compiler error: Segmentation fault) FAIL: gcc.dg/vect/vect-alias-check-16.c (test for excess errors) FAIL: gcc.dg/vect/vect-alias-check-16.c -flto -ffat-lto-objects (internal compiler error: Segmentation fault) FAIL: gcc.dg/vect/vect-alias-check-16.c -flto -ffat-lto-objects (test for excess errors) FAIL: gcc.dg/vect/vect-alias-check-20.c (internal compiler error: Segmentation fault) FAIL: gcc.dg/vect/vect-alias-check-20.c (test for excess errors) FAIL: gcc.dg/vect/vect-alias-check-20.c -flto -ffat-lto-objects (internal compiler error: Segmentation fault) FAIL: gcc.dg/vect/vect-alias-check-20.c -flto -ffat-lto-objects (test for excess errors) gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (vector_insn_info::get_avl_or_vl_reg): New function. (pass_vsetvl::compute_local_properties): Fix bug. (pass_vsetvl::commit_vsetvls): Ditto. * config/riscv/riscv-vsetvl.h: New function.
2023-08-29RISC-V: Fix error combine of pred_mov patternLehua Ding5-49/+106
This patch fixes PR110943, in which wrong code is generated. The cause is an erroneous combine of some pred_mov patterns. Consider this code: ``` void foo9 (void *base, void *out, size_t vl) { int64_t scalar = *(int64_t*)(base + 100); vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1); *(vint64m2_t*)out = v; } ``` RTL before combine pass: ``` (insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ]) (if_then_else:RVVM2DI (unspec:RVVMF32BI [ (const_vector:RVVMF32BI repeat [ (const_int 1 [0x1]) ]) (const_int 1 [0x1]) (const_int 2 [0x2]) repeated x2 (const_int 0 [0]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (const_vector:RVVM2DI repeat [ (const_int 0 [0]) ]) (unspec:RVVM2DI [ (reg:SI 0 zero) ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di}) (insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128]) (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole}) ``` RTL after combine pass: ``` (insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128]) (if_then_else:RVVM2DI (unspec:RVVMF32BI [ (const_vector:RVVMF32BI repeat [ (const_int 1 [0x1]) ]) (const_int 1 [0x1]) (const_int 2 [0x2]) repeated x2 (const_int 0 [0]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (const_vector:RVVM2DI repeat [ (const_int 0 [0]) ]) (unspec:RVVM2DI [ (reg:SI 0 zero) ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di}) ``` This combine changes the semantics of insn 14. I split the @pred_mov pattern and restrict the condition of @pred_mov. PR target/110943 gcc/ChangeLog: * config/riscv/predicates.md (vector_const_int_or_double_0_operand): New predicate. * config/riscv/riscv-vector-builtins.cc (function_expander::function_expander): force_reg mem target operand. * config/riscv/vector.md (@pred_mov<mode>): Wrapper. (*pred_mov<mode>): Remove imm -> reg pattern. (*pred_broadcast<mode>_imm): Add imm -> reg pattern. 
gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Adjust. * gcc.target/riscv/rvv/base/pr110943.c: New test.
2023-08-29LoongArch: Enable '-free' starting at -O2.Lulu Cheng3-2/+28
gcc/ChangeLog: * common/config/loongarch/loongarch-common.cc: Enable '-free' on O2 and above. * doc/invoke.texi: Modify the description information of the '-free' compilation option and add the LoongArch description. gcc/testsuite/ChangeLog: * gcc.target/loongarch/sign-extend.c: New test.
2023-08-29Daily bump.GCC Administrator3-1/+407
2023-08-28RISC-V: Fix documentation of __builtin_riscv_pauseTsukasa OI1-3/+3
This built-in does not imply the 'Xgnuzihintpausestate' extension. It does not change architectural state (because all HINTs are prohibited from doing that). gcc/ChangeLog: * doc/extend.texi: Fix the description of __builtin_riscv_pause.
2023-08-28RISC-V: __builtin_riscv_pause for all environmentTsukasa OI8-13/+41
The "pause" RISC-V hint instruction requires the 'Zihintpause' extension (in the assembler). However, GCC emits "pause" unconditionally, causing an assembler error when compiling code that uses __builtin_riscv_pause while the 'Zihintpause' extension is disabled. The "pause" instruction encoding (0x0100000f) is nevertheless a HINT, so emitting that encoding is safe in any environment. This commit implements handling for the 'Zihintpause' extension and emits ".insn 0x0100000f" instead of "pause" only if the extension is disabled (making the diagnostics better). gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_ext_version_table): Implement the 'Zihintpause' extension, version 2.0. (riscv_ext_flag_table): Add 'Zihintpause' handling. * config/riscv/riscv-builtins.cc: Remove availability predicate "always" and add "hint_pause". (riscv_builtins): Add "pause" extension. * config/riscv/riscv-opts.h (MASK_ZIHINTPAUSE, TARGET_ZIHINTPAUSE): New. * config/riscv/riscv.md (riscv_pause): Adjust output based on TARGET_ZIHINTPAUSE. gcc/testsuite/ChangeLog: * gcc.target/riscv/builtin_pause.c: Removed. * gcc.target/riscv/zihintpause-1.c: New test when the 'Zihintpause' extension is enabled. * gcc.target/riscv/zihintpause-2.c: Likewise. * gcc.target/riscv/zihintpause-noarch.c: New test when the 'Zihintpause' extension is disabled.
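A typical consumer of this built-in is a bounded spin-wait loop. The sketch below is an assumption about how user code might call `__builtin_riscv_pause`; the names `cpu_relax`, `publish_ready` and `wait_ready` are hypothetical. On non-RISC-V targets, or on compilers without the built-in, `cpu_relax` is written to compile to nothing so the example stays portable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag shared between a producer and a spinning consumer;
   static storage, so it starts out false.  */
static atomic_bool ready;

/* Hint to the CPU that we are spinning.  Expands to nothing where the
   RISC-V built-in is not available.  */
static inline void
cpu_relax (void)
{
#if defined (__riscv) && defined (__has_builtin)
# if __has_builtin (__builtin_riscv_pause)
  __builtin_riscv_pause ();
# endif
#endif
}

/* Producer side: mark the data as ready.  */
void
publish_ready (void)
{
  atomic_store_explicit (&ready, true, memory_order_release);
}

/* Consumer side: poll at most MAX_SPINS times; return true once the
   flag has been observed set.  */
bool
wait_ready (unsigned max_spins)
{
  for (unsigned i = 0; i < max_spins; i++)
    {
      if (atomic_load_explicit (&ready, memory_order_acquire))
        return true;
      cpu_relax ();
    }
  return false;
}
```

According to the commit message, with this change the same source should assemble whether or not 'Zihintpause' is part of the `-march` string, since GCC falls back to `.insn 0x0100000f` when the extension is disabled.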
2023-08-28Fix cond-bool-2.c on powerpc and other targetsAndrew Pinski1-1/+1
This adds `--param logical-op-non-short-circuit=1` to the testcase so it becomes a target independent testcase now. I filed PR 111217 as the variant of the testcase which fails independently of the param. Committed as obvious after testing to make sure it passes on powerpc now. gcc/testsuite/ChangeLog: PR testsuite/111215 * gcc.dg/tree-ssa/cond-bool-2.c: Add `--param logical-op-non-short-circuit=1` to the options.