Age  Commit message  Author  Files  Lines
2023-12-12MAINTAINERS: Add myself to write after approval and DCOFeng Wang1-0/+1
ChangeLog: * MAINTAINERS: Add myself to write after approval
2023-12-12Daily bump.GCC Administrator13-1/+596
2023-12-11testsuite: Disable -fstack-protector* for some strub testsJakub Jelinek4-4/+4
In our distro builds, we test with RUNTESTFLAGS='--target_board=unix\{,-fstack-protector-strong\}' because SSP is something we use widely in the distribution. 4 new strub tests FAIL with that option though, as can be seen with a simple make check-gcc check-g++ RUNTESTFLAGS='--target_board=unix\{,-fstack-protector-strong\} dg.exp=strub-O*' - in particular, the expand dump \[(\]call\[^\n\]*strub_leave.*\n\[(\]code_label regexps see code_labels in there introduced for stack protector. The following patch fixes it by using -fno-stack-protector for these explicitly. 2023-12-11 Jakub Jelinek <jakub@redhat.com> * c-c++-common/strub-O2fni.c: Add -fno-stack-protector to dg-options. * c-c++-common/strub-O3fni.c: Likewise. * c-c++-common/strub-Os.c: Likewise. * c-c++-common/strub-Og.c: Likewise.
2023-12-11Fix regression causing ICE for structs with VLAs [PR 112488]Martin Uecker7-11/+65
A previous patch that fixed several ICEs related to size expressions of VM types (PR c/70418, ...) caused a regression for structs where a DECL_EXPR is not generated anymore although required. We now call add_decl_expr introduced by the previous patch from finish_struct. The function is revised with a new argument to not set the TYPE_NAME for the type to the DECL_EXPR in this specific case. PR c/112488 gcc/c * c-decl.cc (add_decl_expr): Revise. (finish_struct): Create DECL_EXPR. * c-parser.cc (c_parser_struct_or_union_specifier): Call finish_struct with expression for VLA sizes. * c-tree.h (finish_struct): Add argument. gcc/testsuite * gcc.dg/pr112488-1.c: New test. * gcc.dg/pr112488-2.c: New test. * gcc.dg/pr112898.c: New test. * gcc.misc-tests/gcov-pr85350.c: Adapt.
2023-12-11Resolve ICE in 'gcc/fortran/trans-openmp.cc:gfc_omp_call_is_alloc'Thomas Schwinge1-1/+1
Fix-up for recent commit 2505a8b41d3b74a545755a278f3750a29c1340b6 "OpenMP: Minor '!$omp allocators' cleanup", which caused: {+FAIL: gfortran.dg/gomp/allocate-5.f90 -O (internal compiler error: tree check: expected class 'type', have 'declaration' (function_decl) in gfc_omp_call_is_alloc, at fortran/trans-openmp.cc:8386)+} [-PASS:-]{+FAIL:+} gfortran.dg/gomp/allocate-5.f90 -O (test for excess errors) ..., and similarly in 'libgomp.fortran/allocators-1.f90', 'libgomp.fortran/allocators-2.f90', 'libgomp.fortran/allocators-3.f90', 'libgomp.fortran/allocators-4.f90', 'libgomp.fortran/allocators-5.f90'. gcc/fortran/ * trans-openmp.cc (gfc_omp_call_is_alloc): Resolve ICE.
2023-12-11analyzer: fix uninitialized bitmap [PR112955]David Malcolm1-0/+1
In r14-5566-g841008d3966c0f I added a new ctor for feasibility_state, but failed to call bitmap_clear on m_snodes_visited. Fixed thusly. gcc/analyzer/ChangeLog: PR analyzer/112955 * engine.cc (feasibility_state::feasibility_state): Initialize m_snodes_visited. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-12-11Treat "p" in asms as addressing VOIDmodeRichard Sandiford3-8/+15
check_asm_operands was inconsistent about how it handled "p" after RA compared to before RA. Before RA it tested the address with a void (unknown) memory mode: case CT_ADDRESS: /* Every address operand can be reloaded to fit. */ result = result || address_operand (op, VOIDmode); break; After RA it deferred to constrain_operands, which used the mode of the operand: if ((GET_MODE (op) == VOIDmode || SCALAR_INT_MODE_P (GET_MODE (op))) && (strict <= 0 || (strict_memory_address_p (recog_data.operand_mode[opno], op)))) win = true; Using the mode of the operand is necessary for special predicates, where it is used to give the memory mode. But for asms, the operand mode is simply the mode of the address itself (so DImode on 64-bit targets), which doesn't say anything about the addressed memory. This patch uses VOIDmode for asms but continues to use the operand mode for .md insns. It's needed to avoid a regression in the testcase with the late-combine pass. Fixing this made me realise that recog_level2 was doing duplicate work for asms after RA. gcc/ * recog.cc (constrain_operands): Pass VOIDmode to strict_memory_address_p for 'p' constraints in asms. * rtl-ssa/changes.cc (recog_level2): Skip redundant constrain_operands for asms. gcc/testsuite/ * gcc.target/aarch64/prfm_imm_offset_2.c: New test.
2023-12-11testsuite: update manglingJason Merrill3-0/+26
Since r14-6064-gc3f281a0c1ca50 this test was checking for the wrong mangling, but it still passed on targets that support ABI compatibility aliases. Let's avoid generating those aliases when checking mangling. gcc/ChangeLog: * common.opt: Add comment. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-explicit-inst1.C: Specify ABI v18. * g++.dg/cpp2a/concepts-explicit-inst1a.C: New test.
2023-12-11-finline-stringops: avoid too-wide smallest_int_mode_for_size [PR112784]Alexandre Oliva2-11/+21
smallest_int_mode_for_size may abort when the requested mode is not available. Call int_mode_for_size instead, that signals the unsatisfiable request in a more graceful way. for gcc/ChangeLog PR middle-end/112784 * expr.cc (emit_block_move_via_loop): Call int_mode_for_size for maybe-too-wide sizes. (emit_block_cmp_via_loop): Likewise. for gcc/testsuite/ChangeLog PR middle-end/112784 * gcc.target/i386/avx512cd-inline-stringops-pr112784.c: New.
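The difference between an aborting lookup and one that reports failure can be sketched outside GCC. The following is only an analogy under stated assumptions: the `toy_` functions and the list of supported widths are hypothetical and do not reproduce the semantics of `smallest_int_mode_for_size` or `int_mode_for_size` in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical integer widths (in bits) that a toy target supports.  */
static const unsigned toy_widths[] = { 8, 16, 32, 64, 128 };
static const size_t toy_nwidths = sizeof toy_widths / sizeof toy_widths[0];

/* Graceful variant: report failure to the caller instead of aborting.
   On success, store the smallest supported width >= SIZE in *OUT.  */
bool
toy_try_width_for_size (unsigned size, unsigned *out)
{
  assert (out != NULL);
  for (size_t i = 0; i < toy_nwidths; i++)
    if (toy_widths[i] >= size)
      {
        *out = toy_widths[i];
        return true;
      }
  return false;
}

/* Aborting variant: only safe when the caller already knows that a
   wide enough width exists.  */
unsigned
toy_smallest_width_for_size (unsigned size)
{
  unsigned width;
  if (!toy_try_width_for_size (size, &width))
    abort ();
  return width;
}
```

A caller handling maybe-too-wide sizes would use the first form and fall back to another strategy when it returns false.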
2023-12-11-finline-stringops: check base blksize for memset [PR112778]Alexandre Oliva2-9/+58
The recently-added logic for -finline-stringops=memset introduced an assumption that doesn't necessarily hold, namely, that can_store_by_pieces of a larger size implies can_store_by_pieces by smaller sizes. Checks for all sizes the by-multiple-pieces machinery might use before committing to an expansion pattern. for gcc/ChangeLog PR target/112778 * builtins.cc (can_store_by_multiple_pieces): New. (try_store_by_multiple_pieces): Call it. for gcc/testsuite/ChangeLog PR target/112778 * gcc.dg/inline-mem-cmp-pr112778.c: New.
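The idea of checking every size before committing can be illustrated with a small sketch. Everything here is hypothetical: `toy_can_store_by_pieces` stands in for a target predicate with a gap in what it supports, and the power-of-two walk is an assumption about which sizes a by-multiple-pieces expansion might use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical target predicate: can a block of LEN bytes be stored
   "by pieces"?  This toy target handles 16-byte and 1..4-byte stores
   but, unusually, not 8-byte ones.  */
bool
toy_can_store_by_pieces (unsigned len)
{
  return len == 16 || (len >= 1 && len <= 4);
}

/* Return true only if every power-of-two piece size from MAX_LEN down
   to MIN_LEN is supported, rather than assuming that support for
   MAX_LEN implies support for the smaller sizes.  */
bool
toy_can_store_by_multiple_pieces (unsigned max_len, unsigned min_len)
{
  assert (min_len >= 1 && min_len <= max_len);
  for (unsigned len = max_len; len >= min_len; len >>= 1)
    if (!toy_can_store_by_pieces (len))
      return false;
  return true;
}
```

On this toy target the largest size alone would pass, but the full check rejects the 16-to-1 range because the 8-byte step is missing.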
2023-12-11-finline-stringops: don't assume ptr_mode ptr in memset [PR112804]Alexandre Oliva2-1/+8
On aarch64 -milp32, and presumably on other such targets, ptr can be in a different mode than ptr_mode in the testcase. Cope with it. for gcc/ChangeLog PR target/112804 * builtins.cc (try_store_by_multiple_pieces): Use ptr's mode for the increment. for gcc/testsuite/ChangeLog PR target/112804 * gcc.target/aarch64/inline-mem-set-pr112804.c: New.
2023-12-11multiflags: fix doc warningAlexandre Oliva1-1/+1
Comply with dubious doc warning that after an @xref there must be a comma or a period, not a closing parenthesis. for gcc/ChangeLog * doc/invoke.texi (multiflags): Add period after @xref to silence warning.
2023-12-11strub: disable on rl78Alexandre Oliva1-0/+5
rl78 allocation of virtual registers to physical registers doesn't operate on asm statements, and strub uses asm statements in the runtime and in the generated code, to the point that the runtime won't build. Force strub disabled on that target. for gcc/ChangeLog * config/rl78/rl78.cc (TARGET_HAVE_STRUB_SUPPORT_FOR): Disable.
2023-12-11strub: add note on attribute accessAlexandre Oliva1-1/+10
Document why attribute access doesn't need the same treatment as fn spec, and check that the assumption behind it holds. for gcc/ChangeLog * ipa-strub.cc (pass_ipa_strub::execute): Check that we don't add indirection to pointer parameters, and document attribute access non-interactions.
2023-12-11libgfortran: Replace mutex with rwlockLipeng Zhu10-58/+386
This patch introduces an rwlock and splits reads and writes to the unit_root tree and unit_cache with the rwlock instead of the mutex to increase CPU efficiency. In the get_gfc_unit function, only around 30% of calls step into the insert_unit function; in most instances, we can get the unit while reading the unit_cache or unit_root tree. Splitting the read and write phases with an rwlock is therefore one approach to make it more parallel. BTW, the IPC metric improves by around 9x on our test server with 220 cores. The benchmark we used is https://github.com/rwesson/NEAT libgcc/ChangeLog: * gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro. (__gthrw): New function. (__gthread_rwlock_rdlock): New function. (__gthread_rwlock_tryrdlock): New function. (__gthread_rwlock_wrlock): New function. (__gthread_rwlock_trywrlock): New function. (__gthread_rwlock_unlock): New function. libgfortran/ChangeLog: * io/async.c (DEBUG_LINE): New macro. * io/async.h (RWLOCK_DEBUG_ADD): New macro. (CHECK_RDLOCK): New macro. (CHECK_WRLOCK): New macro. (TAIL_RWLOCK_DEBUG_QUEUE): New macro. (IN_RWLOCK_DEBUG_QUEUE): New macro. (RDLOCK): New macro. (WRLOCK): New macro. (RWUNLOCK): New macro. (RD_TO_WRLOCK): New macro. (INTERN_RDLOCK): New macro. (INTERN_WRLOCK): New macro. (INTERN_RWUNLOCK): New macro. * io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in a comment. (unit_lock): Remove including associated internal_proto. (unit_rwlock): New declarations including associated internal_proto. (dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock instead of __gthread_mutex_lock and __gthread_mutex_unlock on unit_lock. * io/transfer.c (st_read_done_worker): Use WRLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock. (st_write_done_worker): Likewise. * io/unit.c: Change UNIT_LOCK to UNIT_RWLOCK in 'IO locking rules' comment. Use unit_rwlock variable instead of unit_lock variable. (get_gfc_unit_from_unit_root): New function. 
(get_gfc_unit): Use RDLOCK, WRLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock. (close_unit_1): Use WRLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock. (close_units): Likewise. (newunit_alloc): Use RWUNLOCK on unit_rwlock instead of UNLOCK on unit_lock. * io/unix.c (find_file): Use RDLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock. (flush_all_units): Use WRLOCK and RWUNLOCK on unit_rwlock instead of LOCK and UNLOCK on unit_lock.
2023-12-11PR rtl-optimization/112380: Defend against CLOBBERs in combine.ccRoger Sayle2-3/+39
This patch addresses PR rtl-optimization/112380, an ICE-on-valid regression where a (clobber (const_int 0)) encounters a sanity checking gcc_assert (at line 7554) in simplify-rtx.cc. These CLOBBERs are used internally by GCC's combine pass much like error_mark_node is used by various language front-ends. The solutions are either to handle/accept these CLOBBERs through-out (or in more places in) the middle-end's RTL optimizers, including functions in simplify-rtx.cc that are used by passes other than combine, and/or attempt to prevent these CLOBBERs escaping from try_combine into the RTX/RTL stream. The benefit of the second approach is that it actually allows for better optimization: when try_combine fails to simplify an expression instead of substituting a CLOBBER to avoid the instruction pattern being recognized, noticing the CLOBBER often allows combine to attempt alternate simplifications/transformations looking for those that can be recognized. This first alternative is the minimal fix to address the CLOBBER encountered in the bugzilla PR. 2023-12-11 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR rtl-optimization/112380 * combine.cc (expand_field_assignment): Check if gen_lowpart returned a CLOBBER, and avoid calling gen_simplify_binary with it if so. gcc/testsuite/ChangeLog PR rtl-optimization/112380 * gcc.dg/pr112380.c: New test case.
2023-12-11Testsuite: restrict test to nonpic targetsFrancois-Xavier Coudert1-0/+1
The test is currently failing on x86_64-apple-darwin. gcc/testsuite/ChangeLog: PR testsuite/112297 * gcc.target/i386/pr100936.c: Require nonpic target.
2023-12-11c++: add fixed testcase [PR63378]Patrick Palka1-0/+20
We accept this testcase since r12-4453-g79802c5dcc043a. PR c++/63378 gcc/testsuite/ChangeLog: * g++.dg/template/fnspec3.C: New test.
2023-12-11aarch64: Fix wrong code for bfloat when f16 is enabled [PR 111867]Andrew Pinski1-0/+4
The problem here is that when f16 is enabled, movbf_aarch64 accepts `Ufc` as a constraint: [ w , Ufc ; fconsts , fp16 ] fmov\t%h0, %1 But that is for fmov values and in this case fmov represents f16 rather than bfloat16 values. This means we would get the wrong value in the register. Built and tested for aarch64-linux-gnu with no regressions. Also tested with `-march=armv9-a+sve2`; gcc.dg/torture/bfloat16-basic.c and gcc.dg/torture/bfloat16-builtin.c no longer fail. gcc/ChangeLog: PR target/111867 * config/aarch64/aarch64.cc (aarch64_float_const_representable_p): For BFmode, only accept +0.0. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2023-12-11MATCH: (convert)(zero_one !=/== 0/1) for outer type and zero_one type are the sameAndrew Pinski9-63/+103
When I moved two_value to match.pd, I removed the check for the {0,+-1} as I had placed it after the {0,+-1} case for cond in match.pd. In the case of {0,+-1} and non boolean, before we would optimize those cases to just `(convert)a` but after we would get `(convert)(a != 0)` which was not handled anyways to just `(convert)a`. So this adds a pattern to match `(convert)(zeroone != 0)` and simplify to `(convert)zeroone`. Also this optimizes (convert)(zeroone == 0) into (zeroone^1) if the types match. Removing the opposite transformation from fold. The opposite transformation was added with https://gcc.gnu.org/pipermail/gcc-patches/2006-February/190514.html It is no longer considered the canonicalization either, even VRP will transform it back into `(~a) & 1` so removing it is a good idea. Note the testcase pr69270.c needed a slight update due to not matching exactly a scan pattern, this update makes it more robust and will match before and afterwards and if there are other changes in this area too. Note the testcase gcc.target/i386/pr110790-2.c needs a slight update for better code generation in LP64 bit mode. Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/111972 PR tree-optimization/110637 * match.pd (`(convert)(zeroone !=/== CST)`): Match and simplify to ((convert)zeroone){,^1}. * fold-const.cc (fold_binary_loc): Remove transformation of `(~a) & 1` and `(a ^ 1) & 1` into `(convert)(a == 0)`. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr110637-1.c: New test. * gcc.dg/tree-ssa/pr110637-2.c: New test. * gcc.dg/tree-ssa/pr110637-3.c: New test. * gcc.dg/tree-ssa/pr111972-1.c: New test. * gcc.dg/tree-ssa/pr69270.c: Update testcase. * gcc.target/i386/pr110790-2.c: Update testcase. * gcc.dg/fold-even-1.c: Removed. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
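The two identities behind this pattern can be checked exhaustively for zero-one values. This is a plain C illustration, not match.pd syntax; the function names are made up for the example.

```c
#include <assert.h>

/* For X known to be 0 or 1, (int) (X != 0) is just (int) X.  */
int convert_ne_zero (unsigned char x) { return (int) (x != 0); }
int convert_plain (unsigned char x) { return (int) x; }

/* For X known to be 0 or 1, (X == 0) is X ^ 1.  */
unsigned char eq_zero (unsigned char x) { return (unsigned char) (x == 0); }
unsigned char xor_one (unsigned char x) { return (unsigned char) (x ^ 1); }

/* Return 1 if both identities hold for every zero-one value.  */
int
zero_one_identities_hold (void)
{
  for (unsigned char x = 0; x <= 1; x++)
    {
      assert (x == 0 || x == 1);
      if (convert_ne_zero (x) != convert_plain (x))
        return 0;
      if (eq_zero (x) != xor_one (x))
        return 0;
    }
  return 1;
}
```

The zero-one precondition matters: for x = 2 the first pair already disagrees, since `(int) (2 != 0)` is 1 while `(int) 2` is 2.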
2023-12-11analyzer: Remove check of unsigned_char in maybe_undo_optimize_bit_field_compare.Andrew Pinski1-3/+0
The check for the type seems unnecessary and gets in the way sometimes. Also with a patch I am working on for match.pd, it causes a failure to happen. Before my patch the IR was:
  _1 = BIT_FIELD_REF <s, 8, 16>;
  _2 = _1 & 1;
  _3 = _2 != 0;
  _4 = (int) _3;
  __analyzer_eval (_4);
Where _2 was an unsigned char type. After my patch we have:
  _1 = BIT_FIELD_REF <s, 8, 16>;
  _2 = (int) _1;
  _3 = _2 & 1;
  __analyzer_eval (_3);
But in this case, the BIT_AND_EXPR is in an int type. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/analyzer/ChangeLog:
  * region-model-manager.cc (maybe_undo_optimize_bit_field_compare): Remove the check for type being unsigned_char_type_node.
2023-12-11expr: catch more `a*bool` while expanding [PR 112935]Andrew Pinski1-2/+3
After r14-1655-g52c92fb3f40050 (and the other commits which touch zero_one_valued_p), we end up with `bool * a` but where the bool is an SSA name that might not have its non-zero bits recorded (as 0x1) even though the non-zero bits would be 0x1. In the case of coremarks, it is only phiopt4 which adds the new ssa name and nothing afterwards updates the nonzero bits on it. This fixes the regression by using gimple_zero_one_valued_p rather than tree_nonzero_bits to match the cases where the SSA_NAME didn't have the non-zero bits set. gimple_zero_one_valued_p handles one level of cast and also an `&`. Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: PR middle-end/112935 * expr.cc (expand_expr_real_2): Use gimple_zero_one_valued_p instead of tree_nonzero_bits to find boolean defined expressions. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
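Recognizing that one operand is zero-one valued is useful because a multiply by such a value can be replaced by cheaper operations. The sketch below shows one such equivalence in plain C; that the expander uses exactly this form is an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply A by B, where B is required to be 0 or 1.  */
uint32_t
mul_by_bool (uint32_t a, uint32_t b)
{
  assert (b <= 1);
  return a * b;
}

/* Same result without a multiply: 0 - B is all-zeros or all-ones.  */
uint32_t
and_with_neg_bool (uint32_t a, uint32_t b)
{
  assert (b <= 1);
  return a & (0u - b);
}
```

Both functions return `a` when `b` is 1 and 0 when `b` is 0, which is why knowing that `b` is zero-one valued is enough to pick the cheaper form.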
2023-12-11[PATCH] wrong code on m68k with -mlong-jump-table-offsets and -malign-int (PR target/112413)Mikael Pettersson3-6/+6
On m68k the compiler assumes that the PC-relative jump-via-jump-table instruction and the jump table are adjacent with no padding in between. When -mlong-jump-table-offsets is combined with -malign-int, a 2-byte nop may be inserted before the jump table, causing the jump to add the fetched offset to the wrong PC base and thus jump to the wrong address. Fixed by referencing the jump table via its label. On the test case in the PR the object code change is (the moveal at 16 is the nop):
     a: 6536        bcss 42 <f+0x42>
     c: e588        lsll #2,%d0
     e: 203b 0808   movel %pc@(18 <f+0x18>,%d0:l),%d0
  - 12: 4efb 0802   jmp %pc@(16 <f+0x16>,%d0:l)
  + 12: 4efb 0804   jmp %pc@(18 <f+0x18>,%d0:l)
    16: 284c        moveal %a4,%a4
    18: 0000 0020   orib #32,%d0
    1c: 0000 002c   orib #44,%d0
Bootstrapped and tested on m68k-linux-gnu, no regressions. Note: I don't have commit rights so I would need assistance applying this.
PR target/112413
gcc/
  * config/m68k/linux.h (ASM_RETURN_CASE_JUMP): For TARGET_LONG_JUMP_TABLE_OFFSETS, reference the jump table via its label.
  * config/m68k/m68kelf.h (ASM_RETURN_CASE_JUMP): Likewise.
  * config/m68k/netbsd-elf.h (ASM_RETURN_CASE_JUMP): Likewise.
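The address arithmetic in the listing can be modelled in a few lines. This is a toy model under stated assumptions: `toy_jump_target` is a hypothetical helper, and it assumes, as the disassembly suggests, that the PC base of the `jmp` is the address of its extension word (the instruction address plus 2).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of "jmp %pc@(disp,%d0:l)": the PC base is the address of
   the extension word, i.e. the instruction address plus 2.  INDEX is
   the offset previously fetched from the jump table into %d0.  */
uint32_t
toy_jump_target (uint32_t insn_addr, int8_t disp, uint32_t index)
{
  assert (insn_addr % 2 == 0);  /* instructions are word aligned */
  return insn_addr + 2 + (uint32_t) (int32_t) disp + index;
}
```

With the instruction at 0x12 and the first table entry 0x20, a displacement of 2 yields 0x36 (relative to the nop at 0x16), while the corrected displacement of 4 yields 0x38 (relative to the table at 0x18).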
2023-12-11aarch64: enable mixed-types for aarch64 simdclonesAndre Vieira37-116/+659
This patch enables the use of mixed-types for simd clones for AArch64, adds aarch64 as a target_vect_simd_clones and corrects the way the simdlen is chosen for non-specified simdlen clauses according to the 'Vector Function Application Binary Interface Specification for AArch64'. Additionally this patch also restricts combinations of simdlen and return/argument types that map to vectors larger than 128 bits as we currently do not have a way to represent these types in a way that is consistent internally and externally. gcc/ChangeLog: * config/aarch64/aarch64.cc (lane_size): New function. (aarch64_simd_clone_compute_vecsize_and_simdlen): Determine simdlen according to NDS rule and reject combination of simdlen and types that lead to vectors larger than 128bits. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add aarch64 targets to vect_simd_clones. * c-c++-common/gomp/declare-variant-14.c: Adapt test for aarch64. * c-c++-common/gomp/pr60823-1.c: Likewise. * c-c++-common/gomp/pr60823-2.c: Likewise. * c-c++-common/gomp/pr60823-3.c: Likewise. * g++.dg/gomp/attrs-10.C: Likewise. * g++.dg/gomp/declare-simd-1.C: Likewise. * g++.dg/gomp/declare-simd-3.C: Likewise. * g++.dg/gomp/declare-simd-4.C: Likewise. * g++.dg/gomp/declare-simd-7.C: Likewise. * g++.dg/gomp/declare-simd-8.C: Likewise. * g++.dg/gomp/pr88182.C: Likewise. * gcc.dg/declare-simd.c: Likewise. * gcc.dg/gomp/declare-simd-1.c: Likewise. * gcc.dg/gomp/declare-simd-3.c: Likewise. * gcc.dg/gomp/pr87887-1.c: Likewise. * gcc.dg/gomp/pr87895-1.c: Likewise. * gcc.dg/gomp/pr89246-1.c: Likewise. * gcc.dg/gomp/pr99542.c: Likewise. * gcc.dg/gomp/simd-clones-2.c: Likewise. * gcc.dg/vect/vect-simd-clone-1.c: Likewise. * gcc.dg/vect/vect-simd-clone-2.c: Likewise. * gcc.dg/vect/vect-simd-clone-4.c: Likewise. * gcc.dg/vect/vect-simd-clone-5.c: Likewise. * gcc.dg/vect/vect-simd-clone-6.c: Likewise. * gcc.dg/vect/vect-simd-clone-7.c: Likewise. * gcc.dg/vect/vect-simd-clone-8.c: Likewise. 
* gfortran.dg/gomp/declare-simd-2.f90: Likewise. * gfortran.dg/gomp/declare-simd-coarray-lib.f90: Likewise. * gfortran.dg/gomp/declare-variant-14.f90: Likewise. * gfortran.dg/gomp/pr79154-1.f90: Likewise. * gfortran.dg/gomp/pr83977.f90: Likewise. libgomp/ChangeLog: * testsuite/libgomp.c/declare-variant-1.c: Adapt test for aarch64. * testsuite/libgomp.fortran/declare-simd-1.f90: Likewise.
2023-12-11c++: alias CTAD and specializations tablePatrick Palka4-1/+29
A rewritten guide for alias CTAD isn't really a specialization of the original guide, so we shouldn't register it as such. This avoids an ICE in the below modules testcase for which we otherwise crash due to the guide's empty DECL_CONTEXT when walking the specializations table. It also preemptively avoids the same ICE in modules/concept-6 in C++23 mode with the inherited CTAD patch. gcc/cp/ChangeLog: * pt.cc (alias_ctad_tweaks): Pass use_spec_table=false to tsubst_decl. gcc/testsuite/ChangeLog: * g++.dg/modules/concept-8.h: New test. * g++.dg/modules/concept-8_a.H: New test. * g++.dg/modules/concept-8_b.C: New test.
2023-12-11RISC-V: testsuite: Fix strcmp-run.c test.Robin Dapp3-14/+15
This fixes expectations in the strcmp-run test which would sometimes fail with newlib. The test expects libc strcmp return values and asserts the vectorized result is similar to those. Therefore hard-code the expected results instead of relying on a strcmp call. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c: Adjust test expectation and target selector. * gcc.target/riscv/rvv/autovec/builtin/strlen-run.c: Adjust target selector. * gcc.target/riscv/rvv/autovec/builtin/strncmp-run.c: Ditto.
2023-12-11OpenMP: Support acquires/release in 'omp require atomic_default_mem_order'Tobias Burnus16-32/+291
This is an OpenMP 5.2 feature. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_requires): Handle acquires/release in atomic_default_mem_order clause. (c_parser_omp_atomic): Update. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_requires): Handle acquires/release in atomic_default_mem_order clause. (cp_parser_omp_atomic): Update. gcc/fortran/ChangeLog: * gfortran.h (enum gfc_omp_requires_kind): Add OMP_REQ_ATOMIC_MEM_ORDER_ACQUIRE and OMP_REQ_ATOMIC_MEM_ORDER_RELEASE. (gfc_namespace): Add a 7th bit to omp_requires. * module.cc (enum ab_attribute): Add AB_OMP_REQ_MEM_ORDER_ACQUIRE and AB_OMP_REQ_MEM_ORDER_RELEASE (mio_symbol_attribute): Handle it. * openmp.cc (gfc_omp_requires_add_clause): Update for acquire/release. (gfc_match_omp_requires): Likewise. (gfc_match_omp_atomic): Handle them for atomic_default_mem_order. * parse.cc: Likewise. gcc/testsuite/ChangeLog: * c-c++-common/gomp/requires-3.c: Update for now valid code. * gfortran.dg/gomp/requires-3.f90: Likewise. * gfortran.dg/gomp/requires-2.f90: Update dg-error. * gfortran.dg/gomp/requires-5.f90: Likewise. * c-c++-common/gomp/requires-5.c: New test. * c-c++-common/gomp/requires-6.c: New test. * c-c++-common/gomp/requires-7.c: New test. * c-c++-common/gomp/requires-8.c: New test. * gfortran.dg/gomp/requires-10.f90: New test. * gfortran.dg/gomp/requires-11.f90: New test.
2023-12-11OpenMP: Minor '!$omp allocators' cleanupTobias Burnus2-2/+9
gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): Set 'fn spec'. libgomp/ChangeLog: * libgomp_g.h (GOMP_add_alloc, GOMP_is_alloc): Add.
2023-12-11ada: Fix Ada bootstrap on FreeBSDRainer Orth1-1/+6
Ada bootstrap on FreeBSD/amd64 was also broken by the recent warning changes: terminals.c: In function 'allocate_pty_desc': terminals.c:1200:12: error: implicit declaration of function 'openpty'; did you mean 'openat'? [-Wimplicit-function-declaration] 1200 | status = openpty (&master_fd, &slave_fd, NULL, NULL, NULL); | ^~~~~~~ | openat terminals.c: At top level: terminals.c:1268:9: warning: "TABDLY" redefined 1268 | #define TABDLY 0 | ^~~~~~ In file included from /usr/include/termios.h:38, from terminals.c:1109: /usr/include/sys/_termios.h:111:9: note: this is the location of the previous definition 111 | #define TABDLY 0x00000004 /* tab delay mask */ | ^~~~~~ make[7]: *** [../gcc-interface/Makefile:302: terminals.o] Error 1 Fixed by including the necessary header and guarding the fallback definition of TABDLY. This allowed a 64-bit-only bootstrap on x86_64-unknown-freebsd14.0 to complete successfully. 2023-12-11 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/ada: * terminals.c [__FreeBSD__]: Include <libutil.h>. (TABDLY): Only define if missing.
2023-12-11RTL-SSA: Fix ICE on record_use of RTL_SSA for RISC-V VSETVL PASSJuzhe-Zhong2-3/+29
This patch fixes an ICE in record_use during RTL_SSA initialization in the RISC-V backend VSETVL pass. This is the ICE:
  0x11a8603 partial_subreg_p(machine_mode, machine_mode)
          ../../../../gcc/gcc/rtl.h:3187
  0x3b695eb rtl_ssa::function_info::record_use(rtl_ssa::function_info::build_info&, rtl_ssa::insn_info*, rtx_obj_reference)
          ../../../../gcc/gcc/rtl-ssa/insns.cc:524
In record_use:
  if (HARD_REGISTER_NUM_P (regno)
      && partial_subreg_p (use->mode (), mode))
The assertion failed in partial_subreg_p, which is:
  inline bool
  partial_subreg_p (machine_mode outermode, machine_mode innermode)
  {
    /* Modes involved in a subreg must be ordered. In particular, we must
       always know at compile time whether the subreg is paradoxical. */
    poly_int64 outer_prec = GET_MODE_PRECISION (outermode);
    poly_int64 inner_prec = GET_MODE_PRECISION (innermode);
    gcc_checking_assert (ordered_p (outer_prec, inner_prec)); -----> cause ICE.
    return maybe_lt (outer_prec, inner_prec);
  }
The RISC-V VSETVL pass is an advanced lazy vsetvl insertion pass after RA (register allocation). The root cause is that we have a pattern (reduction instruction) that includes both VLA (length-agnostic) and VLS (fixed-length) modes.
  (insn 168 173 170 31 (set (reg:RVVM1SI 101 v5 [311])
          (unspec:RVVM1SI [
                  (unspec:V32BI [
                          (const_vector:V32BI [
                                  (const_int 1 [0x1]) repeated x32
                              ])
                          (reg:DI 30 t5 [312])
                          (const_int 2 [0x2]) repeated x2
                          (reg:SI 66 vl)
                          (reg:SI 67 vtype)
                      ] UNSPEC_VPREDICATE)
                  (unspec:RVVM1SI [
                          (reg:V32SI 96 v0 [orig:185 vect__96.40 ] [185]) -----> VLS mode NUNITS = 32 elements.
                          (reg:RVVM1SI 113 v17 [439]) -----> VLA mode NUNITS = [8, 8] elements.
                      ] UNSPEC_REDUC_XOR)
                  (unspec:RVVM1SI [
                          (reg:SI 0 zero)
                      ] UNSPEC_VUNDEF)
              ] UNSPEC_REDUC)) 15948 {pred_redxorv32si}
In this case, record_use is trying to check partial_subreg_p (use->mode (), mode) for RTX = (reg:V32SI 96 v0 [orig:185 vect__96.40 ] [185]). use->mode () == V32SImode, whereas mode = RVVM1SImode. It then ICEs since they are !ordered_p.
Set the use mode to the biggest mode, which is the natural fallback mode.
gcc/ChangeLog:
  * rtl-ssa/insns.cc (function_info::record_use): Add !ordered_p case.
gcc/testsuite/ChangeLog:
  * gcc.target/riscv/rvv/vsetvl/vsetvl_bug-2.c: New test.
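Why a fixed-length and a length-agnostic precision can fail the ordered_p check is easier to see with a toy model. The sketch below is not GCC's poly_int implementation: `struct toy_poly` and the `toy_` functions are hypothetical, and the precisions used in the note after the code assume 32-bit elements with the NUNITS values from the dump annotations.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy two-coefficient value c0 + c1 * x, where x >= 0 is only known at
   run time (a stand-in for a poly_int64).  */
struct toy_poly { long c0, c1; };

/* True if A <= B for every x >= 0.  */
bool
toy_known_le (struct toy_poly a, struct toy_poly b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

/* True if A and B compare the same way for every x >= 0.  */
bool
toy_ordered_p (struct toy_poly a, struct toy_poly b)
{
  return toy_known_le (a, b) || toy_known_le (b, a);
}

/* Mirror of the shape of partial_subreg_p: the precisions must be
   ordered before asking whether OUTER may be narrower than INNER.  */
bool
toy_partial_subreg_p (struct toy_poly outer, struct toy_poly inner)
{
  assert (toy_ordered_p (outer, inner));
  return outer.c0 < inner.c0 || outer.c1 < inner.c1;
}
```

A fixed precision of 1024 bits, `{1024, 0}`, and a scalable one of `{256, 256}` are not ordered: the scalable value is smaller at x = 0 but larger from x = 4 onwards, so neither is known to be less than or equal to the other.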
2023-12-11RISC-V: Robustify shuffle index used by vrgather and fix regressionJuzhe-Zhong2-33/+49
Notice there are some regression FAILs:
FAIL: gcc.target/riscv/rvv/autovec/pr110950.c -O3 -ftree-vectorize scan-assembler-times vslide1up\\.vx 1
FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c -std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax scan-assembler-times vrgather\\.vv\\tv[0-9]+,\\s*v[0-9]+,\\s*v[0-9]+ 19
FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c -std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax scan-assembler-times vrgatherei16\\.vv\\tv[0-9]+,\\s*v[0-9]+,\\s*v[0-9]+ 12
FAIL: gcc.target/riscv/rvv/autovec/vls/perm-4.c -O3 -ftree-vectorize --param riscv-autovec-preference=scalable scan-assembler-times vrgather\\.vv\\tv[0-9]+,\\s*v[0-9]+,\\s*v[0-9]+ 19
FAIL: gcc.target/riscv/rvv/autovec/vls/perm-4.c -O3 -ftree-vectorize --param riscv-autovec-preference=scalable scan-assembler-times vrgatherei16\\.vv\\tv[0-9]+,\\s*v[0-9]+,\\s*v[0-9]+ 12
pr110950 is not a regression; adapting the testcase is enough. The remaining FAILs, which are caused by this patch: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d9dd06ad51b7479f09acb88adf404664a1e18b2a need to be recovered. Robustify the gather index to fix those FAILs.
gcc/ChangeLog:
  * config/riscv/riscv-v.cc (get_gather_index_mode): New function.
  (shuffle_series_patterns): Robustify shuffle index.
  (shuffle_generic_patterns): Ditto.
gcc/testsuite/ChangeLog:
  * gcc.target/riscv/rvv/autovec/pr110950.c: Adapt test.
2023-12-11Testsuite, asan, darwin: Adjust output patternFrancois-Xavier Coudert1-1/+1
Since the last import from upstream libsanitizer, the output has changed and now looks more like this: READ of size 6 at 0x7ff7beb2a144 thread T0 #0 0x101cf7796 in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) sanitizer_common_interceptors.inc:813 #1 0x101cf7b99 in memcmp sanitizer_common_interceptors.inc:840 #2 0x108a0c39f in __stack_chk_guard+0xf (dyld:x86_64+0x8039f) so let's adjust the pattern accordingly. gcc/testsuite/ChangeLog: * c-c++-common/asan/memcmp-1.c: Adjust pattern on darwin.
2023-12-11aarch64: arm_neon.h - Fix -Wincompatible-pointer-types errorsVictor Do Nascimento1-13/+21
In the Linux kernel, u64/s64 are [un]signed long long, not [un]signed long. This means that when the `arm_neon.h' header is used by the kernel, any use of the `uint64_t' / `int64_t' types needs to be correctly cast to the correct `__builtin_aarch64_simd_di' / `__builtin_aarch64_simd_df' types when calling the relevant ACLE builtins. This patch adds the necessary fixes to ensure that `vstl1_*' and `vldap1_*' intrinsics are correctly defined for use by the kernel. gcc/ChangeLog: * config/aarch64/arm_neon.h (vldap1_lane_u64): Add `const' to `__builtin_aarch64_simd_di *' cast. (vldap1q_lane_u64): Likewise. (vldap1_lane_s64): Cast __src to `const __builtin_aarch64_simd_di *'. (vldap1q_lane_s64): Likewise. (vldap1_lane_f64): Cast __src to `const __builtin_aarch64_simd_df *'. (vldap1q_lane_f64): Cast __src to `const __builtin_aarch64_simd_df *'. (vldap1_lane_p64): Add `const' to `__builtin_aarch64_simd_di *' cast. (vldap1q_lane_p64): Add `const' to `__builtin_aarch64_simd_di *' cast. (vstl1_lane_u64): remove stray `const'. (vstl1_lane_s64): Cast __src to `__builtin_aarch64_simd_di *'. (vstl1q_lane_s64): Likewise. (vstl1_lane_f64): Cast __src to `const __builtin_aarch64_simd_df *'. (vstl1q_lane_f64): Likewise.
2023-12-11d: Merge upstream dmd, druntime 2bbf64907c, phobos b64bfbf91Iain Buclaw45-518/+579
D front-end changes: - Import dmd v2.106.0. D runtime changes: - Import druntime v2.106.0. Phobos changes: - Import phobos v2.106.0. gcc/d/ChangeLog: * Make-lang.in (D_FRONTEND_OBJS): Rename d/common-string.o to d/common-smallbuffer.o. * dmd/MERGE: Merge upstream dmd 2bbf64907c. * dmd/VERSION: Bump version to v2.106.0. * modules.cc (layout_moduleinfo_fields): Update for new front-end interface. (layout_moduleinfo): Likewise. libphobos/ChangeLog: * libdruntime/MERGE: Merge upstream druntime 2bbf64907c. * src/MERGE: Merge upstream phobos b64bfbf91.
2023-12-11RISC-V: Rename test[NFC]Juzhe-Zhong1-0/+2
Since I want to commit multiple tests which are fixing vsetvl bugs, rename this one to make the testcases easier to maintain. Committed as it is obvious. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/avl_use_bug-1.c: Moved to... * gcc.target/riscv/rvv/vsetvl/vsetvl_bug-1.c: ...here.
2023-12-11RISC-V: Recognize stepped series in expand_vec_perm_const.Robin Dapp1-2/+64
We currently try to recognize various forms of stepped (const_vector) sequence variants in expand_const_vector. Because of complications with canonicalization and encoding it is easier to identify such patterns in expand_vec_perm_const_1 already where perm.series_p () is available. This patch introduces shuffle_series as new permutation pattern and tries to recognize series like [base0 base1 base1 + step ...]. If such a series is found the series is expanded by expand_vec_series and a gather is emitted. On top the patch fixes the step recognition in expand_const_vector for stepped series where such a series would end up before. This fixes several execution failures when running code compiled for a scalable vector size of 128 on a target with vlen = 256 or higher. The problem was only noticed there because the encoding for a reversed [2 2]-element vector ("3 2 1 0") is { [1 2], [0 2], [1 4] }. Some testcases that failed were: vect-alias-check-18.c vect-alias-check-1.F90 pr64365.c On a 128-bit target, only the first two elements are used. The third element causing the complications only comes into effect at vlen = 256. With this patch the testsuite results are similar with vlen = 128, vlen = 256 as well as vlen = 512 (apart from the fixed-vlmax tests of course). gcc/ChangeLog: PR target/112853 * config/riscv/riscv-v.cc (expand_const_vector): Fix step calculation. (modulo_sel_indices): Also perform modulo for variable-length constants. (shuffle_series): Recognize series permutations. (expand_vec_perm_const_1): Add shuffle_series.
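The shape of series this commit looks for, `[base0 base1 base1 + step ...]`, can be illustrated with an element-wise check. This is a hypothetical helper over a plain index array, not the encoded-pattern query that `perm.series_p ()` performs inside GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if SEL[1..N-1] forms an arithmetic series, i.e. SEL looks
   like { base0, base1, base1 + step, base1 + 2 * step, ... }.  On
   success store the step in *STEP.  Requires N >= 3.  */
bool
toy_series_p (const int *sel, size_t n, int *step)
{
  assert (sel != NULL && step != NULL && n >= 3);
  int s = sel[2] - sel[1];
  for (size_t i = 3; i < n; i++)
    if (sel[i] - sel[i - 1] != s)
      return false;
  *step = s;
  return true;
}
```

The reversed selector "3 2 1 0" mentioned above passes this check with a step of -1, while a selector such as "0 1 3 4" does not.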
2023-12-11Testsuite, i386: mark test as requiring dfpFrancois-Xavier Coudert1-0/+1
Test currently fails on darwin with: error: decimal floating-point not supported for this target gcc/testsuite/ChangeLog: * gcc.target/i386/pr112445.c: Require dfp.
2023-12-11Simplify vector ((VCE (a cmp b ? -1 : 0)) < 0) ? c : d to just (VCE ((a cmp ↵liuhongt3-0/+74
b) ? (VCE c) : (VCE d))). While working on PR112443, I noticed a missed optimization: after we fold _mm{,256}_blendv_epi8/pd/ps into gimple, the backend fails to combine it back to v{,p}blendv{v,ps,pd} since the pattern is too complicated, so I think we should handle it at the gimple level. The dump is like _1 = c_3(D) >= { 0, 0, 0, 0 }; _2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>; _7 = VIEW_CONVERT_EXPR<vector(32) char>(_2); _8 = VIEW_CONVERT_EXPR<vector(32) char>(b_6(D)); _9 = VIEW_CONVERT_EXPR<vector(32) char>(a_5(D)); _10 = _7 < { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; _11 = VEC_COND_EXPR <_10, _8, _9>; It can be optimized to _1 = c_2(D) >= { 0, 0, 0, 0 }; _6 = VEC_COND_EXPR <_1, b_5(D), a_4(D)>; Since each element of _7 is either -1 or 0, the selection _7 < 0 ? _8 : _9 should be equal to _1 ? b : a as long as TYPE_PRECISION of the component type of the second VEC_COND_EXPR is less than or equal to that of the first one. The patch adds a gimple pattern to handle that. gcc/ChangeLog: * match.pd (VCE (a cmp b ? -1 : 0) < 0) ? c : d ---> (VCE ((a cmp b) ? (VCE:c) : (VCE:d))): New gimple simplification. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512vl-blendv-3.c: New test. * gcc.target/i386/blendv-3.c: New test.
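The equivalence behind this fold can be checked with a scalar model in portable C. This is a sketch under assumptions, not the match.pd pattern: the function names are made up, and the 32-byte vector from the dump is assumed to hold four 64-bit lanes. The byte-wise version mirrors the gimple before the fold, the lane-wise version mirrors it after.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 4                           /* assumed: 4 x 64-bit lanes */
#define BYTES (LANES * sizeof (int64_t))  /* 32 bytes, as in the dump */

/* Before the fold: build an all-ones / all-zeros lane mask, view mask and
   operands as bytes, then select per byte on the sign of the mask byte.  */
static void
blend_bytewise (const int64_t *c, const int64_t *a, const int64_t *b,
                int64_t *out)
{
  int64_t mask[LANES];
  int8_t m[BYTES], ab[BYTES], bb[BYTES], r[BYTES];

  for (int i = 0; i < LANES; i++)
    mask[i] = c[i] >= 0 ? -1 : 0;         /* VEC_COND_EXPR <_1, -1, 0> */

  memcpy (m, mask, BYTES);                /* VIEW_CONVERT_EXPR to char */
  memcpy (ab, a, BYTES);
  memcpy (bb, b, BYTES);

  for (size_t i = 0; i < BYTES; i++)
    r[i] = m[i] < 0 ? bb[i] : ab[i];      /* _7 < 0 ? _8 : _9 */

  memcpy (out, r, BYTES);
}

/* After the fold: select whole lanes directly on the comparison.  */
static void
blend_lanewise (const int64_t *c, const int64_t *a, const int64_t *b,
                int64_t *out)
{
  for (int i = 0; i < LANES; i++)
    out[i] = c[i] >= 0 ? b[i] : a[i];
}

/* Assert that both forms agree for one set of inputs.  */
static void
check_fold (const int64_t *c, const int64_t *a, const int64_t *b)
{
  int64_t before[LANES], after[LANES];
  blend_bytewise (c, a, b, before);
  blend_lanewise (c, a, b, after);
  assert (memcmp (before, after, BYTES) == 0);
}
```

Both forms agree because every byte of an all-ones lane is negative and every byte of an all-zeros lane is not, so the byte-wise selection never mixes a and b within a lane.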
2023-12-11Testsuite, Darwin: actually skip testFrancois-Xavier Coudert1-1/+1
Previous commit xfailed instead of skipping, but we really want to skip. gcc/testsuite/ChangeLog: * gcc.target/i386/libcall-1.c: Skip on darwin.
2023-12-11RISC-V: Support highest overlap for wv instructionsJuzhe-Zhong4-42/+360
According to RVV ISA, we can allow vwadd.wv v2, v2, v3 overlap. Before this patch: nop vsetivli zero,4,e8,m4,tu,ma vle16.v v8,0(a0) vmv8r.v v0,v8 vwsub.wv v0,v8,v12 nop addi a4,a0,100 vle16.v v8,0(a4) vmv8r.v v24,v8 vwsub.wv v24,v8,v12 nop addi a4,a0,200 vle16.v v8,0(a4) vmv8r.v v16,v8 vwsub.wv v16,v8,v12 nop After this patch: nop vsetivli zero,4,e8,m4,tu,ma vle16.v v0,0(a0) vwsub.wv v0,v0,v4 nop addi a4,a0,100 vle16.v v24,0(a4) vwsub.wv v24,v24,v28 nop addi a4,a0,200 vle16.v v16,0(a4) vwsub.wv v16,v16,v20 PR target/112431 gcc/ChangeLog: * config/riscv/vector.md: Support highest overlap for wv instructions. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr112431-39.c: New test. * gcc.target/riscv/rvv/base/pr112431-40.c: New test. * gcc.target/riscv/rvv/base/pr112431-41.c: New test.
2023-12-11RISC-V: Fix ICE in extract_single_sourceJuzhe-Zhong2-0/+41
This patch fixes the following ICE in VSETVL PASS: bug.c:39:1: internal compiler error: Segmentation fault 39 | } | ^ 0x1ad5a08 crash_signal ../../../../gcc/gcc/toplev.cc:316 0x7f7f55feb90f ??? ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0 0x218d7c7 extract_single_source ../../../../gcc/gcc/config/riscv/riscv-vsetvl.cc:583 0x218d95d extract_single_source ../../../../gcc/gcc/config/riscv/riscv-vsetvl.cc:604 0x218fbc5 pre_vsetvl::compute_lcm_local_properties() ../../../../gcc/gcc/config/riscv/riscv-vsetvl.cc:2703 0x2190ef4 pre_vsetvl::earliest_fuse_vsetvl_info() ../../../../gcc/gcc/config/riscv/riscv-vsetvl.cc:2890 0x2193e62 pass_vsetvl::lazy_vsetvl() ../../../../gcc/gcc/config/riscv/riscv-vsetvl.cc:3537 0x219406a pass_vsetvl::execute(function*) ../../../../gcc/gcc/config/riscv/riscv-vsetvl.cc:3584 The root cause is a case where the def info cannot be traced: (insn 208 327 333 27 (use (reg/i:DI 10 a0)) "bug.c":36:1 -1 (nil)) The fix is obvious: conservatively disable any optimization in this situation if the AVL def_info cannot be traced. Committed as it is obvious. gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (extract_single_source): Fix ICE. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/avl_use_bug-1.c: New test.
2023-12-11extend.texi: Mark builtin arguments with @var{...}Jakub Jelinek1-147/+147
In many cases we just specify types for the builtin arguments, in other cases types and names with @var{name} syntax, and in other case with just name. Shall we tweak that somehow? If the argument names are unimportant, perhaps it is fine to leave that out, but shouldn't we always use @var{...} around the parameter names when specified? On Fri, Dec 01, 2023 at 10:43:57AM -0700, Sandra Loosemore wrote: > Yup. The Texinfo manual says: "When using @deftypefn command and > variations, you should mark parameter names with @var to distinguish these > from data type names, keywords, and other parts of the literal syntax of the > programming language." Here is a patch which does that (but not adding types to where they were missing, that will be harder to search for). 2023-12-11 Jakub Jelinek <jakub@redhat.com> * doc/extend.texi (__sync_fetch_and_add, __sync_fetch_and_sub, __sync_fetch_and_or, __sync_fetch_and_and, __sync_fetch_and_xor, __sync_fetch_and_nand, __sync_add_and_fetch, __sync_sub_and_fetch, __sync_or_and_fetch, __sync_and_and_fetch, __sync_xor_and_fetch, __sync_nand_and_fetch, __sync_bool_compare_and_swap, __sync_val_compare_and_swap, __sync_lock_test_and_set, __sync_lock_release, __atomic_load_n, __atomic_load, __atomic_store_n, __atomic_store, __atomic_exchange_n, __atomic_exchange, __atomic_compare_exchange_n, __atomic_compare_exchange, __atomic_add_fetch, __atomic_sub_fetch, __atomic_and_fetch, __atomic_xor_fetch, __atomic_or_fetch, __atomic_nand_fetch, __atomic_fetch_add, __atomic_fetch_sub, __atomic_fetch_and, __atomic_fetch_xor, __atomic_fetch_or, __atomic_fetch_nand, __atomic_test_and_set, __atomic_clear, __atomic_thread_fence, __atomic_signal_fence, __atomic_always_lock_free, __atomic_is_lock_free, __builtin_add_overflow, __builtin_sadd_overflow, __builtin_saddl_overflow, __builtin_saddll_overflow, __builtin_uadd_overflow, __builtin_uaddl_overflow, __builtin_uaddll_overflow, __builtin_sub_overflow, __builtin_ssub_overflow, __builtin_ssubl_overflow, 
__builtin_ssubll_overflow, __builtin_usub_overflow, __builtin_usubl_overflow, __builtin_usubll_overflow, __builtin_mul_overflow, __builtin_smul_overflow, __builtin_smull_overflow, __builtin_smulll_overflow, __builtin_umul_overflow, __builtin_umull_overflow, __builtin_umulll_overflow, __builtin_add_overflow_p, __builtin_sub_overflow_p, __builtin_mul_overflow_p, __builtin_addc, __builtin_addcl, __builtin_addcll, __builtin_subc, __builtin_subcl, __builtin_subcll, __builtin_alloca, __builtin_alloca_with_align, __builtin_alloca_with_align_and_max, __builtin_speculation_safe_value, __builtin_nan, __builtin_nand32, __builtin_nand64, __builtin_nand128, __builtin_nanf, __builtin_nanl, __builtin_nanf@var{n}, __builtin_nanf@var{n}x, __builtin_nans, __builtin_nansd32, __builtin_nansd64, __builtin_nansd128, __builtin_nansf, __builtin_nansl, __builtin_nansf@var{n}, __builtin_nansf@var{n}x, __builtin_ffs, __builtin_clz, __builtin_ctz, __builtin_clrsb, __builtin_popcount, __builtin_parity, __builtin_bswap16, __builtin_bswap32, __builtin_bswap64, __builtin_bswap128, __builtin_extend_pointer, __builtin_goacc_parlevel_id, __builtin_goacc_parlevel_size, vec_clrl, vec_clrr, vec_mulh, vec_mul, vec_div, vec_dive, vec_mod, __builtin_rx_mvtc): Use @var{...} around parameter names. (vec_rl, vec_sl, vec_sr, vec_sra): Likewise. Use @var{...} also around A, B and R in description.
2023-12-11RISC-V: Remove poly selftest when --preference=fixed-vlmaxJuzhe-Zhong2-1/+25
This patch fixes multiple ICEs in full coverage testing: cc1: internal compiler error: in riscv_legitimize_poly_move, at config/riscv/riscv.cc:2456 0x1fd8d78 riscv_legitimize_poly_move ../../../../gcc/gcc/config/riscv/riscv.cc:2456 0x1fd9518 riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*) ../../../../gcc/gcc/config/riscv/riscv.cc:2583 0x2936820 gen_movdi(rtx_def*, rtx_def*) ../../../../gcc/gcc/config/riscv/riscv.md:2099 0x11a0f28 rtx_insn* insn_gen_fn::operator()<rtx_def*, rtx_def*>(rtx_def*, rtx_def*) const ../../../../gcc/gcc/recog.h:431 0x13cf2f9 emit_move_insn_1(rtx_def*, rtx_def*) ../../../../gcc/gcc/expr.cc:4553 0x13d010c emit_move_insn(rtx_def*, rtx_def*) ../../../../gcc/gcc/expr.cc:4723 0x216f5e0 run_poly_int_selftest ../../../../gcc/gcc/config/riscv/riscv-selftests.cc:185 0x21701e6 run_poly_int_selftests ../../../../gcc/gcc/config/riscv/riscv-selftests.cc:226 0x2172109 selftest::riscv_run_selftests() ../../../../gcc/gcc/config/riscv/riscv-selftests.cc:371 0x3b8067b selftest::run_tests() ../../../../gcc/gcc/selftest-run-tests.cc:112 0x1ad90ee toplev::run_self_tests() ../../../../gcc/gcc/toplev.cc:2209 Running target riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m1/--param=riscv-autovec-preference=fixed-vlmax The root cause is that we test POLY value computation under FIXED-VLMAX and hit an ICE in this code: if (BYTES_PER_RISCV_VECTOR.is_constant ()) { gcc_assert (value.is_constant ()); -----> assert failed. riscv_emit_move (dest, GEN_INT (value.to_constant ())); return; } For example, a poly value [15, 16] is computed by csrr vlen + multiple scalar integer instructions. However, such a compile-time unknown value only needs to be computed for a scalable vector, that is, when !BYTES_PER_RISCV_VECTOR.is_constant (), since csrr vlenb = [16, 0] when -march=rv64gcv --param=riscv-autovec-preference=fixed-vlmax and we have no chance to compute a compile-time POLY value.
Also, we never need to compute a compile-time unknown value for a FIXED-VLMAX vector, so disable the POLY selftest for FIXED-VLMAX. gcc/ChangeLog: * config/riscv/riscv-selftests.cc (riscv_run_selftests): Remove poly self test when FIXED-VLMAX. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/poly-selftest-1.c: New test.
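To make the failing assertion easier to follow, here is a minimal C model of a two-coefficient poly value, reading [c0, c1] as c0 + c1 * x with x unknown at compile time. The struct and function names are hypothetical and only imitate the is_constant / to_constant interface used in the snippet above; this is not GCC's poly_int implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the value is c0 + c1 * x, where x is a runtime
   indeterminate that is unknown at compile time.  */
struct poly2
{
  long c0;
  long c1;
};

/* A poly value is a compile-time constant only if it does not depend
   on the indeterminate.  */
static bool
poly2_is_constant (struct poly2 p)
{
  return p.c1 == 0;
}

/* Only valid for constant values; this mirrors the gcc_assert that
   fires in the selftest under fixed-vlmax.  */
static long
poly2_to_constant (struct poly2 p)
{
  assert (poly2_is_constant (p));
  return p.c0;
}

/* Evaluate the value once the indeterminate is known at run time.  */
static long
poly2_eval (struct poly2 p, long x)
{
  return p.c0 + p.c1 * x;
}
```

In this model [16, 0] is constant, while [15, 16] is not, so asking for its constant value trips the assertion. That is the combination the selftest hit once the vector size itself had become constant.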
2023-12-10[PATCH 3/5] [ifcvt] optimize x=c ? (y AND z) : y by RISC-V Zicond like insnsFei Gao2-21/+211
Take the following case for example. CFLAGS: -march=rv64gc_zbb_zicond -mabi=lp64d -O2 long test_AND_ceqz (long x, long y, long z, long c) { if (c) x = y & z; else x = y; return x; } Before patch: and a2,a1,a2 czero.eqz a0,a2,a3 czero.nez a3,a1,a3 or a0,a3,a0 ret After patch: and a0,a1,a2 czero.nez a1,a1,a3 or a0,a1,a0 ret Co-authored-by: Xiao Zeng<zengxiao@eswincomputing.com> gcc/ChangeLog: * ifcvt.cc (noce_cond_zero_binary_op_supported): Add support for AND. (noce_bbs_ok_for_cond_zero_arith): Likewise. (noce_try_cond_zero_arith): Likewise. gcc/testsuite/ChangeLog: * gcc.target/riscv/zicond_ifcvt_opt.c: Add TCs for AND.
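The two instruction sequences above can be compared with a small C model. This is a sketch, not compiler output: czero_eqz and czero_nez are hand-written stand-ins for the Zicond instructions (result is zero when the condition register is zero, respectively nonzero, and rs1 otherwise), and seq_before, seq_after and check_equiv are illustrative names.

```c
#include <assert.h>

/* czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1.  */
static long
czero_eqz (long rs1, long rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* czero.nez rd, rs1, rs2: rd = (rs2 != 0) ? 0 : rs1.  */
static long
czero_nez (long rs1, long rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* Source-level function from the commit message.  */
static long
test_AND_ceqz (long x, long y, long z, long c)
{
  if (c)
    x = y & z;
  else
    x = y;
  return x;
}

/* Sequence before the patch (a1 = y, a2 = z, a3 = c).  */
static long
seq_before (long y, long z, long c)
{
  long a2 = y & z;               /* and a2,a1,a2 */
  long a0 = czero_eqz (a2, c);   /* czero.eqz a0,a2,a3 */
  long a3 = czero_nez (y, c);    /* czero.nez a3,a1,a3 */
  return a3 | a0;                /* or a0,a3,a0 */
}

/* Sequence after the patch: one czero instruction fewer.  */
static long
seq_after (long y, long z, long c)
{
  long a0 = y & z;               /* and a0,a1,a2 */
  long a1 = czero_nez (y, c);    /* czero.nez a1,a1,a3 */
  return a1 | a0;                /* or a0,a1,a0 */
}

/* Assert that both sequences match the source semantics.  */
static void
check_equiv (long y, long z, long c)
{
  long want = test_AND_ceqz (0, y, z, c);
  assert (seq_before (y, z, c) == want);
  assert (seq_after (y, z, c) == want);
}
```

The shorter sequence is valid because of absorption: when c is zero the result is y | (y & z), which equals y, and when c is nonzero it is 0 | (y & z).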
2023-12-11c++: Fix noexcept checking for trivial operations [PR96090]Nathaniel Shead5-14/+143
This patch stops eager folding of trivial operations (construction and assignment) from occurring when checking for noexceptness. This was previously done in PR c++/53025, but only for copy/move construction, and the __is_nothrow_xible builtins did not receive the same treatment when they were added. To handle `is_nothrow_default_constructible`, the patch also ensures that when no parameters are passed we do value initialisation instead of just building the constructor call: in particular, value-initialisation doesn't necessarily actually invoke the constructor for trivial default constructors, and so we need to handle this case as well. This is contrary to the proposed resolution of CWG2820; for now we just ensure it matches the behaviour of the `noexcept` operator and create testcases formalising this, and if that issue gets accepted we can revisit. PR c++/96090 PR c++/100470 gcc/cp/ChangeLog: * call.cc (build_over_call): Prevent folding of trivial special members when checking for noexcept. * method.cc (constructible_expr): Perform value-initialisation for empty parameter lists. (is_nothrow_xible): Treat as noexcept operator. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/noexcept81.C: New test. * g++.dg/ext/is_nothrow_constructible7.C: New test. * g++.dg/ext/is_nothrow_constructible8.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2023-12-11c++: Clear uninstantiated template friend when instantiating [PR104234]Nathaniel Shead2-0/+18
Otherwise attempting to get the originating module declaration ICEs because the DECL_CHAIN of an instantiated friend template is no longer its context. PR c++/104234 PR c++/112580 gcc/cp/ChangeLog: * pt.cc (tsubst_template_decl): Clear DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P. gcc/testsuite/ChangeLog: * g++.dg/modules/pr104234.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2023-12-11Support vpcmov for V4HF/V4BF/V2HF/V2BF under TARGET_XOP.liuhongt2-0/+49
gcc/ChangeLog: PR target/112904 * config/i386/mmx.md (*xop_pcmov_<mode>): New define_insn. gcc/testsuite/ChangeLog: * g++.target/i386/pr112904.C: New test.
2023-12-11rs6000: Guard fctid on PowerPC64 and PowerPC476Haochen Gui8-4/+35
fctid is only supported on 64-bit Power processors and powerpc 476. It should be guarded by this condition. The patch fixes the issue. gcc/ PR target/112707 * config/rs6000/rs6000.h (TARGET_FCTID): Define. * config/rs6000/rs6000.md (lrint<mode>di2): Add guard TARGET_FCTID. * (lround<mode>di2): Replace TARGET_FPRND with TARGET_FCTID. gcc/testsuite/ PR target/112707 * gcc.target/powerpc/pr112707.h: New. * gcc.target/powerpc/pr112707-2.c: New. * gcc.target/powerpc/pr112707-3.c: New. * gcc.target/powerpc/pr88558-p7.c: Check fctid on ilp32 and has_arch_ppc64 as it's now guarded by powerpc64. * gcc.target/powerpc/pr88558-p8.c: Likewise. * gfortran.dg/nint_p7.f90: Add powerpc64 target requirement as lround<mode>di2 is now guarded by powerpc64.
2023-12-11rs6000: Enable lrint<mode>si2 on old archs with stfiwx enabledHaochen Gui2-1/+45
The powerpc 32-bit processors (e.g. 5470) support the "fctiw" instruction, but the instruction can't be generated on such platforms because the insn is guarded by TARGET_POPCNTD. The root cause is that SImode in float registers is only supported from Power7 onwards. Actually, the implementation of "fctiw" only needs stfiwx, which is supported by the old 32-bit processors. This patch enables the "fctiw" expand for these processors. gcc/ PR target/112707 * config/rs6000/rs6000.md (expand lrint<mode>si2): New. (insn lrint<mode>si2): Rename to... (*lrint<mode>si): ...this. (lrint<mode>si_di): New. gcc/testsuite/ PR target/112707 * gcc.target/powerpc/pr112707-1.c: New.
2023-12-11Daily bump.GCC Administrator7-1/+350