2023-05-15  ada: Fix link to parent when copying with Copy_Separate_Tree  (Piotr Trojanek, 3 files, -61/+71)

When flag More_Ids is set on a node, syntactic children will have their Parent link set to the last node in the chain of More_Ids. For example, parameter associations in a declaration like:

    procedure P (X, Y : T);

will have More_Ids set for "X", Prev_Ids set on "Y", and both will have the same node "T" as their child. However, "T" will have only one parent, i.e. "Y". This anomaly was taken into account in New_Copy_Tree, but not in Copy_Separate_Tree. This was leading to spurious errors in the check for ghost-correctness applied to copied specs.

gcc/ada/
* atree.ads (Is_Syntactic_Node): Refactored from New_Copy_Tree.
* atree.adb (Is_Syntactic_Node): Likewise.
(Copy_Separate_Tree): Use Is_Syntactic_Node.
* sem_util.adb (Has_More_Ids): Move to Atree.
(Is_Syntactic_Node): Likewise.
2023-05-15  aarch64: PR target/99195 annotate vector compare patterns for vec-concat-zero  (Kyrylo Tkachov, 2 files, -7/+103)

This instalment of the series goes through the vector comparison patterns in the backend. One wart is the int64x1_t comparisons, which this patch doesn't touch. Those are a bit trickier because they have define_insn_and_split mechanisms for falling back to GP reg comparisons after reload, and I don't think a simple annotation will catch those cases correctly. Those will need more custom thinking. As said, this patch doesn't touch those and is a decent straightforward improvement on its own.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_cm<optab><mode>): Rename to...
(aarch64_cm<optab><mode><vczle><vczbe>): ... This.
(aarch64_cmtst<mode>): Rename to...
(aarch64_cmtst<mode><vczle><vczbe>): ... This.
(*aarch64_cmtst_same_<mode>): Rename to...
(*aarch64_cmtst_same_<mode><vczle><vczbe>): ... This.
(*aarch64_cmtstdi): Rename to...
(*aarch64_cmtstdi<vczle><vczbe>): ... This.
(aarch64_fac<optab><mode>): Rename to...
(aarch64_fac<optab><mode><vczle><vczbe>): ... This.

gcc/testsuite/ChangeLog:

PR target/99195
* gcc.target/aarch64/simd/pr99195_7.c: New test.
2023-05-15  aarch64: PR target/99195 annotate qabs,qneg patterns for vec-concat-zero  (Kyrylo Tkachov, 2 files, -1/+11)

Straightforward like previous patches in this series.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_s<optab><mode>): Rename to...
(aarch64_s<optab><mode><vczle><vczbe>): ... This.

gcc/testsuite/ChangeLog:

PR target/99195
* gcc.target/aarch64/simd/pr99195_4.c: Add testing for qabs, qneg.
2023-05-15  RISC-V: Optimize vsetvl AVL for VLS VLMAX auto-vectorization  (Pan Li, 2 files, -3/+38)

This patch optimizes the AVL for VLS auto-vectorization. Given the sample code below:

typedef int8_t vnx2qi __attribute__ ((vector_size (2)));

__attribute__ ((noipa)) void
f_vnx2qi (int8_t a, int8_t b, int8_t *out)
{
  vnx2qi v = {a, b};
  *(vnx2qi *) out = v;
}

Before this patch:

f_vnx2qi:
    vsetvli a5,zero,e8,mf8,ta,ma
    vmv.v.x v1,a0
    vslide1down.vx v1,v1,a1
    vse8.v v1,0(a2)
    ret

After this patch:

f_vnx2qi:
    vsetivli zero,2,e8,mf8,ta,ma
    vmv.v.x v1,a0
    vslide1down.vx v1,v1,a1
    vse8.v v1,0(a2)
    ret

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Co-authored-by: kito-cheng <kito.cheng@sifive.com>

gcc/ChangeLog:

* config/riscv/riscv-v.cc (const_vlmax_p): New function for deciding whether the mode is constant or not.
(set_len_and_policy): Optimize VLS-VLMAX code gen to vsetivli.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/vf_avl-1.c: New test.
2023-05-15  tree-optimization/109848 - fix TARGET_MEM_REF store from CTOR simplification  (Richard Biener, 1 file, -1/+4)

I've put the preparation stmt in the wrong place.

PR tree-optimization/109848
* tree-ssa-forwprop.cc (pass_forwprop::execute): Put the TARGET_MEM_REF address preparation before the store, not before the CTOR.
2023-05-15  Fix gcc.dg/vect/pr108950.c  (Richard Biener, 1 file, -1/+1)

The following puts the dg-require-effective-target properly after the dg-do.

* gcc.dg/vect/pr108950.c: Re-order dg-require-effective-target and dg-do.
2023-05-15  RISC-V: Support TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT to optimize codegen of both VLA && VLS auto-vectorization  (Juzhe-Zhong, 4 files, -4/+44)

This patch optimizes both RVV VLA && VLS vectorization. Consider the following case:

void __attribute__((noinline, noclone))
f (int * __restrict dst, int * __restrict op1, int * __restrict op2, int count)
{
  for (int i = 0; i < count; ++i)
    dst[i] = op1[i] + op2[i];
}

VLA:

Before this patch:

    ble a3,zero,.L1
    srli a4,a1,2
    negw a4,a4
    andi a5,a4,3
    sext.w a3,a3
    beq a5,zero,.L3
    lw a7,0(a1)
    lw a6,0(a2)
    andi a4,a4,2
    addw a6,a6,a7
    sw a6,0(a0)
    beq a4,zero,.L3
    lw a7,4(a1)
    lw a4,4(a2)
    li a6,3
    addw a4,a4,a7
    sw a4,4(a0)
    bne a5,a6,.L3
    lw a6,8(a2)
    lw a4,8(a1)
    addw a4,a4,a6
    sw a4,8(a0)
.L3:
    subw a3,a3,a5
    slli a4,a3,32
    csrr a6,vlenb
    srli a4,a4,32
    srli a6,a6,2
    slli a3,a5,2
    mv a5,a4
    bgtu a4,a6,.L17
.L5:
    csrr a6,vlenb
    add a1,a1,a3
    add a2,a2,a3
    add a0,a0,a3
    srli a7,a6,2
    li a3,0
.L8:
    vsetvli zero,a5,e32,m1,ta,ma
    vle32.v v1,0(a1)
    vle32.v v2,0(a2)
    vsetvli t1,zero,e32,m1,ta,ma
    add a3,a3,a7
    vadd.vv v1,v1,v2
    vsetvli zero,a5,e32,m1,ta,ma
    vse32.v v1,0(a0)
    mv a5,a4
    bleu a4,a3,.L6
    mv a5,a3
.L6:
    sub a5,a4,a5
    bleu a5,a7,.L7
    mv a5,a7
.L7:
    add a1,a1,a6
    add a2,a2,a6
    add a0,a0,a6
    bne a5,zero,.L8
.L1:
    ret
.L17:
    mv a5,a6
    j .L5

After this patch:

f:
    ble a3,zero,.L1
    csrr a4,vlenb
    srli a4,a4,2
    mv a5,a3
    bgtu a3,a4,.L9
.L3:
    csrr a6,vlenb
    li a4,0
    srli a7,a6,2
.L6:
    vsetvli zero,a5,e32,m1,ta,ma
    vle32.v v2,0(a1)
    vle32.v v1,0(a2)
    vsetvli t1,zero,e32,m1,ta,ma
    add a4,a4,a7
    vadd.vv v1,v1,v2
    vsetvli zero,a5,e32,m1,ta,ma
    vse32.v v1,0(a0)
    mv a5,a3
    bleu a3,a4,.L4
    mv a5,a4
.L4:
    sub a5,a3,a5
    bleu a5,a7,.L5
    mv a5,a7
.L5:
    add a0,a0,a6
    add a2,a2,a6
    add a1,a1,a6
    bne a5,zero,.L6
.L1:
    ret
.L9:
    mv a5,a4
    j .L3

VLS:

Before this patch:

f3:
    ble a3,zero,.L1
    srli a5,a1,2
    negw a5,a5
    andi a4,a5,3
    sext.w a3,a3
    beq a4,zero,.L3
    lw a7,0(a1)
    lw a6,0(a2)
    andi a5,a5,2
    addw a6,a6,a7
    sw a6,0(a0)
    beq a5,zero,.L3
    lw a7,4(a1)
    lw a5,4(a2)
    li a6,3
    addw a5,a5,a7
    sw a5,4(a0)
    bne a4,a6,.L3
    lw a6,8(a2)
    lw a5,8(a1)
    addw a5,a5,a6
    sw a5,8(a0)
.L3:
    subw a3,a3,a4
    slli a6,a4,2
    slli a5,a3,32
    srli a5,a5,32
    add a1,a1,a6
    add a2,a2,a6
    add a0,a0,a6
    li a3,4
.L6:
    mv a4,a5
    bleu a5,a3,.L5
    li a4,4
.L5:
    vsetvli zero,a4,e32,m1,ta,ma
    vle32.v v1,0(a1)
    vle32.v v2,0(a2)
    vsetivli zero,4,e32,m1,ta,ma
    sub a5,a5,a4
    vadd.vv v1,v1,v2
    vsetvli zero,a4,e32,m1,ta,ma
    vse32.v v1,0(a0)
    addi a1,a1,16
    addi a2,a2,16
    addi a0,a0,16
    bne a5,zero,.L6
.L1:
    ret

After this patch:

f3:
    ble a3,zero,.L1
    li a4,4
.L4:
    mv a5,a3
    bleu a3,a4,.L3
    li a5,4
.L3:
    vsetvli zero,a5,e32,m1,ta,ma
    vle32.v v2,0(a1)
    vle32.v v1,0(a2)
    vsetivli zero,4,e32,m1,ta,ma
    sub a3,a3,a5
    vadd.vv v1,v1,v2
    vsetvli zero,a5,e32,m1,ta,ma
    vse32.v v1,0(a0)
    addi a2,a2,16
    addi a0,a0,16
    addi a1,a1,16
    bne a3,zero,.L4
.L1:
    ret

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_vectorize_preferred_vector_alignment): New function.
(TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New target hook.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/shift-rv32gcv.c: Adapt testcase.
* gcc.target/riscv/rvv/autovec/align-1.c: New test.
* gcc.target/riscv/rvv/autovec/align-2.c: New test.
2023-05-15  Daily bump.  (GCC Administrator, 3 files, -1/+35)
2023-05-14  MATCH: Add pattern for `signbit(x) ? x : -x` into abs (and swapped)  (Andrew Pinski, 3 files, -0/+36)

This adds a simple pattern to match.pd for `signbit(x) ? x : -x` into abs<x>. This can be done for all types, even ones that honor signed zeros and NaNs, because both signbit and negation only look at/touch the sign bit of those types and do not trap either.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/109829

gcc/ChangeLog:

* match.pd: Add pattern for `signbit(x) !=/== 0 ? x : -x`.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/abs-3.c: New test.
* gcc.dg/tree-ssa/abs-4.c: New test.
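For illustration only (this sketch is not part of the commit or its testcases), this is the kind of C source the new pattern lets the compiler fold to a plain absolute value:

#include <math.h>

/* signbit only inspects the sign bit and negation only flips it, so these
   selects can be folded even for types honouring signed zeros and NaNs.  */
double fold_to_abs (double x)
{
  return signbit (x) == 0 ? x : -x;   /* expected to become fabs (x) */
}

double fold_swapped (double x)
{
  return signbit (x) != 0 ? x : -x;   /* the swapped variant, i.e. -fabs (x) */
}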
2023-05-14  i386: Handle unsupported modes from ix86_widen_mult_cost [PR109807]  (Uros Bizjak, 2 files, -3/+6)

Revert my previous change that faked handling of V4HI and V2SImodes in ix86_widen_mult_cost and rather return an arbitrarily high value for unsupported modes. This should prevent the cost estimator from selecting a non-existent vector widen multiply operation.

gcc/ChangeLog:

PR target/109807
* config/i386/i386.cc: Revert the 2023-05-11 change.
(ix86_widen_mult_cost): Return high value instead of ICEing for unsupported modes.

gcc/testsuite/ChangeLog:

PR target/109807
* gcc.target/i386/pr109825.c: New test.
2023-05-14  i386: Honour -mdirect-extern-access when calling __fentry__  (Ard Biesheuvel, 1 file, -2/+6)

The small and medium PIC code models generate profiling calls that always load the address of __fentry__() via the GOT, even if -mdirect-extern-access is in effect. This deviates from the behavior with respect to other external references, and results in a longer opcode that relies on linker relaxation to eliminate the GOT load.

In this particular case, the transformation replaces an indirect 'CALL *__fentry__@GOTPCREL(%rip)' with either 'CALL __fentry__; NOP' or 'NOP; CALL __fentry__', where the NOP is a 1 byte NOP that preserves the 6 byte length of the sequence.

This is problematic for the Linux kernel, which generally relies on -mdirect-extern-access and hidden visibility to eliminate GOT based symbol references in code generated with -fpie/-fpic, without having to depend on linker relaxation. The Linux kernel relies on code patching to replace these opcodes with NOPs at runtime, and this is complicated code that we'd prefer not to complicate even more by adding support for patching both 5 and 6 byte sequences as well as parsing the instruction stream to decide which variant of CALL+NOP we are dealing with.

So let's honour -mdirect-extern-access, and only load the address of __fentry__ via the GOT if direct references to external symbols are not permitted. Note that the GOT reference in question is in fact a data reference: we explicitly load the address of __fentry__ from the GOT, which amounts to eager binding, rather than emitting a PLT call that could bind eagerly, lazily or directly at link time.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

gcc/ChangeLog:

* config/i386/i386.cc (x86_function_profiler): Take ix86_direct_extern_access into account when generating calls to __fentry__().
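A rough way to observe the behaviour described above (the commands are illustrative and the expected output is taken from the description in this commit, not from a verified build):

/* profiled.c - the body is irrelevant; -pg together with -mfentry adds a
   profiling call to the prologue of every function.  */
void hot_path (void) { }

/*   gcc -O2 -fpic -pg -mfentry -S profiled.c
       -> prologue: call *__fentry__@GOTPCREL(%rip)        (6-byte sequence)
     gcc -O2 -fpic -pg -mfentry -mdirect-extern-access -S profiled.c
       -> with this change: a direct call __fentry__ plus a 1-byte NOP,
          preserving the 6-byte length of the sequence.  */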
2023-05-14  RISC-V: Refactor the or pattern to switch cases  (Pan Li, 1 file, -11/+26)

This patch refactors the pattern "A or B or C or D" into a switch case, making it easier to add/remove new types and friendlier to human readers.

Before this patch:

return A || B || C || D;

After this patch:

switch (type)
  {
  case A:
  case B:
  case C:
  case D:
    return true;
  default:
    return false;
  }

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc (required_extensions_p): Refactor the or pattern to switch cases.
2023-05-14  Daily bump.  (GCC Administrator, 4 files, -1/+64)
2023-05-13  Replace bool as boolean instead of int in libgm2  (Gaius Mulley, 4 files, -42/+14)

This patch tidies KeyBoardLEDs.cc, RTco.cc, sckt.cc and wrapc.cc by removing the TRUE/FALSE macros and using bool, true and false.

libgm2/ChangeLog:

* libm2cor/KeyBoardLEDs.cc (TRUE): Remove.
(FALSE): Remove.
(init): Replace TRUE with true.
* libm2iso/RTco.cc (TRUE): Remove.
(FALSE): Remove.
(initSem): Replace int with bool.
(init): Replace FALSE with false.
* libm2pim/sckt.cc (TRUE): Remove.
(FALSE): Remove.
* libm2pim/wrapc.cc: Replace TRUE with true and FALSE with false.
(FALSE): Remove.
(TRUE): Remove.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-05-13  [aarch64] Recursively initialize even and odd sub-parts and merge with zip1.  (Prathamesh Kulkarni, 12 files, -81/+180)

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback): Rename aarch64_expand_vector_init to this, and remove interleaving case. Recursively call aarch64_expand_vector_init_fallback, instead of aarch64_expand_vector_init.
(aarch64_unzip_vector_init): New function.
(aarch64_expand_vector_init): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ldp_stp_16.c (cons2_8_float): Adjust for new code-gen.
* gcc.target/aarch64/sve/acle/general/dupq_5.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_6.c: Likewise.
* gcc.target/aarch64/interleave-init-1.c: Rename to ...
* gcc.target/aarch64/vec-init-18.c: ... this.
* gcc.target/aarch64/vec-init-19.c: New test.
* gcc.target/aarch64/vec-init-20.c: Likewise.
* gcc.target/aarch64/vec-init-21.c: Likewise.
* gcc.target/aarch64/vec-init-22-size.c: Likewise.
* gcc.target/aarch64/vec-init-22-speed.c: Likewise.
* gcc.target/aarch64/vec-init-22.h: New header.
2023-05-13  RISC-V: Pull out function call with side effect from gcc_assert.  (Kito Cheng, 1 file, -1/+2)

It would be broken in release mode, where gcc_assert does not evaluate its argument.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::cleanup_insns): Pull out function call from the gcc_assert.
2023-05-13  RISC-V: Improve vector_insn_info::dump for LMUL and policy  (Kito Cheng, 1 file, -3/+36)

Convert vlmul and policy to human readable strings; an example below (the VLMUL, TAIL_POLICY and MASK_POLICY fields are the ones that change):

Before:

[VALID,Demand field={1(VL),0(DEMAND_NONZERO_AVL),1(SEW),0(DEMAND_GE_SEW),1(LMUL),0(RATIO),0(TAIL_POLICY),0(MASK_POLICY)} AVL=(reg:DI 0 zero) SEW=16,VLMUL=3,RATIO=2,TAIL_POLICY=1,MASK_POLICY=1]

After:

[VALID,Demand field={1(VL),0(DEMAND_NONZERO_AVL),1(SEW),0(DEMAND_GE_SEW),1(LMUL),0(RATIO),0(TAIL_POLICY),0(MASK_POLICY)} AVL=(reg:DI 0 zero) SEW=16,VLMUL=m8,RATIO=2,TAIL_POLICY=agnostic,MASK_POLICY=agnostic]

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vlmul_to_str): New.
(policy_to_str): New.
(vector_insn_info::dump): Use vlmul_to_str and policy_to_str.
2023-05-12  MATCH: Fix PR 109834, ICE with popcount combined with bswap  (Andrew Pinski, 3 files, -2/+17)

After r14-673-gc0dd80e4c4c3, there was a check in the match patterns which verified that the type is unsigned, but instead of using the type it looked at the expression. This adds the needed TREE_TYPE so we get the correct answer and don't ICE.

Committed as obvious after a bootstrap/test on x86_64-linux-gnu.

PR tree-optimization/109834

gcc/ChangeLog:

* match.pd (popcount(bswap(x))->popcount(x)): Fix up unsigned type checking.
(popcount(rotate(x,y))->popcount(x)): Likewise.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr109834-1.c: New test.
* gcc.dg/tree-ssa/pr109834-1.c: New test.
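For reference, a small illustration (mine, not the committed testcases) of the folds these patterns perform; byte-swapping or rotating a value does not change its population count:

#include <stdint.h>

int count_bswapped (uint32_t x)
{
  /* popcount(bswap(x)) equals popcount(x), so the bswap can be dropped.  */
  return __builtin_popcount (__builtin_bswap32 (x));
}

int count_rotated (uint32_t x)
{
  /* Likewise for a rotate: rotating left by 3 only moves bits around.  */
  uint32_t r = (x << 3) | (x >> 29);
  return __builtin_popcount (r);
}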
2023-05-13  Daily bump.  (GCC Administrator, 7 files, -1/+1042)
2023-05-12  Fortran: Revise a namelist test case.  (Jerry DeLisle, 1 file, -4/+17)

PR fortran/109662

gcc/testsuite/ChangeLog:

* gfortran.dg/pr109662-a.f90: Add a section to verify that a short namelist read does not modify the variable.
2023-05-12  Fortran: Initialize last_char for internal units.  (Jerry DeLisle, 1 file, -0/+1)

PR fortran/109662

libgfortran/ChangeLog:

* io/unit.c (set_internal_unit): Set the internal unit last_char to zero so that previous EOF characters do not influence the next read.
2023-05-12  i386: Cleanup ix86_expand_vecop_qihi{,2}  (Uros Bizjak, 1 file, -27/+37)

Some cleanups while looking at these two functions.

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_vecop_qihi2): Also reject ymm instructions for TARGET_PREFER_AVX128. Use generic gen_extend_insn to generate zero/sign extension instructions. Fix comments.
(ix86_expand_vecop_qihi): Initialize interleave functions for MULT code only. Fix comments.
2023-05-12  libstdc++: Fix -Wnonnull warnings during configure  (Jonathan Wakely, 2 files, -8/+8)

We should not test for nan by passing it a null pointer, as this can trigger -Wnonnull warnings. Also fix an outdated comment about the default -std mode.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_CHECK_C99_TR1): Use a non-null pointer to check for nan, nanf, and nanl.
* configure: Regenerate.
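A sketch of the kind of conftest the macro compiles (illustrative only; the real check lives in acinclude.m4 and covers many more C99/TR1 math functions):

#include <math.h>
int main ()
{
  /* The old check passed a null pointer, which trips -Wnonnull; an empty
     string exercises the same declarations without the warning.  */
  double d = nan ("");
  float f = nanf ("");
  long double l = nanl ("");
  return (d != d) + (f != f) + (l != l);   /* NaNs compare unequal to themselves */
}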
2023-05-12  libstdc++: Remove redundant dependencies on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 2 files, -12/+2)

We never need to use std::make_unsigned in std::char_traits<char16_t> and std::char_traits<char32_t> because <cstdint> guarantees to provide the types we need, since r9-2028-g8ba7f29e3dd064. Similarly, experimental::source_location can just assume uint_least32_t is defined by <cstdint>.

libstdc++-v3/ChangeLog:

* include/bits/char_traits.h (char_traits<char16_t>): Do not depend on _GLIBCXX_USE_C99_STDINT_TR1.
(char_traits<char32_t>): Likewise.
* include/experimental/source_location: Likewise.
2023-05-12  libstdc++: Reduce <atomic> dependency on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 2 files, -7/+2)

Since r9-2028-g8ba7f29e3dd064 we've defined most of <cstdint> unconditionally, so we can do the same for most of the std::atomic aliases such as std::atomic_int_least32_t. The only aliases that need to depend on _GLIBCXX_USE_C99_STDINT_TR1 are the ones for the integer types that are not guaranteed to be defined, e.g. std::atomic_int32_t.

libstdc++-v3/ChangeLog:

* include/std/atomic (atomic_int_least8_t, atomic_uint_least8_t)
(atomic_int_least16_t, atomic_uint_least16_t)
(atomic_int_least32_t, atomic_uint_least32_t)
(atomic_int_least64_t, atomic_uint_least64_t)
(atomic_int_fast16_t, atomic_uint_fast16_t)
(atomic_int_fast32_t, atomic_uint_fast32_t)
(atomic_int_fast64_t, atomic_uint_fast64_t)
(atomic_intmax_t, atomic_uintmax_t): Define unconditionally.
* testsuite/29_atomics/headers/stdatomic.h/c_compat.cc: Adjust.
2023-05-12  libstdc++: Remove <random> dependency on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 8 files, -27/+4)

Since r9-2028-g8ba7f29e3dd064 we've defined most of <cstdint> unconditionally, including uint_least32_t. This means that all of <random> can be defined unconditionally, which means that std::shuffle and std::ranges::shuffle can be too.

libstdc++-v3/ChangeLog:

* include/bits/algorithmfwd.h (shuffle): Do not depend on _GLIBCXX_USE_C99_STDINT_TR1.
* include/bits/ranges_algo.h (shuffle): Likewise.
* include/bits/stl_algo.h (shuffle): Likewise.
* include/ext/random: Likewise.
* include/ext/throw_allocator.h (random_condition): Likewise.
* include/std/random: Likewise.
* src/c++11/cow-string-inst.cc: Likewise.
* src/c++11/random.cc: Likewise.
2023-05-12  PR modula2/109830 m2iso library SeqFile.mod appending to a file overwrites content  (Gaius Mulley, 2 files, -21/+101)

This patch fixes a bug in the m2iso library SeqFile.mod when a file is opened using OpenAppend. The patch checks whether the file exists and uses FIO.OpenForRandom to ensure the existing content is not overwritten.

gcc/m2/ChangeLog:

PR modula2/109830
* gm2-libs-iso/SeqFile.mod (newCid): New parameter toAppend used to select FIO.OpenForRandom.
(OpenRead): Pass extra parameter to newCid.
(OpenWrite): Pass extra parameter to newCid.
(OpenAppend): Pass extra parameter to newCid.

gcc/testsuite/ChangeLog:

PR modula2/109830
* gm2/isolib/run/pass/seqappend.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-05-12  i386: Remove mulv2si emulated sequence for TARGET_SSE2 [PR109797]  (Uros Bizjak, 1 file, -33/+1)

Remove mulv2si emulated sequence for TARGET_SSE2 and enable only the native PMULLD instruction for TARGET_SSE4_1. Ideally, the vectorization for TARGET_SSE2 should depend on more precise cost estimation (the PR contains a patch for ix86_multiplication_cost), but even with the patched cost function the runtime regression was not fixed.

PR target/109797

gcc/ChangeLog:

* config/i386/mmx.md (mulv2si3): Remove expander.
(mulv2si3): Rename insn pattern from *mulv2si.
2023-05-12  LTO: Fix writing of toplevel asm with offloading [PR109816]  (Tobias Burnus, 3 files, -1/+105)

When offloading was enabled, top-level 'asm' statements were added to the offloading section, confusing assemblers which did not support the syntax. Additionally, with offloading and -flto, the top-level assembler code did not end up in the host files.

As r14-321-g9a41d2cdbcd added a top-level 'asm' to one libstdc++ header file, the issue became more apparent, causing fails with nvptx for some C++ testcases.

PR libstdc++/109816

gcc/ChangeLog:

* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by '!lto_stream_offload_p'.

libgomp/ChangeLog:

* testsuite/libgomp.c++/target-map-class-1.C: New test.
* testsuite/libgomp.c++/target-map-class-2.C: New test.
2023-05-12  libstdc++: Remove test dependency on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 1 file, -1/+1)

This should have been done in r9-2028-g8ba7f29e3dd064 when std::shared_mutex was changed to be defined without depending on _GLIBCXX_USE_C99_STDINT_TR1.

libstdc++-v3/ChangeLog:

* testsuite/experimental/feat-cxx14.cc: Remove dependency on _GLIBCXX_USE_C99_STDINT_TR1.
2023-05-12  libstdc++: Remove test dependency on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 1 file, -4/+0)

This should have been removed in r9-2029-g612c9c702e2c9e when the char16_t and char32_t specializations of std::codecvt were changed to be defined unconditionally.

libstdc++-v3/ChangeLog:

* testsuite/22_locale/locale/cons/unicode.cc: Remove dependency on _GLIBCXX_USE_C99_STDINT_TR1.
2023-05-12  libstdc++: Remove test dependencies on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 2 files, -4/+0)

These #ifdef checks should have been removed in r9-2029-g612c9c702e2c9e when the u16string_view and u32string_view aliases were changed to be defined unconditionally.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string_view/typedefs.cc: Remove dependency on _GLIBCXX_USE_C99_STDINT_TR1.
* testsuite/experimental/string_view/typedefs.cc: Likewise.
2023-05-12  RISC-V: Optimize vsetvli of LCM INSERTED edge for user vsetvli [PR 109743]  (Kito Cheng, 5 files, -45/+277)

Rebase to trunk and send V3 patch for: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617821.html

This patch is fixing: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109743.

This issue happens because we are currently very conservative in the optimization of user vsetvli. Consider the following case:

bb 1: vsetvli a5,a4... (demand AVL = a4).
bb 2: RVV insn use a5 (demand AVL = a5).

LCM will hoist the vsetvl of bb 2 into bb 1. We don't do AVL propagation for this situation since it is complicated: we would have to analyze the code sequence between the vsetvli in bb 1 and the RVV insn in bb 2, and they are not necessarily consecutive blocks. This patch does the optimization after LCM; we check and eliminate the vsetvli on the LCM inserted edge if it is redundant. This approach is much simpler and safe.

code:

void
foo2 (int32_t *a, int32_t *b, int n)
{
  if (n <= 0)
    return;
  int i = n;
  size_t vl = __riscv_vsetvl_e32m1 (i);
  for (; i >= 0; i--)
    {
      vint32m1_t v = __riscv_vle32_v_i32m1 (a, vl);
      __riscv_vse32_v_i32m1 (b, v, vl);
      if (i >= vl)
        continue;
      if (i == 0)
        return;
      vl = __riscv_vsetvl_e32m1 (i);
    }
}

Before this patch:

foo2:
.LFB2:
    .cfi_startproc
    ble a2,zero,.L1
    mv a4,a2
    li a3,-1
    vsetvli a5,a2,e32,m1,ta,mu
    vsetvli zero,a5,e32,m1,ta,ma    <- can be eliminated.
.L5:
    vle32.v v1,0(a0)
    vse32.v v1,0(a1)
    bgeu a4,a5,.L3
.L10:
    beq a2,zero,.L1
    vsetvli a5,a4,e32,m1,ta,mu
    addi a4,a4,-1
    vsetvli zero,a5,e32,m1,ta,ma    <- can be eliminated.
    vle32.v v1,0(a0)
    vse32.v v1,0(a1)
    addiw a2,a2,-1
    bltu a4,a5,.L10
.L3:
    addiw a2,a2,-1
    addi a4,a4,-1
    bne a2,a3,.L5
.L1:
    ret

After this patch:

f:
    ble a2,zero,.L1
    mv a4,a2
    li a3,-1
    vsetvli a5,a2,e32,m1,ta,ma
.L5:
    vle32.v v1,0(a0)
    vse32.v v1,0(a1)
    bgeu a4,a5,.L3
.L10:
    beq a2,zero,.L1
    vsetvli a5,a4,e32,m1,ta,ma
    addi a4,a4,-1
    vle32.v v1,0(a0)
    vse32.v v1,0(a1)
    addiw a2,a2,-1
    bltu a4,a5,.L10
.L3:
    addiw a2,a2,-1
    addi a4,a4,-1
    bne a2,a3,.L5
.L1:
    ret

PR target/109743

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::get_vsetvl_at_end): New.
(local_avl_compatible_p): New.
(pass_vsetvl::local_eliminate_vsetvl_insn): Enhance local optimizations for LCM, rewrite as a backward algorithm.
(pass_vsetvl::cleanup_insns): Use new local_eliminate_vsetvl_insn interface, handle a BB at once.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr109743-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-4.c: New test.

Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
2023-05-12  tree-optimization/64731 - extend store-from CTOR lowering to TARGET_MEM_REF  (Richard Biener, 2 files, -17/+38)

The following also covers TARGET_MEM_REF when decomposing stores from CTORs to supported elementwise operations. This avoids spilling and cleans up after vector lowering, which doesn't touch loads or stores. It also mimics what we already do for loads.

PR tree-optimization/64731
* tree-ssa-forwprop.cc (pass_forwprop::execute): Also handle TARGET_MEM_REF destinations of stores from vector CTORs.
* gcc.target/i386/pr64731.c: New testcase.
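A sketch of my reading of the change (this is not the committed gcc.target/i386/pr64731.c testcase): a vector constructor stored through an address that ends up as a TARGET_MEM_REF, which forwprop can now decompose into element stores instead of spilling when the vector type has to be lowered:

/* Assumes a generic 32-byte vector that the target may have to lower,
   with reduced alignment so the cast below is valid.  */
typedef double v4df __attribute__ ((vector_size (32), aligned (8)));

void
store_ctor (double *p, const double *q, int n)
{
  for (int i = 0; i < n; i++)
    /* The destination is addressed via an induction variable, so it can be
       represented as a TARGET_MEM_REF; the source is a vector CTOR.  */
    *(v4df *) (p + 4 * i) = (v4df) { q[i], q[i] + 1, q[i] + 2, q[i] + 3 };
}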
2023-05-12  c++: remove redundant testcase [PR83258]  (Patrick Palka, 2 files, -9/+1)

I noticed only after the fact that the new testcase template/function2.C (from r14-708-gc3afdb8ba8f183) is just a subset of ext/visibility/anon8.C, so let's get rid of it.

PR c++/83258

gcc/testsuite/ChangeLog:

* g++.dg/ext/visibility/anon8.C: Mention PR83258.
* g++.dg/template/function2.C: Removed.
2023-05-12  c++: robustify testcase [PR109752]  (Patrick Palka, 2 files, -26/+13)

This rewrites the testcase for PR109752 to make it simpler and more robust (i.e. no longer dependent on r13-4035-gc41bbfcaf9d6ef).

PR c++/109752

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-pr109752.C: Rename to ...
* g++.dg/cpp2a/concepts-complete4.C: ... this. Rewrite.
2023-05-12  tree-optimization/109791 - simplify (unsigned)&foo - (unsigned)(&foo + o)  (Richard Biener, 1 file, -0/+12)

The following adds another variant of address difference simplification. The utility ptr_difference_const only handles constant differences (we also cannot code generate anything else), so exposing a possible POINTER_PLUS_EXPR in the match and computing the difference on the base only makes it possible to handle one case of a variable offset. This simplifies

  (unsigned long) &MEM <char[3]> [(void *)&str + 2B] - (unsigned long) (&str + (_69 + 1))

down to

  (1 - (unsigned long) _69)

during niter analysis, allowing ranger to eliminate a condition later and avoiding a bogus -Wstringop-overflow diagnostic for the testcase in the PR.

PR tree-optimization/109791
* match.pd (minus (convert ADDR_EXPR@0) (convert (pointer_plus @1 @2))): New pattern.
(minus (convert (pointer_plus @1 @2)) (convert ADDR_EXPR@0)): Likewise.
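A source-level sketch of the shape of expression involved (illustrative only; the gimple quoted above comes from niter analysis of the PR's testcase, not from this snippet):

char str[3];

unsigned long
diff (unsigned long off)
{
  /* For in-bounds values of off, (unsigned long) &str[2] minus
     (unsigned long) (str + (off + 1)) can now fold to 1 - off, exposing
     the relation between the two operands to later range analysis.  */
  return (unsigned long) &str[2] - (unsigned long) (str + (off + 1));
}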
2023-05-12  arm: [MVE intrinsics] rework vsriq  (Christophe Lyon, 5 files, -213/+5)
Implement vsriq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vsriq): New. * config/arm/arm-mve-builtins-base.def (vsriq): New. * config/arm/arm-mve-builtins-base.h (vsriq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vsriq. * config/arm/arm_mve.h (vsriq): Remove. (vsriq_m): Remove. (vsriq_n_u8): Remove. (vsriq_n_s8): Remove. (vsriq_n_u16): Remove. (vsriq_n_s16): Remove. (vsriq_n_u32): Remove. (vsriq_n_s32): Remove. (vsriq_m_n_s8): Remove. (vsriq_m_n_u8): Remove. (vsriq_m_n_s16): Remove. (vsriq_m_n_u16): Remove. (vsriq_m_n_s32): Remove. (vsriq_m_n_u32): Remove. (__arm_vsriq_n_u8): Remove. (__arm_vsriq_n_s8): Remove. (__arm_vsriq_n_u16): Remove. (__arm_vsriq_n_s16): Remove. (__arm_vsriq_n_u32): Remove. (__arm_vsriq_n_s32): Remove. (__arm_vsriq_m_n_s8): Remove. (__arm_vsriq_m_n_u8): Remove. (__arm_vsriq_m_n_s16): Remove. (__arm_vsriq_m_n_u16): Remove. (__arm_vsriq_m_n_s32): Remove. (__arm_vsriq_m_n_u32): Remove. (__arm_vsriq): Remove. (__arm_vsriq_m): Remove.
2023-05-12  arm: [MVE intrinsics] factorize vsriq  (Christophe Lyon, 2 files, -4/+6)

Factorize vsriq builtins so that they use parameterized names.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (mve_insn): Add vsri.
* config/arm/mve.md (mve_vsriq_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_n_<supf><mode>): ... this.
(mve_vsriq_m_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_m_n_<supf><mode>): ... this.
2023-05-12  arm: [MVE intrinsics] add ternary_rshift shape  (Christophe Lyon, 2 files, -0/+39)

This patch adds the ternary_rshift shape description.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (ternary_rshift): New.
* config/arm/arm-mve-builtins-shapes.h (ternary_rshift): New.
2023-05-12  arm: [MVE intrinsics] rework vsliq  (Christophe Lyon, 5 files, -213/+5)
Implement vsliq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vsliq): New. * config/arm/arm-mve-builtins-base.def (vsliq): New. * config/arm/arm-mve-builtins-base.h (vsliq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vsliq. * config/arm/arm_mve.h (vsliq): Remove. (vsliq_m): Remove. (vsliq_n_u8): Remove. (vsliq_n_s8): Remove. (vsliq_n_u16): Remove. (vsliq_n_s16): Remove. (vsliq_n_u32): Remove. (vsliq_n_s32): Remove. (vsliq_m_n_s8): Remove. (vsliq_m_n_s32): Remove. (vsliq_m_n_s16): Remove. (vsliq_m_n_u8): Remove. (vsliq_m_n_u32): Remove. (vsliq_m_n_u16): Remove. (__arm_vsliq_n_u8): Remove. (__arm_vsliq_n_s8): Remove. (__arm_vsliq_n_u16): Remove. (__arm_vsliq_n_s16): Remove. (__arm_vsliq_n_u32): Remove. (__arm_vsliq_n_s32): Remove. (__arm_vsliq_m_n_s8): Remove. (__arm_vsliq_m_n_s32): Remove. (__arm_vsliq_m_n_s16): Remove. (__arm_vsliq_m_n_u8): Remove. (__arm_vsliq_m_n_u32): Remove. (__arm_vsliq_m_n_u16): Remove. (__arm_vsliq): Remove. (__arm_vsliq_m): Remove.
2023-05-12  arm: [MVE intrinsics] factorize vsliq  (Christophe Lyon, 2 files, -4/+6)

Factorize vsliq builtins so that they use parameterized names.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (mve_insn): Add vsli.
* config/arm/mve.md (mve_vsliq_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_n_<supf><mode>): ... this.
(mve_vsliq_m_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_m_n_<supf><mode>): ... this.
2023-05-12  arm: [MVE intrinsics] add ternary_lshift shape  (Christophe Lyon, 2 files, -0/+39)

This patch adds the ternary_lshift shape description.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (ternary_lshift): New.
* config/arm/arm-mve-builtins-shapes.h (ternary_lshift): New.
2023-05-12  arm: [MVE intrinsics] rework vpselq  (Christophe Lyon, 4 files, -177/+4)
Implement vpselq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vpselq): New. * config/arm/arm-mve-builtins-base.def (vpselq): New. * config/arm/arm-mve-builtins-base.h (vpselq): New. * config/arm/arm_mve.h (vpselq): Remove. (vpselq_u8): Remove. (vpselq_s8): Remove. (vpselq_u16): Remove. (vpselq_s16): Remove. (vpselq_u32): Remove. (vpselq_s32): Remove. (vpselq_u64): Remove. (vpselq_s64): Remove. (vpselq_f16): Remove. (vpselq_f32): Remove. (__arm_vpselq_u8): Remove. (__arm_vpselq_s8): Remove. (__arm_vpselq_u16): Remove. (__arm_vpselq_s16): Remove. (__arm_vpselq_u32): Remove. (__arm_vpselq_s32): Remove. (__arm_vpselq_u64): Remove. (__arm_vpselq_s64): Remove. (__arm_vpselq_f16): Remove. (__arm_vpselq_f32): Remove. (__arm_vpselq): Remove.
2023-05-12  arm: [MVE intrinsics] add vpsel shape  (Christophe Lyon, 2 files, -0/+40)

This patch adds the vpsel shape description.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (vpsel): New.
* config/arm/arm-mve-builtins-shapes.h (vpsel): New.
2023-05-12  arm: [MVE intrinsics] factorize vpselq  (Christophe Lyon, 3 files, -13/+18)

Factorize vpselq builtins so that they use parameterized names.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm.cc (arm_expand_vcond): Use gen_mve_q instead of gen_mve_vpselq.
* config/arm/iterators.md (MVE_VPSELQ_F): New.
(mve_insn): Add vpsel.
* config/arm/mve.md (@mve_vpselq_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(@mve_vpselq_f<mode>): Rename into ...
(@mve_<mve_insn>q_f<mode>): ... this.
2023-05-12  arm: [MVE intrinsics] rework vfmaq vfmasq vfmsq  (Christophe Lyon, 5 files, -292/+12)
Implement vfmaq, vfmasq, vfmsq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vfmaq, vfmasq, vfmsq): New. * config/arm/arm-mve-builtins-base.def (vfmaq, vfmasq, vfmsq): New. * config/arm/arm-mve-builtins-base.h (vfmaq, vfmasq, vfmsq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vfmaq, vfmasq, vfmsq. * config/arm/arm_mve.h (vfmaq): Remove. (vfmasq): Remove. (vfmsq): Remove. (vfmaq_m): Remove. (vfmasq_m): Remove. (vfmsq_m): Remove. (vfmaq_f16): Remove. (vfmaq_n_f16): Remove. (vfmasq_n_f16): Remove. (vfmsq_f16): Remove. (vfmaq_f32): Remove. (vfmaq_n_f32): Remove. (vfmasq_n_f32): Remove. (vfmsq_f32): Remove. (vfmaq_m_f32): Remove. (vfmaq_m_f16): Remove. (vfmaq_m_n_f32): Remove. (vfmaq_m_n_f16): Remove. (vfmasq_m_n_f32): Remove. (vfmasq_m_n_f16): Remove. (vfmsq_m_f32): Remove. (vfmsq_m_f16): Remove. (__arm_vfmaq_f16): Remove. (__arm_vfmaq_n_f16): Remove. (__arm_vfmasq_n_f16): Remove. (__arm_vfmsq_f16): Remove. (__arm_vfmaq_f32): Remove. (__arm_vfmaq_n_f32): Remove. (__arm_vfmasq_n_f32): Remove. (__arm_vfmsq_f32): Remove. (__arm_vfmaq_m_f32): Remove. (__arm_vfmaq_m_f16): Remove. (__arm_vfmaq_m_n_f32): Remove. (__arm_vfmaq_m_n_f16): Remove. (__arm_vfmasq_m_n_f32): Remove. (__arm_vfmasq_m_n_f16): Remove. (__arm_vfmsq_m_f32): Remove. (__arm_vfmsq_m_f16): Remove. (__arm_vfmaq): Remove. (__arm_vfmasq): Remove. (__arm_vfmsq): Remove. (__arm_vfmaq_m): Remove. (__arm_vfmasq_m): Remove. (__arm_vfmsq_m): Remove.
2023-05-12  arm: [MVE intrinsics] factorize vfmaq vfmsq vfmasq  (Christophe Lyon, 2 files, -108/+35)

Factorize vfmaq, vfmasq and vfmsq builtins so that they use parameterized names.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (MVE_FP_M_BINARY): Add VFMAQ_M_F, VFMSQ_M_F.
(MVE_FP_M_N_BINARY): Add VFMAQ_M_N_F, VFMASQ_M_N_F.
(MVE_VFMxQ_F, MVE_VFMAxQ_N_F): New.
(mve_insn): Add vfma, vfmas, vfms.
* config/arm/mve.md (mve_vfmaq_f<mode>, mve_vfmsq_f<mode>): Merge into ...
(@mve_<mve_insn>q_f<mode>): ... this.
(mve_vfmaq_n_f<mode>, mve_vfmasq_n_f<mode>): Merge into ...
(@mve_<mve_insn>q_n_f<mode>): ... this.
(mve_vfmaq_m_f<mode>, mve_vfmsq_m_f<mode>): Merge into @mve_<mve_insn>q_m_f<mode>.
(mve_vfmaq_m_n_f<mode>, mve_vfmasq_m_n_f<mode>): Merge into @mve_<mve_insn>q_m_n_f<mode>.
2023-05-12  arm: [MVE intrinsics] add ternary_opt_n shape  (Christophe Lyon, 2 files, -0/+31)

This patch adds the ternary_opt_n shape description.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (ternary_opt_n): New.
* config/arm/arm-mve-builtins-shapes.h (ternary_opt_n): New.
2023-05-12  arm: [MVE intrinsics] rework vmvnq  (Christophe Lyon, 4 files, -438/+12)
Implement vmvnq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (FUNCTION_WITH_RTX_M_N_NO_F): New. (vmvnq): New. * config/arm/arm-mve-builtins-base.def (vmvnq): New. * config/arm/arm-mve-builtins-base.h (vmvnq): New. * config/arm/arm_mve.h (vmvnq): Remove. (vmvnq_m): Remove. (vmvnq_x): Remove. (vmvnq_s8): Remove. (vmvnq_s16): Remove. (vmvnq_s32): Remove. (vmvnq_n_s16): Remove. (vmvnq_n_s32): Remove. (vmvnq_u8): Remove. (vmvnq_u16): Remove. (vmvnq_u32): Remove. (vmvnq_n_u16): Remove. (vmvnq_n_u32): Remove. (vmvnq_m_u8): Remove. (vmvnq_m_s8): Remove. (vmvnq_m_u16): Remove. (vmvnq_m_s16): Remove. (vmvnq_m_u32): Remove. (vmvnq_m_s32): Remove. (vmvnq_m_n_s16): Remove. (vmvnq_m_n_u16): Remove. (vmvnq_m_n_s32): Remove. (vmvnq_m_n_u32): Remove. (vmvnq_x_s8): Remove. (vmvnq_x_s16): Remove. (vmvnq_x_s32): Remove. (vmvnq_x_u8): Remove. (vmvnq_x_u16): Remove. (vmvnq_x_u32): Remove. (vmvnq_x_n_s16): Remove. (vmvnq_x_n_s32): Remove. (vmvnq_x_n_u16): Remove. (vmvnq_x_n_u32): Remove. (__arm_vmvnq_s8): Remove. (__arm_vmvnq_s16): Remove. (__arm_vmvnq_s32): Remove. (__arm_vmvnq_n_s16): Remove. (__arm_vmvnq_n_s32): Remove. (__arm_vmvnq_u8): Remove. (__arm_vmvnq_u16): Remove. (__arm_vmvnq_u32): Remove. (__arm_vmvnq_n_u16): Remove. (__arm_vmvnq_n_u32): Remove. (__arm_vmvnq_m_u8): Remove. (__arm_vmvnq_m_s8): Remove. (__arm_vmvnq_m_u16): Remove. (__arm_vmvnq_m_s16): Remove. (__arm_vmvnq_m_u32): Remove. (__arm_vmvnq_m_s32): Remove. (__arm_vmvnq_m_n_s16): Remove. (__arm_vmvnq_m_n_u16): Remove. (__arm_vmvnq_m_n_s32): Remove. (__arm_vmvnq_m_n_u32): Remove. (__arm_vmvnq_x_s8): Remove. (__arm_vmvnq_x_s16): Remove. (__arm_vmvnq_x_s32): Remove. (__arm_vmvnq_x_u8): Remove. (__arm_vmvnq_x_u16): Remove. (__arm_vmvnq_x_u32): Remove. (__arm_vmvnq_x_n_s16): Remove. (__arm_vmvnq_x_n_s32): Remove. (__arm_vmvnq_x_n_u16): Remove. (__arm_vmvnq_x_n_u32): Remove. (__arm_vmvnq): Remove. (__arm_vmvnq_m): Remove. (__arm_vmvnq_x): Remove.