2023-05-15  ada: Fix link to parent when copying with Copy_Separate_Tree  (Piotr Trojanek, 3 files, -61/+71)

When flag More_Ids is set on a node, syntactic children will have their Parent link set to the last node in the chain of More_Ids. For example, parameter associations in a declaration like:

    procedure P (X, Y : T);

will have More_Ids set for "X", Prev_Ids set on "Y", and both will have the same node "T" as their child. However, "T" will have only one parent, i.e. "Y". This anomaly was taken into account in New_Copy_Tree, but not in Copy_Separate_Tree. This was leading to spurious errors in the check for ghost-correctness applied to copied specs.

gcc/ada/
* atree.ads (Is_Syntactic_Node): Refactored from New_Copy_Tree.
* atree.adb (Is_Syntactic_Node): Likewise.
(Copy_Separate_Tree): Use Is_Syntactic_Node.
* sem_util.adb (Has_More_Ids): Move to Atree.
(Is_Syntactic_Node): Likewise.
2023-05-15  aarch64: PR target/99195 annotate vector compare patterns for vec-concat-zero  (Kyrylo Tkachov, 2 files, -7/+103)

This instalment of the series goes through the vector comparison patterns in the backend. One wart is the int64x1_t comparisons, which this patch doesn't touch. Those are a bit trickier because they have define_insn_and_split mechanisms for falling back to GP reg comparisons after reload, and I don't think a simple annotation will catch those cases correctly. Those will need more custom thinking. As said, this patch doesn't touch those and is a decent straightforward improvement on its own.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_cm<optab><mode>): Rename to...
(aarch64_cm<optab><mode><vczle><vczbe>): ... This.
(aarch64_cmtst<mode>): Rename to...
(aarch64_cmtst<mode><vczle><vczbe>): ... This.
(*aarch64_cmtst_same_<mode>): Rename to...
(*aarch64_cmtst_same_<mode><vczle><vczbe>): ... This.
(*aarch64_cmtstdi): Rename to...
(*aarch64_cmtstdi<vczle><vczbe>): ... This.
(aarch64_fac<optab><mode>): Rename to...
(aarch64_fac<optab><mode><vczle><vczbe>): ... This.

gcc/testsuite/ChangeLog:

PR target/99195
* gcc.target/aarch64/simd/pr99195_7.c: New test.
2023-05-15  aarch64: PR target/99195 annotate qabs,qneg patterns for vec-concat-zero  (Kyrylo Tkachov, 2 files, -1/+11)

Straightforward like previous patches in this series.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

gcc/ChangeLog:

PR target/99195
* config/aarch64/aarch64-simd.md (aarch64_s<optab><mode>): Rename to...
(aarch64_s<optab><mode><vczle><vczbe>): ... This.

gcc/testsuite/ChangeLog:

PR target/99195
* gcc.target/aarch64/simd/pr99195_4.c: Add testing for qabs, qneg.
2023-05-15  RISC-V: Optimize vsetvl AVL for VLS VLMAX auto-vectorization  (Pan Li, 2 files, -3/+38)

This patch optimizes the AVL for VLS auto-vectorization. Given the sample code below:

typedef int8_t vnx2qi __attribute__ ((vector_size (2)));

__attribute__ ((noipa)) void
f_vnx2qi (int8_t a, int8_t b, int8_t *out)
{
  vnx2qi v = {a, b};
  *(vnx2qi *) out = v;
}

Before this patch:

f_vnx2qi:
    vsetvli a5,zero,e8,mf8,ta,ma
    vmv.v.x v1,a0
    vslide1down.vx v1,v1,a1
    vse8.v v1,0(a2)
    ret

After this patch:

f_vnx2qi:
    vsetivli zero,2,e8,mf8,ta,ma
    vmv.v.x v1,a0
    vslide1down.vx v1,v1,a1
    vse8.v v1,0(a2)
    ret

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Co-authored-by: kito-cheng <kito.cheng@sifive.com>

gcc/ChangeLog:

* config/riscv/riscv-v.cc (const_vlmax_p): New function for deciding whether the mode is constant or not.
(set_len_and_policy): Optimize VLS-VLMAX code gen to vsetivli.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/vf_avl-1.c: New test.
2023-05-15  tree-optimization/109848 - fix TARGET_MEM_REF store from CTOR simplification  (Richard Biener, 1 file, -1/+4)

I've put the preparation stmt in the wrong place.

PR tree-optimization/109848
* tree-ssa-forwprop.cc (pass_forwprop::execute): Put the TARGET_MEM_REF address preparation before the store, not before the CTOR.
2023-05-15  Fix gcc.dg/vect/pr108950.c  (Richard Biener, 1 file, -1/+1)

The following puts the dg-require-effective-target properly after the dg-do.

* gcc.dg/vect/pr108950.c: Re-order dg-require-effective-target and dg-do.
2023-05-15  RISC-V: Support TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT to optimize codegen of both VLA && VLS auto-vectorization  (Juzhe-Zhong, 4 files, -4/+44)

This patch optimizes both RVV VLA && VLS vectorization. Consider the following case:

void __attribute__((noinline, noclone))
f (int * __restrict dst, int * __restrict op1, int * __restrict op2, int count)
{
  for (int i = 0; i < count; ++i)
    dst[i] = op1[i] + op2[i];
}

VLA:

Before this patch:

    ble a3,zero,.L1
    srli a4,a1,2
    negw a4,a4
    andi a5,a4,3
    sext.w a3,a3
    beq a5,zero,.L3
    lw a7,0(a1)
    lw a6,0(a2)
    andi a4,a4,2
    addw a6,a6,a7
    sw a6,0(a0)
    beq a4,zero,.L3
    lw a7,4(a1)
    lw a4,4(a2)
    li a6,3
    addw a4,a4,a7
    sw a4,4(a0)
    bne a5,a6,.L3
    lw a6,8(a2)
    lw a4,8(a1)
    addw a4,a4,a6
    sw a4,8(a0)
.L3:
    subw a3,a3,a5
    slli a4,a3,32
    csrr a6,vlenb
    srli a4,a4,32
    srli a6,a6,2
    slli a3,a5,2
    mv a5,a4
    bgtu a4,a6,.L17
.L5:
    csrr a6,vlenb
    add a1,a1,a3
    add a2,a2,a3
    add a0,a0,a3
    srli a7,a6,2
    li a3,0
.L8:
    vsetvli zero,a5,e32,m1,ta,ma
    vle32.v v1,0(a1)
    vle32.v v2,0(a2)
    vsetvli t1,zero,e32,m1,ta,ma
    add a3,a3,a7
    vadd.vv v1,v1,v2
    vsetvli zero,a5,e32,m1,ta,ma
    vse32.v v1,0(a0)
    mv a5,a4
    bleu a4,a3,.L6
    mv a5,a3
.L6:
    sub a5,a4,a5
    bleu a5,a7,.L7
    mv a5,a7
.L7:
    add a1,a1,a6
    add a2,a2,a6
    add a0,a0,a6
    bne a5,zero,.L8
.L1:
    ret
.L17:
    mv a5,a6
    j .L5

After this patch:

f:
    ble a3,zero,.L1
    csrr a4,vlenb
    srli a4,a4,2
    mv a5,a3
    bgtu a3,a4,.L9
.L3:
    csrr a6,vlenb
    li a4,0
    srli a7,a6,2
.L6:
    vsetvli zero,a5,e32,m1,ta,ma
    vle32.v v2,0(a1)
    vle32.v v1,0(a2)
    vsetvli t1,zero,e32,m1,ta,ma
    add a4,a4,a7
    vadd.vv v1,v1,v2
    vsetvli zero,a5,e32,m1,ta,ma
    vse32.v v1,0(a0)
    mv a5,a3
    bleu a3,a4,.L4
    mv a5,a4
.L4:
    sub a5,a3,a5
    bleu a5,a7,.L5
    mv a5,a7
.L5:
    add a0,a0,a6
    add a2,a2,a6
    add a1,a1,a6
    bne a5,zero,.L6
.L1:
    ret
.L9:
    mv a5,a4
    j .L3

VLS:

Before this patch:

f3:
    ble a3,zero,.L1
    srli a5,a1,2
    negw a5,a5
    andi a4,a5,3
    sext.w a3,a3
    beq a4,zero,.L3
    lw a7,0(a1)
    lw a6,0(a2)
    andi a5,a5,2
    addw a6,a6,a7
    sw a6,0(a0)
    beq a5,zero,.L3
    lw a7,4(a1)
    lw a5,4(a2)
    li a6,3
    addw a5,a5,a7
    sw a5,4(a0)
    bne a4,a6,.L3
    lw a6,8(a2)
    lw a5,8(a1)
    addw a5,a5,a6
    sw a5,8(a0)
.L3:
    subw a3,a3,a4
    slli a6,a4,2
    slli a5,a3,32
    srli a5,a5,32
    add a1,a1,a6
    add a2,a2,a6
    add a0,a0,a6
    li a3,4
.L6:
    mv a4,a5
    bleu a5,a3,.L5
    li a4,4
.L5:
    vsetvli zero,a4,e32,m1,ta,ma
    vle32.v v1,0(a1)
    vle32.v v2,0(a2)
    vsetivli zero,4,e32,m1,ta,ma
    sub a5,a5,a4
    vadd.vv v1,v1,v2
    vsetvli zero,a4,e32,m1,ta,ma
    vse32.v v1,0(a0)
    addi a1,a1,16
    addi a2,a2,16
    addi a0,a0,16
    bne a5,zero,.L6
.L1:
    ret

After this patch:

f3:
    ble a3,zero,.L1
    li a4,4
.L4:
    mv a5,a3
    bleu a3,a4,.L3
    li a5,4
.L3:
    vsetvli zero,a5,e32,m1,ta,ma
    vle32.v v2,0(a1)
    vle32.v v1,0(a2)
    vsetivli zero,4,e32,m1,ta,ma
    sub a3,a3,a5
    vadd.vv v1,v1,v2
    vsetvli zero,a5,e32,m1,ta,ma
    vse32.v v1,0(a0)
    addi a2,a2,16
    addi a0,a0,16
    addi a1,a1,16
    bne a3,zero,.L4
.L1:
    ret

Signed-off-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_vectorize_preferred_vector_alignment): New function.
(TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New target hook.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/shift-rv32gcv.c: Adapt testcase.
* gcc.target/riscv/rvv/autovec/align-1.c: New test.
* gcc.target/riscv/rvv/autovec/align-2.c: New test.
2023-05-15  Daily bump.  (GCC Administrator, 3 files, -1/+35)
2023-05-14  MATCH: Add pattern for `signbit(x) ? x : -x` into abs (and swapped)  (Andrew Pinski, 3 files, -0/+36)

This adds a simple pattern to match.pd for `signbit(x) ? x : -x` into abs<x>. This can be done for all types, even ones that honor signed zeros and NaNs, because both signbit and negation only look at/touch the sign bit of those types and do not trap either.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/109829

gcc/ChangeLog:

* match.pd: Add pattern for `signbit(x) !=/== 0 ? x : -x`.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/abs-3.c: New test.
* gcc.dg/tree-ssa/abs-4.c: New test.
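For illustration only (this sketch is not part of the commit or its testcases), this is the kind of C source the new pattern lets the compiler fold to a plain absolute value:

#include <math.h>

/* signbit only inspects the sign bit and negation only flips it, so these
   selects can be folded even for types honouring signed zeros and NaNs.  */
double fold_to_abs (double x)
{
  return signbit (x) == 0 ? x : -x;   /* expected to become fabs (x) */
}

double fold_swapped (double x)
{
  return signbit (x) != 0 ? x : -x;   /* the swapped variant, i.e. -fabs (x) */
}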
2023-05-14  i386: Handle unsupported modes from ix86_widen_mult_cost [PR109807]  (Uros Bizjak, 2 files, -3/+6)

Revert my previous change that faked handling of V4HI and V2SImodes in ix86_widen_mult_cost and rather return an arbitrarily high value for unsupported modes. This should prevent the cost estimator from selecting a non-existent vector widen multiply operation.

gcc/ChangeLog:

PR target/109807
* config/i386/i386.cc: Revert the 2023-05-11 change.
(ix86_widen_mult_cost): Return high value instead of ICEing for unsupported modes.

gcc/testsuite/ChangeLog:

PR target/109807
* gcc.target/i386/pr109825.c: New test.
2023-05-14  i386: Honour -mdirect-extern-access when calling __fentry__  (Ard Biesheuvel, 1 file, -2/+6)

The small and medium PIC code models generate profiling calls that always load the address of __fentry__() via the GOT, even if -mdirect-extern-access is in effect. This deviates from the behavior with respect to other external references, and results in a longer opcode that relies on linker relaxation to eliminate the GOT load.

In this particular case, the transformation replaces an indirect 'CALL *__fentry__@GOTPCREL(%rip)' with either 'CALL __fentry__; NOP' or 'NOP; CALL __fentry__', where the NOP is a 1 byte NOP that preserves the 6 byte length of the sequence.

This is problematic for the Linux kernel, which generally relies on -mdirect-extern-access and hidden visibility to eliminate GOT based symbol references in code generated with -fpie/-fpic, without having to depend on linker relaxation. The Linux kernel relies on code patching to replace these opcodes with NOPs at runtime, and this is complicated code that we'd prefer not to complicate even more by adding support for patching both 5 and 6 byte sequences as well as parsing the instruction stream to decide which variant of CALL+NOP we are dealing with.

So let's honour -mdirect-extern-access, and only load the address of __fentry__ via the GOT if direct references to external symbols are not permitted. Note that the GOT reference in question is in fact a data reference: we explicitly load the address of __fentry__ from the GOT, which amounts to eager binding, rather than emitting a PLT call that could bind eagerly, lazily or directly at link time.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

gcc/ChangeLog:

* config/i386/i386.cc (x86_function_profiler): Take ix86_direct_extern_access into account when generating calls to __fentry__().
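A rough way to observe the behaviour described above (the commands are illustrative and the expected output is taken from the description in this commit, not from a verified build):

/* profiled.c - the body is irrelevant; -pg together with -mfentry adds a
   profiling call to the prologue of every function.  */
void hot_path (void) { }

/*   gcc -O2 -fpic -pg -mfentry -S profiled.c
       -> prologue: call *__fentry__@GOTPCREL(%rip)        (6-byte sequence)
     gcc -O2 -fpic -pg -mfentry -mdirect-extern-access -S profiled.c
       -> with this change: a direct call __fentry__ plus a 1-byte NOP,
          preserving the 6-byte length of the sequence.  */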
2023-05-14  RISC-V: Refactor the or pattern to switch cases  (Pan Li, 1 file, -11/+26)

This patch refactors the pattern "A or B or C or D" into a switch case, making it easier to add/remove new types and friendlier to human readers.

Before this patch:

return A || B || C || D;

After this patch:

switch (type)
  {
  case A:
  case B:
  case C:
  case D:
    return true;
  default:
    return false;
  }

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc (required_extensions_p): Refactor the or pattern to switch cases.
2023-05-14  Daily bump.  (GCC Administrator, 4 files, -1/+64)
2023-05-13  Replace bool as boolean instead of int in libgm2  (Gaius Mulley, 4 files, -42/+14)

This patch tidies KeyBoardLEDs.cc, RTco.cc, sckt.cc and wrapc.cc by removing the TRUE/FALSE macros and using bool, true and false.

libgm2/ChangeLog:

* libm2cor/KeyBoardLEDs.cc (TRUE): Remove.
(FALSE): Remove.
(init): Replace TRUE with true.
* libm2iso/RTco.cc (TRUE): Remove.
(FALSE): Remove.
(initSem): Replace int with bool.
(init): Replace FALSE with false.
* libm2pim/sckt.cc (TRUE): Remove.
(FALSE): Remove.
* libm2pim/wrapc.cc: Replace TRUE with true and FALSE with false.
(FALSE): Remove.
(TRUE): Remove.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-05-13  [aarch64] Recursively initialize even and odd sub-parts and merge with zip1.  (Prathamesh Kulkarni, 12 files, -81/+180)

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback): Rename aarch64_expand_vector_init to this, and remove interleaving case. Recursively call aarch64_expand_vector_init_fallback, instead of aarch64_expand_vector_init.
(aarch64_unzip_vector_init): New function.
(aarch64_expand_vector_init): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ldp_stp_16.c (cons2_8_float): Adjust for new code-gen.
* gcc.target/aarch64/sve/acle/general/dupq_5.c: Likewise.
* gcc.target/aarch64/sve/acle/general/dupq_6.c: Likewise.
* gcc.target/aarch64/interleave-init-1.c: Rename to ...
* gcc.target/aarch64/vec-init-18.c: ... this.
* gcc.target/aarch64/vec-init-19.c: New test.
* gcc.target/aarch64/vec-init-20.c: Likewise.
* gcc.target/aarch64/vec-init-21.c: Likewise.
* gcc.target/aarch64/vec-init-22-size.c: Likewise.
* gcc.target/aarch64/vec-init-22-speed.c: Likewise.
* gcc.target/aarch64/vec-init-22.h: New header.
2023-05-13  RISC-V: Pull out function call with side effect from gcc_assert.  (Kito Cheng, 1 file, -1/+2)

It would be broken in release mode, where gcc_assert does not evaluate its argument.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::cleanup_insns): Pull out function call from the gcc_assert.
2023-05-13  RISC-V: Improve vector_insn_info::dump for LMUL and policy  (Kito Cheng, 1 file, -3/+36)

Convert vlmul and policy to human readable strings; an example below (the VLMUL, TAIL_POLICY and MASK_POLICY fields are the ones that change):

Before:

[VALID,Demand field={1(VL),0(DEMAND_NONZERO_AVL),1(SEW),0(DEMAND_GE_SEW),1(LMUL),0(RATIO),0(TAIL_POLICY),0(MASK_POLICY)} AVL=(reg:DI 0 zero) SEW=16,VLMUL=3,RATIO=2,TAIL_POLICY=1,MASK_POLICY=1]

After:

[VALID,Demand field={1(VL),0(DEMAND_NONZERO_AVL),1(SEW),0(DEMAND_GE_SEW),1(LMUL),0(RATIO),0(TAIL_POLICY),0(MASK_POLICY)} AVL=(reg:DI 0 zero) SEW=16,VLMUL=m8,RATIO=2,TAIL_POLICY=agnostic,MASK_POLICY=agnostic]

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vlmul_to_str): New.
(policy_to_str): New.
(vector_insn_info::dump): Use vlmul_to_str and policy_to_str.
2023-05-12  MATCH: Fix PR 109834, ICE with popcount combined with bswap  (Andrew Pinski, 3 files, -2/+17)

After r14-673-gc0dd80e4c4c3, there was a check in the match patterns which verified that the type is unsigned, but instead of using the type it looked at the expression. This adds the needed TREE_TYPE so we get the correct answer and don't ICE.

Committed as obvious after a bootstrap/test on x86_64-linux-gnu.

PR tree-optimization/109834

gcc/ChangeLog:

* match.pd (popcount(bswap(x))->popcount(x)): Fix up unsigned type checking.
(popcount(rotate(x,y))->popcount(x)): Likewise.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr109834-1.c: New test.
* gcc.dg/tree-ssa/pr109834-1.c: New test.
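For reference, a small illustration (mine, not the committed testcases) of the folds these patterns perform; byte-swapping or rotating a value does not change its population count:

#include <stdint.h>

int count_bswapped (uint32_t x)
{
  /* popcount(bswap(x)) equals popcount(x), so the bswap can be dropped.  */
  return __builtin_popcount (__builtin_bswap32 (x));
}

int count_rotated (uint32_t x)
{
  /* Likewise for a rotate: rotating left by 3 only moves bits around.  */
  uint32_t r = (x << 3) | (x >> 29);
  return __builtin_popcount (r);
}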
2023-05-13  Daily bump.  (GCC Administrator, 7 files, -1/+1042)
2023-05-12  Fortran: Revise a namelist test case.  (Jerry DeLisle, 1 file, -4/+17)

PR fortran/109662

gcc/testsuite/ChangeLog:

* gfortran.dg/pr109662-a.f90: Add a section to verify that a short namelist read does not modify the variable.
2023-05-12  Fortran: Initialize last_char for internal units.  (Jerry DeLisle, 1 file, -0/+1)

PR fortran/109662

libgfortran/ChangeLog:

* io/unit.c (set_internal_unit): Set the internal unit last_char to zero so that previous EOF characters do not influence the next read.
2023-05-12  i386: Cleanup ix86_expand_vecop_qihi{,2}  (Uros Bizjak, 1 file, -27/+37)

Some cleanups while looking at these two functions.

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_vecop_qihi2): Also reject ymm instructions for TARGET_PREFER_AVX128. Use generic gen_extend_insn to generate zero/sign extension instructions. Fix comments.
(ix86_expand_vecop_qihi): Initialize interleave functions for MULT code only. Fix comments.
2023-05-12  libstdc++: Fix -Wnonnull warnings during configure  (Jonathan Wakely, 2 files, -8/+8)

We should not test for nan by passing it a null pointer, as this can trigger -Wnonnull warnings. Also fix an outdated comment about the default -std mode.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_CHECK_C99_TR1): Use a non-null pointer to check for nan, nanf, and nanl.
* configure: Regenerate.
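A sketch of the kind of conftest the macro compiles (illustrative only; the real check lives in acinclude.m4 and covers many more C99/TR1 math functions):

#include <math.h>
int main ()
{
  /* The old check passed a null pointer, which trips -Wnonnull; an empty
     string exercises the same declarations without the warning.  */
  double d = nan ("");
  float f = nanf ("");
  long double l = nanl ("");
  return (d != d) + (f != f) + (l != l);   /* NaNs compare unequal to themselves */
}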
2023-05-12  libstdc++: Remove redundant dependencies on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 2 files, -12/+2)

We never need to use std::make_unsigned in std::char_traits<char16_t> and std::char_traits<char32_t> because <cstdint> guarantees to provide the types we need, since r9-2028-g8ba7f29e3dd064. Similarly, experimental::source_location can just assume uint_least32_t is defined by <cstdint>.

libstdc++-v3/ChangeLog:

* include/bits/char_traits.h (char_traits<char16_t>): Do not depend on _GLIBCXX_USE_C99_STDINT_TR1.
(char_traits<char32_t>): Likewise.
* include/experimental/source_location: Likewise.
2023-05-12  libstdc++: Reduce <atomic> dependency on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 2 files, -7/+2)

Since r9-2028-g8ba7f29e3dd064 we've defined most of <cstdint> unconditionally, so we can do the same for most of the std::atomic aliases such as std::atomic_int_least32_t. The only aliases that need to depend on _GLIBCXX_USE_C99_STDINT_TR1 are the ones for the integer types that are not guaranteed to be defined, e.g. std::atomic_int32_t.

libstdc++-v3/ChangeLog:

* include/std/atomic (atomic_int_least8_t, atomic_uint_least8_t)
(atomic_int_least16_t, atomic_uint_least16_t)
(atomic_int_least32_t, atomic_uint_least32_t)
(atomic_int_least64_t, atomic_uint_least64_t)
(atomic_int_fast16_t, atomic_uint_fast16_t)
(atomic_int_fast32_t, atomic_uint_fast32_t)
(atomic_int_fast64_t, atomic_uint_fast64_t)
(atomic_intmax_t, atomic_uintmax_t): Define unconditionally.
* testsuite/29_atomics/headers/stdatomic.h/c_compat.cc: Adjust.
2023-05-12  libstdc++: Remove <random> dependency on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 8 files, -27/+4)

Since r9-2028-g8ba7f29e3dd064 we've defined most of <cstdint> unconditionally, including uint_least32_t. This means that all of <random> can be defined unconditionally, which means that std::shuffle and std::ranges::shuffle can be too.

libstdc++-v3/ChangeLog:

* include/bits/algorithmfwd.h (shuffle): Do not depend on _GLIBCXX_USE_C99_STDINT_TR1.
* include/bits/ranges_algo.h (shuffle): Likewise.
* include/bits/stl_algo.h (shuffle): Likewise.
* include/ext/random: Likewise.
* include/ext/throw_allocator.h (random_condition): Likewise.
* include/std/random: Likewise.
* src/c++11/cow-string-inst.cc: Likewise.
* src/c++11/random.cc: Likewise.
2023-05-12  PR modula2/109830 m2iso library SeqFile.mod appending to a file overwrites content  (Gaius Mulley, 2 files, -21/+101)

This patch fixes a bug in the m2iso library SeqFile.mod when a file is opened using OpenAppend. The patch checks whether the file exists and uses FIO.OpenForRandom to ensure the existing content is not overwritten.

gcc/m2/ChangeLog:

PR modula2/109830
* gm2-libs-iso/SeqFile.mod (newCid): New parameter toAppend used to select FIO.OpenForRandom.
(OpenRead): Pass extra parameter to newCid.
(OpenWrite): Pass extra parameter to newCid.
(OpenAppend): Pass extra parameter to newCid.

gcc/testsuite/ChangeLog:

PR modula2/109830
* gm2/isolib/run/pass/seqappend.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-05-12  i386: Remove mulv2si emulated sequence for TARGET_SSE2 [PR109797]  (Uros Bizjak, 1 file, -33/+1)

Remove mulv2si emulated sequence for TARGET_SSE2 and enable only the native PMULLD instruction for TARGET_SSE4_1. Ideally, the vectorization for TARGET_SSE2 should depend on more precise cost estimation (the PR contains a patch for ix86_multiplication_cost), but even with the patched cost function the runtime regression was not fixed.

PR target/109797

gcc/ChangeLog:

* config/i386/mmx.md (mulv2si3): Remove expander.
(mulv2si3): Rename insn pattern from *mulv2si.
2023-05-12  LTO: Fix writing of toplevel asm with offloading [PR109816]  (Tobias Burnus, 3 files, -1/+105)

When offloading was enabled, top-level 'asm' statements were added to the offloading section, confusing assemblers which did not support the syntax. Additionally, with offloading and -flto, the top-level assembler code did not end up in the host files.

As r14-321-g9a41d2cdbcd added a top-level 'asm' to one libstdc++ header file, the issue became more apparent, causing fails with nvptx for some C++ testcases.

PR libstdc++/109816

gcc/ChangeLog:

* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by '!lto_stream_offload_p'.

libgomp/ChangeLog:

* testsuite/libgomp.c++/target-map-class-1.C: New test.
* testsuite/libgomp.c++/target-map-class-2.C: New test.
2023-05-12  libstdc++: Remove test dependency on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 1 file, -1/+1)

This should have been done in r9-2028-g8ba7f29e3dd064 when std::shared_mutex was changed to be defined without depending on _GLIBCXX_USE_C99_STDINT_TR1.

libstdc++-v3/ChangeLog:

* testsuite/experimental/feat-cxx14.cc: Remove dependency on _GLIBCXX_USE_C99_STDINT_TR1.
2023-05-12  libstdc++: Remove test dependency on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 1 file, -4/+0)

This should have been removed in r9-2029-g612c9c702e2c9e when the char16_t and char32_t specializations of std::codecvt were changed to be defined unconditionally.

libstdc++-v3/ChangeLog:

* testsuite/22_locale/locale/cons/unicode.cc: Remove dependency on _GLIBCXX_USE_C99_STDINT_TR1.
2023-05-12  libstdc++: Remove test dependencies on _GLIBCXX_USE_C99_STDINT_TR1  (Jonathan Wakely, 2 files, -4/+0)

These #ifdef checks should have been removed in r9-2029-g612c9c702e2c9e when the u16string_view and u32string_view aliases were changed to be defined unconditionally.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string_view/typedefs.cc: Remove dependency on _GLIBCXX_USE_C99_STDINT_TR1.
* testsuite/experimental/string_view/typedefs.cc: Likewise.
2023-05-12  RISC-V: Optimize vsetvli of LCM INSERTED edge for user vsetvli [PR 109743]  (Kito Cheng, 5 files, -45/+277)

Rebase to trunk and send V3 patch for: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617821.html

This patch is fixing: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109743.

This issue happens because we are currently very conservative in the optimization of user vsetvli. Consider the following case:

bb 1: vsetvli a5,a4... (demand AVL = a4).
bb 2: RVV insn use a5 (demand AVL = a5).

LCM will hoist the vsetvl of bb 2 into bb 1. We don't do AVL propagation for this situation since it is complicated: we would have to analyze the code sequence between the vsetvli in bb 1 and the RVV insn in bb 2, and they are not necessarily consecutive blocks. This patch does the optimization after LCM; we check and eliminate the vsetvli on the LCM inserted edge if it is redundant. This approach is much simpler and safe.

code:

void
foo2 (int32_t *a, int32_t *b, int n)
{
  if (n <= 0)
    return;
  int i = n;
  size_t vl = __riscv_vsetvl_e32m1 (i);
  for (; i >= 0; i--)
    {
      vint32m1_t v = __riscv_vle32_v_i32m1 (a, vl);
      __riscv_vse32_v_i32m1 (b, v, vl);
      if (i >= vl)
        continue;
      if (i == 0)
        return;
      vl = __riscv_vsetvl_e32m1 (i);
    }
}

Before this patch:

foo2:
.LFB2:
    .cfi_startproc
    ble a2,zero,.L1
    mv a4,a2
    li a3,-1
    vsetvli a5,a2,e32,m1,ta,mu
    vsetvli zero,a5,e32,m1,ta,ma    <- can be eliminated.
.L5:
    vle32.v v1,0(a0)
    vse32.v v1,0(a1)
    bgeu a4,a5,.L3
.L10:
    beq a2,zero,.L1
    vsetvli a5,a4,e32,m1,ta,mu
    addi a4,a4,-1
    vsetvli zero,a5,e32,m1,ta,ma    <- can be eliminated.
    vle32.v v1,0(a0)
    vse32.v v1,0(a1)
    addiw a2,a2,-1
    bltu a4,a5,.L10
.L3:
    addiw a2,a2,-1
    addi a4,a4,-1
    bne a2,a3,.L5
.L1:
    ret

After this patch:

f:
    ble a2,zero,.L1
    mv a4,a2
    li a3,-1
    vsetvli a5,a2,e32,m1,ta,ma
.L5:
    vle32.v v1,0(a0)
    vse32.v v1,0(a1)
    bgeu a4,a5,.L3
.L10:
    beq a2,zero,.L1
    vsetvli a5,a4,e32,m1,ta,ma
    addi a4,a4,-1
    vle32.v v1,0(a0)
    vse32.v v1,0(a1)
    addiw a2,a2,-1
    bltu a4,a5,.L10
.L3:
    addiw a2,a2,-1
    addi a4,a4,-1
    bne a2,a3,.L5
.L1:
    ret

PR target/109743

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::get_vsetvl_at_end): New.
(local_avl_compatible_p): New.
(pass_vsetvl::local_eliminate_vsetvl_insn): Enhance local optimizations for LCM, rewrite as a backward algorithm.
(pass_vsetvl::cleanup_insns): Use new local_eliminate_vsetvl_insn interface, handle a BB at once.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr109743-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/pr109743-4.c: New test.

Co-authored-by: Juzhe-Zhong <juzhe.zhong@rivai.ai>
2023-05-12  tree-optimization/64731 - extend store-from CTOR lowering to TARGET_MEM_REF  (Richard Biener, 2 files, -17/+38)

The following also covers TARGET_MEM_REF when decomposing stores from CTORs to supported elementwise operations. This avoids spilling and cleans up after vector lowering, which doesn't touch loads or stores. It also mimics what we already do for loads.

PR tree-optimization/64731
* tree-ssa-forwprop.cc (pass_forwprop::execute): Also handle TARGET_MEM_REF destinations of stores from vector CTORs.
* gcc.target/i386/pr64731.c: New testcase.
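A sketch of my reading of the change (this is not the committed gcc.target/i386/pr64731.c testcase): a vector constructor stored through an address that ends up as a TARGET_MEM_REF, which forwprop can now decompose into element stores instead of spilling when the vector type has to be lowered:

/* Assumes a generic 32-byte vector that the target may have to lower,
   with reduced alignment so the cast below is valid.  */
typedef double v4df __attribute__ ((vector_size (32), aligned (8)));

void
store_ctor (double *p, const double *q, int n)
{
  for (int i = 0; i < n; i++)
    /* The destination is addressed via an induction variable, so it can be
       represented as a TARGET_MEM_REF; the source is a vector CTOR.  */
    *(v4df *) (p + 4 * i) = (v4df) { q[i], q[i] + 1, q[i] + 2, q[i] + 3 };
}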
2023-05-12  c++: remove redundant testcase [PR83258]  (Patrick Palka, 2 files, -9/+1)

I noticed only after the fact that the new testcase template/function2.C (from r14-708-gc3afdb8ba8f183) is just a subset of ext/visibility/anon8.C, so let's get rid of it.

PR c++/83258

gcc/testsuite/ChangeLog:

* g++.dg/ext/visibility/anon8.C: Mention PR83258.
* g++.dg/template/function2.C: Removed.
2023-05-12  c++: robustify testcase [PR109752]  (Patrick Palka, 2 files, -26/+13)

This rewrites the testcase for PR109752 to make it simpler and more robust (i.e. no longer dependent on r13-4035-gc41bbfcaf9d6ef).

PR c++/109752

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-pr109752.C: Rename to ...
* g++.dg/cpp2a/concepts-complete4.C: ... this. Rewrite.
2023-05-12  tree-optimization/109791 - simplify (unsigned)&foo - (unsigned)(&foo + o)  (Richard Biener, 1 file, -0/+12)

The following adds another variant of address difference simplification. The utility ptr_difference_const only handles constant differences (we also cannot code generate anything else), so exposing a possible POINTER_PLUS_EXPR in the match and computing the difference on the base only makes it possible to handle one case of a variable offset. This simplifies

  (unsigned long) &MEM <char[3]> [(void *)&str + 2B] - (unsigned long) (&str + (_69 + 1))

down to

  (1 - (unsigned long) _69)

during niter analysis, allowing ranger to eliminate a condition later and avoiding a bogus -Wstringop-overflow diagnostic for the testcase in the PR.

PR tree-optimization/109791
* match.pd (minus (convert ADDR_EXPR@0) (convert (pointer_plus @1 @2))): New pattern.
(minus (convert (pointer_plus @1 @2)) (convert ADDR_EXPR@0)): Likewise.
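A source-level sketch of the shape of expression involved (illustrative only; the gimple quoted above comes from niter analysis of the PR's testcase, not from this snippet):

char str[3];

unsigned long
diff (unsigned long off)
{
  /* For in-bounds values of off, (unsigned long) &str[2] minus
     (unsigned long) (str + (off + 1)) can now fold to 1 - off, exposing
     the relation between the two operands to later range analysis.  */
  return (unsigned long) &str[2] - (unsigned long) (str + (off + 1));
}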
2023-05-12  arm: [MVE intrinsics] rework vsriq  (Christophe Lyon, 5 files, -213/+5)
Implement vsriq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vsriq): New. * config/arm/arm-mve-builtins-base.def (vsriq): New. * config/arm/arm-mve-builtins-base.h (vsriq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vsriq. * config/arm/arm_mve.h (vsriq): Remove. (vsriq_m): Remove. (vsriq_n_u8): Remove. (vsriq_n_s8): Remove. (vsriq_n_u16): Remove. (vsriq_n_s16): Remove. (vsriq_n_u32): Remove. (vsriq_n_s32): Remove. (vsriq_m_n_s8): Remove. (vsriq_m_n_u8): Remove. (vsriq_m_n_s16): Remove. (vsriq_m_n_u16): Remove. (vsriq_m_n_s32): Remove. (vsriq_m_n_u32): Remove. (__arm_vsriq_n_u8): Remove. (__arm_vsriq_n_s8): Remove. (__arm_vsriq_n_u16): Remove. (__arm_vsriq_n_s16): Remove. (__arm_vsriq_n_u32): Remove. (__arm_vsriq_n_s32): Remove. (__arm_vsriq_m_n_s8): Remove. (__arm_vsriq_m_n_u8): Remove. (__arm_vsriq_m_n_s16): Remove. (__arm_vsriq_m_n_u16): Remove. (__arm_vsriq_m_n_s32): Remove. (__arm_vsriq_m_n_u32): Remove. (__arm_vsriq): Remove. (__arm_vsriq_m): Remove.
2023-05-12  arm: [MVE intrinsics] factorize vsriq  (Christophe Lyon, 2 files, -4/+6)

Factorize vsriq builtins so that they use parameterized names.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (mve_insn): Add vsri.
* config/arm/mve.md (mve_vsriq_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_n_<supf><mode>): ... this.
(mve_vsriq_m_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_m_n_<supf><mode>): ... this.
2023-05-12  arm: [MVE intrinsics] add ternary_rshift shape  (Christophe Lyon, 2 files, -0/+39)

This patch adds the ternary_rshift shape description.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (ternary_rshift): New.
* config/arm/arm-mve-builtins-shapes.h (ternary_rshift): New.
2023-05-12  arm: [MVE intrinsics] rework vsliq  (Christophe Lyon, 5 files, -213/+5)
Implement vsliq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vsliq): New. * config/arm/arm-mve-builtins-base.def (vsliq): New. * config/arm/arm-mve-builtins-base.h (vsliq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vsliq. * config/arm/arm_mve.h (vsliq): Remove. (vsliq_m): Remove. (vsliq_n_u8): Remove. (vsliq_n_s8): Remove. (vsliq_n_u16): Remove. (vsliq_n_s16): Remove. (vsliq_n_u32): Remove. (vsliq_n_s32): Remove. (vsliq_m_n_s8): Remove. (vsliq_m_n_s32): Remove. (vsliq_m_n_s16): Remove. (vsliq_m_n_u8): Remove. (vsliq_m_n_u32): Remove. (vsliq_m_n_u16): Remove. (__arm_vsliq_n_u8): Remove. (__arm_vsliq_n_s8): Remove. (__arm_vsliq_n_u16): Remove. (__arm_vsliq_n_s16): Remove. (__arm_vsliq_n_u32): Remove. (__arm_vsliq_n_s32): Remove. (__arm_vsliq_m_n_s8): Remove. (__arm_vsliq_m_n_s32): Remove. (__arm_vsliq_m_n_s16): Remove. (__arm_vsliq_m_n_u8): Remove. (__arm_vsliq_m_n_u32): Remove. (__arm_vsliq_m_n_u16): Remove. (__arm_vsliq): Remove. (__arm_vsliq_m): Remove.
2023-05-12  arm: [MVE intrinsics] factorize vsliq  (Christophe Lyon, 2 files, -4/+6)

Factorize vsliq builtins so that they use parameterized names.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (mve_insn): Add vsli.
* config/arm/mve.md (mve_vsliq_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_n_<supf><mode>): ... this.
(mve_vsliq_m_n_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_m_n_<supf><mode>): ... this.
2023-05-12  arm: [MVE intrinsics] add ternary_lshift shape  (Christophe Lyon, 2 files, -0/+39)

This patch adds the ternary_lshift shape description.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (ternary_lshift): New.
* config/arm/arm-mve-builtins-shapes.h (ternary_lshift): New.
2023-05-12  arm: [MVE intrinsics] rework vpselq  (Christophe Lyon, 4 files, -177/+4)
Implement vpselq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vpselq): New. * config/arm/arm-mve-builtins-base.def (vpselq): New. * config/arm/arm-mve-builtins-base.h (vpselq): New. * config/arm/arm_mve.h (vpselq): Remove. (vpselq_u8): Remove. (vpselq_s8): Remove. (vpselq_u16): Remove. (vpselq_s16): Remove. (vpselq_u32): Remove. (vpselq_s32): Remove. (vpselq_u64): Remove. (vpselq_s64): Remove. (vpselq_f16): Remove. (vpselq_f32): Remove. (__arm_vpselq_u8): Remove. (__arm_vpselq_s8): Remove. (__arm_vpselq_u16): Remove. (__arm_vpselq_s16): Remove. (__arm_vpselq_u32): Remove. (__arm_vpselq_s32): Remove. (__arm_vpselq_u64): Remove. (__arm_vpselq_s64): Remove. (__arm_vpselq_f16): Remove. (__arm_vpselq_f32): Remove. (__arm_vpselq): Remove.
2023-05-12  arm: [MVE intrinsics] add vpsel shape  (Christophe Lyon, 2 files, -0/+40)

This patch adds the vpsel shape description.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (vpsel): New.
* config/arm/arm-mve-builtins-shapes.h (vpsel): New.
2023-05-12  arm: [MVE intrinsics] factorize vpselq  (Christophe Lyon, 3 files, -13/+18)

Factorize vpselq builtins so that they use parameterized names.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm.cc (arm_expand_vcond): Use gen_mve_q instead of gen_mve_vpselq.
* config/arm/iterators.md (MVE_VPSELQ_F): New.
(mve_insn): Add vpsel.
* config/arm/mve.md (@mve_vpselq_<supf><mode>): Rename into ...
(@mve_<mve_insn>q_<supf><mode>): ... this.
(@mve_vpselq_f<mode>): Rename into ...
(@mve_<mve_insn>q_f<mode>): ... this.
2023-05-12  arm: [MVE intrinsics] rework vfmaq vfmasq vfmsq  (Christophe Lyon, 5 files, -292/+12)
Implement vfmaq, vfmasq, vfmsq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (vfmaq, vfmasq, vfmsq): New. * config/arm/arm-mve-builtins-base.def (vfmaq, vfmasq, vfmsq): New. * config/arm/arm-mve-builtins-base.h (vfmaq, vfmasq, vfmsq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vfmaq, vfmasq, vfmsq. * config/arm/arm_mve.h (vfmaq): Remove. (vfmasq): Remove. (vfmsq): Remove. (vfmaq_m): Remove. (vfmasq_m): Remove. (vfmsq_m): Remove. (vfmaq_f16): Remove. (vfmaq_n_f16): Remove. (vfmasq_n_f16): Remove. (vfmsq_f16): Remove. (vfmaq_f32): Remove. (vfmaq_n_f32): Remove. (vfmasq_n_f32): Remove. (vfmsq_f32): Remove. (vfmaq_m_f32): Remove. (vfmaq_m_f16): Remove. (vfmaq_m_n_f32): Remove. (vfmaq_m_n_f16): Remove. (vfmasq_m_n_f32): Remove. (vfmasq_m_n_f16): Remove. (vfmsq_m_f32): Remove. (vfmsq_m_f16): Remove. (__arm_vfmaq_f16): Remove. (__arm_vfmaq_n_f16): Remove. (__arm_vfmasq_n_f16): Remove. (__arm_vfmsq_f16): Remove. (__arm_vfmaq_f32): Remove. (__arm_vfmaq_n_f32): Remove. (__arm_vfmasq_n_f32): Remove. (__arm_vfmsq_f32): Remove. (__arm_vfmaq_m_f32): Remove. (__arm_vfmaq_m_f16): Remove. (__arm_vfmaq_m_n_f32): Remove. (__arm_vfmaq_m_n_f16): Remove. (__arm_vfmasq_m_n_f32): Remove. (__arm_vfmasq_m_n_f16): Remove. (__arm_vfmsq_m_f32): Remove. (__arm_vfmsq_m_f16): Remove. (__arm_vfmaq): Remove. (__arm_vfmasq): Remove. (__arm_vfmsq): Remove. (__arm_vfmaq_m): Remove. (__arm_vfmasq_m): Remove. (__arm_vfmsq_m): Remove.
2023-05-12  arm: [MVE intrinsics] factorize vfmaq vfmsq vfmasq  (Christophe Lyon, 2 files, -108/+35)

Factorize vfmaq, vfmasq and vfmsq builtins so that they use parameterized names.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/iterators.md (MVE_FP_M_BINARY): Add VFMAQ_M_F, VFMSQ_M_F.
(MVE_FP_M_N_BINARY): Add VFMAQ_M_N_F, VFMASQ_M_N_F.
(MVE_VFMxQ_F, MVE_VFMAxQ_N_F): New.
(mve_insn): Add vfma, vfmas, vfms.
* config/arm/mve.md (mve_vfmaq_f<mode>, mve_vfmsq_f<mode>): Merge into ...
(@mve_<mve_insn>q_f<mode>): ... this.
(mve_vfmaq_n_f<mode>, mve_vfmasq_n_f<mode>): Merge into ...
(@mve_<mve_insn>q_n_f<mode>): ... this.
(mve_vfmaq_m_f<mode>, mve_vfmsq_m_f<mode>): Merge into @mve_<mve_insn>q_m_f<mode>.
(mve_vfmaq_m_n_f<mode>, mve_vfmasq_m_n_f<mode>): Merge into @mve_<mve_insn>q_m_n_f<mode>.
2023-05-12  arm: [MVE intrinsics] add ternary_opt_n shape  (Christophe Lyon, 2 files, -0/+31)

This patch adds the ternary_opt_n shape description.

2022-12-12  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config/arm/arm-mve-builtins-shapes.cc (ternary_opt_n): New.
* config/arm/arm-mve-builtins-shapes.h (ternary_opt_n): New.
2023-05-12  arm: [MVE intrinsics] rework vmvnq  (Christophe Lyon, 4 files, -438/+12)
Implement vmvnq using the new MVE builtins framework. 2022-12-12 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/arm/arm-mve-builtins-base.cc (FUNCTION_WITH_RTX_M_N_NO_F): New. (vmvnq): New. * config/arm/arm-mve-builtins-base.def (vmvnq): New. * config/arm/arm-mve-builtins-base.h (vmvnq): New. * config/arm/arm_mve.h (vmvnq): Remove. (vmvnq_m): Remove. (vmvnq_x): Remove. (vmvnq_s8): Remove. (vmvnq_s16): Remove. (vmvnq_s32): Remove. (vmvnq_n_s16): Remove. (vmvnq_n_s32): Remove. (vmvnq_u8): Remove. (vmvnq_u16): Remove. (vmvnq_u32): Remove. (vmvnq_n_u16): Remove. (vmvnq_n_u32): Remove. (vmvnq_m_u8): Remove. (vmvnq_m_s8): Remove. (vmvnq_m_u16): Remove. (vmvnq_m_s16): Remove. (vmvnq_m_u32): Remove. (vmvnq_m_s32): Remove. (vmvnq_m_n_s16): Remove. (vmvnq_m_n_u16): Remove. (vmvnq_m_n_s32): Remove. (vmvnq_m_n_u32): Remove. (vmvnq_x_s8): Remove. (vmvnq_x_s16): Remove. (vmvnq_x_s32): Remove. (vmvnq_x_u8): Remove. (vmvnq_x_u16): Remove. (vmvnq_x_u32): Remove. (vmvnq_x_n_s16): Remove. (vmvnq_x_n_s32): Remove. (vmvnq_x_n_u16): Remove. (vmvnq_x_n_u32): Remove. (__arm_vmvnq_s8): Remove. (__arm_vmvnq_s16): Remove. (__arm_vmvnq_s32): Remove. (__arm_vmvnq_n_s16): Remove. (__arm_vmvnq_n_s32): Remove. (__arm_vmvnq_u8): Remove. (__arm_vmvnq_u16): Remove. (__arm_vmvnq_u32): Remove. (__arm_vmvnq_n_u16): Remove. (__arm_vmvnq_n_u32): Remove. (__arm_vmvnq_m_u8): Remove. (__arm_vmvnq_m_s8): Remove. (__arm_vmvnq_m_u16): Remove. (__arm_vmvnq_m_s16): Remove. (__arm_vmvnq_m_u32): Remove. (__arm_vmvnq_m_s32): Remove. (__arm_vmvnq_m_n_s16): Remove. (__arm_vmvnq_m_n_u16): Remove. (__arm_vmvnq_m_n_s32): Remove. (__arm_vmvnq_m_n_u32): Remove. (__arm_vmvnq_x_s8): Remove. (__arm_vmvnq_x_s16): Remove. (__arm_vmvnq_x_s32): Remove. (__arm_vmvnq_x_u8): Remove. (__arm_vmvnq_x_u16): Remove. (__arm_vmvnq_x_u32): Remove. (__arm_vmvnq_x_n_s16): Remove. (__arm_vmvnq_x_n_s32): Remove. (__arm_vmvnq_x_n_u16): Remove. (__arm_vmvnq_x_n_u32): Remove. (__arm_vmvnq): Remove. (__arm_vmvnq_m): Remove. (__arm_vmvnq_x): Remove.