2024-10-18  [PATCH 1/7] RISC-V: Fix indentation in riscv_vector::expand_block_move [NFC]  (Craig Blackmore; 1 file, -16/+16)
gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Fix indentation.
2024-10-18  i386: Fix the order of operands in andn<MMXMODEI:mode>3 [PR117192]  (Uros Bizjak; 2 files, -3/+19)
Fix the order of operands in andn<MMXMODEI:mode>3 expander to comply with the specification, where bitwise-complement applies to operand 2. PR target/117192 gcc/ChangeLog: * config/i386/mmx.md (andn<MMXMODEI:mode>3): Swap operand indexes 1 and 2 to comply with andn specification. gcc/testsuite/ChangeLog: * gcc.target/i386/pr117192.c: New test.
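As a hedged illustration (the function name here is made up, not a GCC internal), the semantics the andn<mode>3 expander must implement are op0 = op1 & ~op2, i.e. the complement applies to operand 2; the reported bug effectively complemented the wrong operand:

```cpp
#include <cstdint>

// Scalar model of andn<mode>3 semantics: op0 = op1 & ~op2.
// The swapped (buggy) form would instead compute ~op1 & op2.
inline uint32_t andn(uint32_t op1, uint32_t op2)
{
    return op1 & ~op2;
}
```

With op1 = 0xF0 and op2 = 0x30 this yields 0xC0, while the swapped form would give a different result, which is how the wrong operand order becomes observable.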
2024-10-18  libstdc++: Reuse std::__assign_one in <bits/ranges_algobase.h>  (Jonathan Wakely; 1 file, -16/+6)
Use std::__assign_one instead of ranges::__assign_one. Adjust the uses, because std::__assign_one has the arguments in the opposite order (the same order as an assignment expression). libstdc++-v3/ChangeLog: * include/bits/ranges_algobase.h (ranges::__assign_one): Remove. (__copy_or_move, __copy_or_move_backward): Use std::__assign_one instead of ranges::__assign_one. Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18  libstdc++: Add always_inline to some one-liners in <bits/stl_algobase.h>  (Jonathan Wakely; 1 file, -0/+14)
We implement std::copy, std::fill etc. as a series of calls to other overloads which incrementally peel off layers of iterator wrappers. This adds a high abstraction penalty for -O0 and potentially even -O1. Add the always_inline attribute to several functions that are just a single return statement (and maybe a static_assert, or some concept-checking assertions which are disabled by default). libstdc++-v3/ChangeLog: * include/bits/stl_algobase.h (__copy_move_a1, __copy_move_a) (__copy_move_backward_a1, __copy_move_backward_a, move_backward) (__fill_a1, __fill_a, fill, __fill_n_a, fill_n, __equal_aux): Add always_inline attribute to one-line forwarding functions. Reviewed-by: Patrick Palka <ppalka@redhat.com>
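A minimal sketch of the pattern (the names below are illustrative, not the real libstdc++ internals): each wrapper layer is a single return statement, and marking it always_inline means even an -O0 build pays no call per layer:

```cpp
// Innermost layer: the loop that does the actual work.
template<typename In, typename Out>
inline Out copy_loop(In first, In last, Out out)
{
    for (; first != last; ++first, ++out)
        *out = *first;
    return out;
}

// One-line forwarding layer, as described in the commit: with
// always_inline the wrapper disappears even in unoptimized builds.
template<typename In, typename Out>
[[gnu::always_inline]] inline Out
copy_fwd(In first, In last, Out out)
{ return copy_loop(first, last, out); }
```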
2024-10-18  libstdc++: Add nodiscard to std::find  (Jonathan Wakely; 1 file, -1/+1)
I missed this one out in r14-9478-gdf483ebd24689a but I don't think that was intentional. I see no reason std::find shouldn't be [[nodiscard]]. libstdc++-v3/ChangeLog: * include/bits/stl_algo.h (find): Add nodiscard. Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18  libstdc++: Inline memmove optimizations for std::copy etc. [PR115444]  (Jonathan Wakely; 7 files, -222/+461)
This removes all the __copy_move class template specializations that decide how to optimize std::copy and std::copy_n. We can inline those optimizations into the algorithms, using if-constexpr (and macros for C++98 compatibility) and remove the code dispatching to the various class template specializations. Doing this means we implement the optimization directly for std::copy_n instead of deferring to std::copy. That avoids the unwanted consequence of advancing the iterator in copy_n only to take the difference later to get back to the length that we already had in copy_n originally (as described in PR 115444). With the new flattened implementations, we can also lower contiguous iterators to pointers in std::copy/std::copy_n/std::copy_backward, so that they benefit from the same memmove optimizations as pointers. There's a subtlety though: contiguous iterators can potentially throw exceptions to exit the algorithm early. So we can only transform the loop to memmove if dereferencing the iterator is noexcept. We don't check that incrementing the iterator is noexcept because we advance the contiguous iterators before using memmove, so that if incrementing would throw, that happens first. I am writing a proposal (P3349R0) which would make this unnecessary, so I hope we can drop the nothrow requirements later. This change also solves PR 114817 by checking is_trivially_assignable before optimizing copy/copy_n etc. to memmove. It's not enough to check that the types are trivially copyable (a precondition for using memmove at all), we also need to check that the specific assignment that would be performed by the algorithm is also trivial. Replacing a non-trivial assignment with memmove would be observable, so not allowed. libstdc++-v3/ChangeLog: PR libstdc++/115444 PR libstdc++/114817 * include/bits/stl_algo.h (__copy_n): Remove generic overload and overload for random access iterators. (copy_n): Inline generic version of __copy_n here. 
Do not defer to std::copy for random access iterators. * include/bits/stl_algobase.h (__copy_move): Remove. (__nothrow_contiguous_iterator, __memcpyable_iterators): New concepts. (__assign_one, _GLIBCXX_TO_ADDR, _GLIBCXX_ADVANCE): New helpers. (__copy_move_a2): Inline __copy_move logic and conditional memmove optimization into the most generic overload. (__copy_n_a): Likewise. (__copy_move_backward): Remove. (__copy_move_backward_a2): Inline __copy_move_backward logic and memmove optimization into the most generic overload. * testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc: New test. * testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc: New test. * testsuite/25_algorithms/copy/114817.cc: New test. * testsuite/25_algorithms/copy/115444.cc: New test. * testsuite/25_algorithms/copy_n/114817.cc: New test. Reviewed-by: Patrick Palka <ppalka@redhat.com>
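The trivially-copyable-but-non-trivial-assignment distinction can be demonstrated with a small sketch (illustrative types and a made-up predicate name, not the actual libstdc++ code): a type can be trivially copyable while the specific assignment an algorithm performs selects a user-provided operator= that memmove would silently bypass.

```cpp
#include <type_traits>

// T is trivially copyable (its copy operations are trivial), but
// assigning to it from an int goes through a user-provided operator=
// with an observable effect, which memmove would skip.
struct T
{
    int v;
    T& operator=(const T&) = default;
    T& operator=(int x) { v = x + 1; return *this; }  // observable side effect
};

// Sketch of the kind of check the commit describes: both conditions
// must hold before the loop may be replaced with memmove.
template<typename Out, typename In>
constexpr bool can_use_memmove =
    std::is_trivially_copyable_v<Out>
    && std::is_trivially_assignable_v<Out&, const In&>;
```

So copying a range of T from a range of T may use memmove, but copying into T from a range of int may not, even though T itself is trivially copyable.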
2024-10-18  libstdc++: Make __normal_iterator constexpr, always_inline, nodiscard  (Jonathan Wakely; 1 file, -45/+71)
The __gnu_cxx::__normal_iterator type we use for std::vector::iterator is not specified by the standard, it's an implementation detail. This means it's not constrained by the rule that forbids strengthening constexpr. We can make it meet the constexpr iterator requirements for older standards, not only when it's required to be for C++20. For the non-const member functions they can't be constexpr in C++11, so use _GLIBCXX14_CONSTEXPR for those. For all constructors, const members and non-member operator overloads, use _GLIBCXX_CONSTEXPR or just constexpr. We can also liberally add [[nodiscard]] and [[gnu::always_inline]] attributes to those functions. Also change some internal helpers for std::move_iterator which can be unconditionally constexpr and marked nodiscard. libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (__normal_iterator): Make all members and overloaded operators constexpr before C++20, and add always_inline attribute (__to_address): Add nodiscard and always_inline attributes. (__make_move_if_noexcept_iterator): Add nodiscard and make unconditionally constexpr. (__niter_base(__normal_iterator), __niter_base(Iter)): Add nodiscard and always_inline attributes. (__niter_base(reverse_iterator), __niter_base(move_iterator)) (__miter_base): Add inline. (__niter_wrap(From, To)): Add nodiscard attribute. (__niter_wrap(const Iter&, Iter)): Add nodiscard and always_inline attributes. Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18  libstdc++: Refactor std::uninitialized_{copy,fill,fill_n} algos [PR68350]  (Jonathan Wakely; 10 files, -106/+324)
This refactors the std::uninitialized_copy, std::uninitialized_fill and std::uninitialized_fill_n algorithms to directly perform memcpy/memset optimizations instead of dispatching to std::copy/std::fill/std::fill_n. The reasons for this are: - Use 'if constexpr' to simplify and optimize compilation throughput, so dispatching to specialized class templates is only needed for C++98 mode. - Use memcpy instead of memmove, because the conditions on non-overlapping ranges are stronger for std::uninitialized_copy than for std::copy. Using memcpy might be a minor optimization. - No special case for creating a range of one element, which std::copy needs to deal with (see PR libstdc++/108846). The uninitialized algos create new objects, which reuses storage and is allowed to clobber tail padding. - Relax the conditions for using memcpy/memset, because the C++20 rules on implicit-lifetime types mean that we can rely on memcpy to begin lifetimes of trivially copyable types. We don't need to require trivially default constructible, so don't need to limit the optimization to trivial types. See PR 68350 for more details. - Remove the dependency on std::copy and std::fill. This should mean that stl_uninitialized.h no longer needs to include all of stl_algobase.h. This isn't quite true yet, because we still use std::fill in __uninitialized_default and still use std::fill_n in __uninitialized_default_n. That will be fixed later. Several tests need changes to the diagnostics matched by dg-error because we no longer use the __constructible() function that had a static assert in. Now we just get straightforward errors for attempting to use a deleted constructor. Two tests needed more significant changes to the actual expected results of executing the tests, because they were checking for old behaviour which was incorrect according to the standard. 
20_util/specialized_algorithms/uninitialized_copy/64476.cc was expecting std::copy to be used for a call to std::uninitialized_copy involving two trivially copyable types. That was incorrect behaviour, because a non-trivial constructor should have been used, but using std::copy used trivial default initialization followed by assignment. 20_util/specialized_algorithms/uninitialized_fill_n/sizes.cc was testing the behaviour with a non-integral Size passed to uninitialized_fill_n, but I wrote the test looking at the requirements of uninitialized_copy_n which are not the same as uninitialized_fill_n. The former uses --n and tests n > 0, but the latter just tests n-- (which will never be false for a floating-point value with a fractional part). libstdc++-v3/ChangeLog: PR libstdc++/68350 PR libstdc++/93059 * include/bits/stl_uninitialized.h (__check_constructible) (_GLIBCXX_USE_ASSIGN_FOR_INIT): Remove. [C++98] (__unwrappable_niter): New trait. (__uninitialized_copy<true>): Replace use of std::copy. (uninitialized_copy): Fix Doxygen comments. Open-code memcpy optimization for C++11 and later. (__uninitialized_fill<true>): Replace use of std::fill. (uninitialized_fill): Fix Doxygen comments. Open-code memset optimization for C++11 and later. (__uninitialized_fill_n<true>): Replace use of std::fill_n. (uninitialized_fill_n): Fix Doxygen comments. Open-code memset optimization for C++11 and later. * testsuite/20_util/specialized_algorithms/uninitialized_copy/64476.cc: Adjust expected behaviour to match what the standard specifies. * testsuite/20_util/specialized_algorithms/uninitialized_fill_n/sizes.cc: Likewise. * testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc: Adjust dg-error directives. * testsuite/20_util/specialized_algorithms/uninitialized_copy/89164.cc: Likewise. * testsuite/20_util/specialized_algorithms/uninitialized_copy_n/89164.cc: Likewise. * testsuite/20_util/specialized_algorithms/uninitialized_fill/89164.cc: Likewise. 
* testsuite/20_util/specialized_algorithms/uninitialized_fill_n/89164.cc: Likewise. * testsuite/23_containers/vector/cons/89164.cc: Likewise. * testsuite/23_containers/vector/cons/89164_c++17.cc: Likewise. Reviewed-by: Patrick Palka <ppalka@redhat.com>
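The two Size conventions contrasted above can be sketched with illustrative loops (not the library code): the copy_n style terminates for a fractional count, while the fill_n style never does, because a value stepping 2.5, 1.5, 0.5, -0.5, ... never compares equal to zero.

```cpp
// copy_n convention: test n > 0, then decrement. With n = 2.5 the
// tested values are 2.5, 1.5, 0.5, then -0.5 fails: 3 iterations.
int steps_copy_n(double n)
{
    int count = 0;
    for (; n > 0; --n)
        ++count;
    return count;
}

// fill_n convention: "while (n--)". For n = 2.5 the old value is
// always nonzero, so the loop would never terminate on its own;
// the cap parameter exists only to make that observable here.
int steps_fill_n(double n, int cap)
{
    int count = 0;
    while (n-- && count < cap)
        ++count;
    return count;
}
```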
2024-10-18  libstdc++: Move std::__niter_base and std::__niter_wrap to stl_iterator.h  (Jonathan Wakely; 3 files, -80/+132)
Move the functions for unwrapping and rewrapping __normal_iterator objects to the same file as the definition of __normal_iterator itself. This will allow a later commit to make use of std::__niter_base in other headers without having to include all of <bits/stl_algobase.h>. libstdc++-v3/ChangeLog: * include/bits/stl_algobase.h (__niter_base, __niter_wrap): Move to ... * include/bits/stl_iterator.h: ... here. (__niter_base, __miter_base): Move all overloads to the end of the header. * testsuite/24_iterators/normal_iterator/wrapping.cc: New test. Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18  SVE intrinsics: Add fold_active_lanes_to method to refactor svmul and svdiv.  (Jennifer Schmitz; 17 files, -94/+387)
As suggested in https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663275.html, this patch adds the method gimple_folder::fold_active_lanes_to (tree X). This method folds active lanes to X and sets inactive lanes according to the predication, returning a new gimple statement. That makes folding of SVE intrinsics easier and reduces code duplication in the svxxx_impl::fold implementations. Using this new method, svdiv_impl::fold and svmul_impl::fold were refactored. Additionally, the method was used for two optimizations: 1) fold svdiv to the dividend if the divisor is all ones, and 2) for svmul, if one of the operands is all ones, fold to the other operand. Both optimizations were previously applied to _x and _m predication on the RTL level, but not for _z, where svdiv/svmul were still being used. For both optimizations, codegen was improved by this patch, for example by skipping sel instructions with all-same operands and replacing sel instructions by mov instructions. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold): Refactor using fold_active_lanes_to and fold to the dividend if the divisor is all ones. (svmul_impl::fold): Refactor using fold_active_lanes_to and fold to the other operand, if one of the operands is all ones. * config/aarch64/aarch64-sve-builtins.h: Declare gimple_folder::fold_active_lanes_to (tree). * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::fold_active_lanes_to): Add new method to fold active lanes to the given argument, setting inactive lanes according to the predication. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/div_s32.c: Adjust expected outcome. * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise. 
* gcc.target/aarch64/sve/fold_div_zero.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_s16.c: New test. * gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise. * gcc.target/aarch64/sve/mul_const_run.c: Likewise.
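A scalar model of the fold, purely for illustration (not the GCC implementation): with zeroing predication, svmul_z(pg, a, ones) is lane-for-lane equivalent to just selecting a on active lanes and zero elsewhere, which is exactly what fold_active_lanes_to(a) produces.

```cpp
#include <array>
#include <cstddef>

// Per-lane model of svmul_z: active lanes get a*b, inactive lanes get 0.
template<std::size_t N>
std::array<int, N> svmul_z_model(const std::array<bool, N>& pg,
                                 const std::array<int, N>& a,
                                 const std::array<int, N>& b)
{
    std::array<int, N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = pg[i] ? a[i] * b[i] : 0;
    return r;
}

// The fold when b is all ones: keep a on active lanes, zero the rest,
// so no multiply (and no sel of identical operands) is needed.
template<std::size_t N>
std::array<int, N> fold_active_lanes_to_a(const std::array<bool, N>& pg,
                                          const std::array<int, N>& a)
{
    std::array<int, N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = pg[i] ? a[i] : 0;
    return r;
}
```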
2024-10-18  [5/n] remove trapv-*.c special-casing of gcc.dg/vect/ files  (Richard Biener; 2 files, -8/+4)
The following makes -ftrapv explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named trapv-* * gcc.dg/vect/trapv-vect-reduc-4.c: Add dg-additional-options -ftrapv.
2024-10-18  [4/n] remove wrapv-*.c special-casing of gcc.dg/vect/ files  (Richard Biener; 6 files, -14/+12)
The following makes -fwrapv explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named wrapv-* * gcc.dg/vect/wrapv-vect-7.c: Add dg-additional-options -fwrapv. * gcc.dg/vect/wrapv-vect-reduc-2char.c: Likewise. * gcc.dg/vect/wrapv-vect-reduc-2short.c: Likewise. * gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c: Likewise. * gcc.dg/vect/wrapv-vect-reduc-pattern-2c.c: Likewise.
2024-10-18  [3/n] remove fast-math-*.c special-casing of gcc.dg/vect/ files  (Richard Biener; 53 files, -22/+57)
The following makes -ffast-math explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named fast-math-* * gcc.dg/vect/fast-math-bb-slp-call-1.c: Add dg-additional-options -ffast-math. * gcc.dg/vect/fast-math-bb-slp-call-2.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-3.c: Likewise. * gcc.dg/vect/fast-math-ifcvt-1.c: Likewise. * gcc.dg/vect/fast-math-pr35982.c: Likewise. * gcc.dg/vect/fast-math-pr43074.c: Likewise. * gcc.dg/vect/fast-math-pr44152.c: Likewise. * gcc.dg/vect/fast-math-pr55281.c: Likewise. * gcc.dg/vect/fast-math-slp-27.c: Likewise. * gcc.dg/vect/fast-math-slp-38.c: Likewise. * gcc.dg/vect/fast-math-vect-call-1.c: Likewise. * gcc.dg/vect/fast-math-vect-call-2.c: Likewise. * gcc.dg/vect/fast-math-vect-complex-3.c: Likewise. * gcc.dg/vect/fast-math-vect-outer-7.c: Likewise. * gcc.dg/vect/fast-math-vect-pow-1.c: Likewise. * gcc.dg/vect/fast-math-vect-pow-2.c: Likewise. * gcc.dg/vect/fast-math-vect-pr25911.c: Likewise. * gcc.dg/vect/fast-math-vect-pr29925.c: Likewise. * gcc.dg/vect/fast-math-vect-reduc-5.c: Likewise. * gcc.dg/vect/fast-math-vect-reduc-7.c: Likewise. * gcc.dg/vect/fast-math-vect-reduc-8.c: Likewise. * gcc.dg/vect/fast-math-vect-reduc-9.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise. 
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-half-float.c: Likewise.
2024-10-18  [2/n] remove no-vfa-*.c special-casing of gcc.dg/vect/ files  (Richard Biener; 18 files, -9/+20)
The following makes --param vect-max-version-for-alias-checks=0 explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named no-vfa-* * gcc.dg/vect/no-vfa-pr29145.c: Add dg-additional-options --param vect-max-version-for-alias-checks=0. * gcc.dg/vect/no-vfa-vect-101.c: Likewise. * gcc.dg/vect/no-vfa-vect-102.c: Likewise. * gcc.dg/vect/no-vfa-vect-102a.c: Likewise. * gcc.dg/vect/no-vfa-vect-37.c: Likewise. * gcc.dg/vect/no-vfa-vect-43.c: Likewise. * gcc.dg/vect/no-vfa-vect-45.c: Likewise. * gcc.dg/vect/no-vfa-vect-49.c: Likewise. * gcc.dg/vect/no-vfa-vect-51.c: Likewise. * gcc.dg/vect/no-vfa-vect-53.c: Likewise. * gcc.dg/vect/no-vfa-vect-57.c: Likewise. * gcc.dg/vect/no-vfa-vect-61.c: Likewise. * gcc.dg/vect/no-vfa-vect-79.c: Likewise. * gcc.dg/vect/no-vfa-vect-depend-1.c: Likewise. * gcc.dg/vect/no-vfa-vect-depend-2.c: Likewise. * gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise. * gcc.dg/vect/no-vfa-vect-dv-2.c: Likewise.
2024-10-18  Adjust assert in vect_build_slp_tree_2  (Richard Biener; 1 file, -5/+1)
The assert in SLP discovery when we handle masked operations is confusingly wide: all gather variants should be caught by the earlier STMT_VINFO_GATHER_SCATTER_P check. * tree-vect-slp.cc (vect_build_slp_tree_2): Only expect IFN_MASK_LOAD for masked loads that are not STMT_VINFO_GATHER_SCATTER_P.
2024-10-18  MAINTAINERS: Add myself as pair fusion and aarch64 ldp/stp maintainer  (Alex Coplan; 1 file, -0/+2)
ChangeLog: * MAINTAINERS (CPU Port Maintainers): Add myself as aarch64 ldp/stp maintainer. (Various Maintainers): Add myself as pair fusion maintainer.
2024-10-18  testsuite: Add necessary dejagnu directives to pr115815_0.c  (Martin Jambor; 1 file, -0/+4)
I have received an email from the Linaro infrastructure that the test gcc.dg/lto/pr115815_0.c which I added is failing on arm-eabi, and I realized that not only is it missing dg-require-effective-target global_constructor, it actually has no dejagnu directives at all, which means it is unnecessarily running both at -O0 and -O2 and there is an unnecessary run test too. All fixed by this patch. I have not actually verified that the failure goes away on arm-eabi but have very high hopes it will. I have verified that the test still checks for the bug and also that it passes by running: make -k check-gcc RUNTESTFLAGS="lto.exp=*pr115815*" gcc/testsuite/ChangeLog: 2024-10-14 Martin Jambor <mjambor@suse.cz> * gcc.dg/lto/pr115815_0.c: Add dejagnu directives.
2024-10-18  middle-end: Fix GSI for gcond root [PR117140]  (Tamar Christina; 2 files, -1/+95)
When finding the gsi to use for the code of the root statements, we should use that of the original statement rather than the gcond, which may be inside a pattern. Without this the emitted instructions may be discarded later. gcc/ChangeLog: PR tree-optimization/117140 * tree-vect-slp.cc (vectorize_slp_instance_root_stmt): Use gsi from original statement. gcc/testsuite/ChangeLog: PR tree-optimization/117140 * gcc.dg/vect/vect-early-break_129-pr117140.c: New test.
2024-10-18  middle-end: Fix VEC_PERM_EXPR lowering since relaxation of vector sizes  (Tamar Christina; 2 files, -3/+20)
In GCC 14 VEC_PERM_EXPR was relaxed to be able to permute to a 2x larger vector than the size of the input vectors. However various passes and transformations were not updated to account for this. I have patches in this area that I will be upstreaming individually as they expose issues. This one is that the veclower pass tries to lower based on the size of the input vectors rather than the size of the output. As a consequence it creates an invalid vector of half the size. Luckily we ICE because the resulting nunits doesn't match the vector size. gcc/ChangeLog: * tree-vect-generic.cc (lower_vec_perm): Use output vector size instead of input vector when determining output nunits. gcc/testsuite/ChangeLog: * gcc.dg/vec-perm-lower.c: New test.
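For reference, a scalar model of the relaxed VEC_PERM_EXPR semantics (illustrative only): the selector, and therefore the output, may have twice as many elements as each input, which is why the number of output units must be derived from the output vector rather than the inputs.

```cpp
#include <cstddef>
#include <vector>

// result[i] = sel[i] < n ? a[sel[i]] : b[sel[i] - n], where n is the
// element count of each *input*; the result has sel.size() elements,
// which since GCC 14 may be 2 * n.
std::vector<int> vec_perm(const std::vector<int>& a,
                          const std::vector<int>& b,
                          const std::vector<unsigned>& sel)
{
    const std::size_t n = a.size();
    std::vector<int> r(sel.size());
    for (std::size_t i = 0; i < sel.size(); ++i)
        r[i] = sel[i] < n ? a[sel[i]] : b[sel[i] - n];
    return r;
}
```

Deriving the result size from the inputs (n) instead of the selector would halve it, which matches the invalid-half-size vector described above.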
2024-10-18  AArch64: use movi d0, #0 to clear SVE registers instead of mov z0.d, #0  (Tamar Christina; 1 file, -2/+5)
This patch changes SVE to use Adv. SIMD movi 0 to clear SVE registers when not in SVE streaming mode, since the Neoverse Software Optimization Guides indicate that SVE mov #0 is not a zero-cost move. When in streaming mode we continue to use SVE's mov to clear the registers. Tests have already been updated. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_output_sve_mov_immediate): Use fmov for SVE zeros.
2024-10-18  AArch64: support encoding integer immediates using floating point moves  (Tamar Christina; 2 files, -128/+241)
This patch extends our immediate SIMD generation cases to support generating integer immediates using floating point moves when the integer immediate maps to an exact FP value. As an example: uint32x4_t f1() { return vdupq_n_u32(0x3f800000); } currently generates: f1: adrp x0, .LC0 ldr q0, [x0, #:lo12:.LC0] ret i.e. a load, but with this change: f1: fmov v0.4s, 1.0e+0 ret Such immediates are common in e.g. our Math routines in glibc because they are created to extract or mark part of an FP immediate as masks. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_sve_valid_immediate, aarch64_simd_valid_immediate): Refactor accepting modes and values. (aarch64_float_const_representable_p): Refactor and extract FP checks into ... (aarch64_real_float_const_representable_p): ... This, and fix the fallback when real_to_integer fails. (aarch64_advsimd_valid_immediate): Use it. gcc/testsuite/ChangeLog: * gcc.target/aarch64/const_create_using_fmov.c: New test.
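The example works because 0x3f800000 is exactly the IEEE-754 single-precision encoding of 1.0f, so the integer can be materialized as an FP move of that value. The bit-pattern correspondence can be checked directly (illustrative helper, not GCC code):

```cpp
#include <cstdint>
#include <cstring>

// Returns true if the 32-bit integer immediate is bit-identical to the
// given float, i.e. it could in principle be materialized with an FP
// move of that value instead of a literal-pool load.
bool matches_fp_encoding(uint32_t imm, float f)
{
    uint32_t bits;
    static_assert(sizeof bits == sizeof f, "expect 32-bit float");
    std::memcpy(&bits, &f, sizeof bits);
    return bits == imm;
}
```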
2024-10-18  AArch64: update testsuite to account for new zero moves  (Tamar Christina; 52 files, -412/+410)
The patch series will adjust how zeros are created. In principle the exact lane size a zero gets created on doesn't matter, but it makes the tests a bit fragile. This preparation patch updates the testsuite to accept multiple ways of creating vector zeros: both the current syntax and the one being transitioned to in the series. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ldp_stp_18.c: Update zero regexpr. * gcc.target/aarch64/memset-corner-cases.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_bf16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_f16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_f32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_f64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_s16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_s32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_s64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_s8.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_u16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_u32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_u64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acge_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acge_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acge_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acgt_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acgt_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acgt_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acle_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acle_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acle_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/aclt_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/aclt_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/aclt_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/bic_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/bic_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/cmpuo_f16.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/cmpuo_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/cmpuo_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_u8.c: Likewise. * gcc.target/aarch64/sve/const_fold_div_1.c: Likewise. * gcc.target/aarch64/sve/const_fold_mul_1.c: Likewise. * gcc.target/aarch64/sve/dup_imm_1.c: Likewise. * gcc.target/aarch64/sve/fdup_1.c: Likewise. * gcc.target/aarch64/sve/fold_div_zero.c: Likewise. * gcc.target/aarch64/sve/fold_mul_zero.c: Likewise. * gcc.target/aarch64/sve/pcs/args_2.c: Likewise. * gcc.target/aarch64/sve/pcs/args_3.c: Likewise. * gcc.target/aarch64/sve/pcs/args_4.c: Likewise. * gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
2024-10-18  arm: [MVE intrinsics] use long_type_suffix / half_type_suffix helpers  (Christophe Lyon; 1 file, -46/+68)
In several places we are looking for a type twice or half as large as the type suffix: this patch introduces helper functions to avoid code duplication. long_type_suffix is similar to the SVE counterpart, but adds an 'expected_tclass' parameter. half_type_suffix is similar to it, but does not exist in SVE. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (long_type_suffix): New. (half_type_suffix): New. (struct binary_move_narrow_def): Use new helper. (struct binary_move_narrow_unsigned_def): Likewise. (struct binary_rshift_narrow_def): Likewise. (struct binary_rshift_narrow_unsigned_def): Likewise. (struct binary_widen_def): Likewise. (struct binary_widen_n_def): Likewise. (struct binary_widen_opt_n_def): Likewise. (struct unary_widen_def): Likewise.
2024-10-18  arm: [MVE intrinsics] rework vsbcq vsbciq  (Christophe Lyon; 4 files, -188/+42)
Implement vsbcq vsbciq using the new MVE builtins framework. We re-use most of the code introduced by the previous patches. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): Add support for vsbciq and vsbcq. (vadciq, vadcq): Add new parameter. (vsbciq): New. (vsbcq): New. * config/arm/arm-mve-builtins-base.def (vsbciq): New. (vsbcq): New. * config/arm/arm-mve-builtins-base.h (vsbciq): New. (vsbcq): New. * config/arm/arm_mve.h (vsbciq): Delete. (vsbciq_m): Delete. (vsbcq): Delete. (vsbcq_m): Delete. (vsbciq_s32): Delete. (vsbciq_u32): Delete. (vsbciq_m_s32): Delete. (vsbciq_m_u32): Delete. (vsbcq_s32): Delete. (vsbcq_u32): Delete. (vsbcq_m_s32): Delete. (vsbcq_m_u32): Delete. (__arm_vsbciq_s32): Delete. (__arm_vsbciq_u32): Delete. (__arm_vsbciq_m_s32): Delete. (__arm_vsbciq_m_u32): Delete. (__arm_vsbcq_s32): Delete. (__arm_vsbcq_u32): Delete. (__arm_vsbcq_m_s32): Delete. (__arm_vsbcq_m_u32): Delete. (__arm_vsbciq): Delete. (__arm_vsbciq_m): Delete. (__arm_vsbcq): Delete. (__arm_vsbcq_m): Delete.
2024-10-18  arm: [MVE intrinsics] rework vadcq  (Christophe Lyon; 4 files, -94/+56)
Implement vadcq using the new MVE builtins framework. We re-use most of the code introduced by the previous patch to support vadciq: we just need to initialize carry from the input parameter. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (vadcq_vsbc): Add support for vadcq. * config/arm/arm-mve-builtins-base.def (vadcq): New. * config/arm/arm-mve-builtins-base.h (vadcq): New. * config/arm/arm_mve.h (vadcq): Delete. (vadcq_m): Delete. (vadcq_s32): Delete. (vadcq_u32): Delete. (vadcq_m_s32): Delete. (vadcq_m_u32): Delete. (__arm_vadcq_s32): Delete. (__arm_vadcq_u32): Delete. (__arm_vadcq_m_s32): Delete. (__arm_vadcq_m_u32): Delete. (__arm_vadcq): Delete. (__arm_vadcq_m): Delete.
2024-10-18  arm: [MVE intrinsics] rework vadciq  (Christophe Lyon; 4 files, -89/+95)
Implement vadciq using the new MVE builtins framework. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): New. (vadciq): New. * config/arm/arm-mve-builtins-base.def (vadciq): New. * config/arm/arm-mve-builtins-base.h (vadciq): New. * config/arm/arm_mve.h (vadciq): Delete. (vadciq_m): Delete. (vadciq_s32): Delete. (vadciq_u32): Delete. (vadciq_m_s32): Delete. (vadciq_m_u32): Delete. (__arm_vadciq_s32): Delete. (__arm_vadciq_u32): Delete. (__arm_vadciq_m_s32): Delete. (__arm_vadciq_m_u32): Delete. (__arm_vadciq): Delete. (__arm_vadciq_m): Delete.
2024-10-18  arm: [MVE intrinsics] factorize vadc vadci vsbc vsbci  (Christophe Lyon; 2 files, -109/+42)
Factorize vadc/vsbc and vadci/vsbci so that they use the same parameterized names. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (mve_insn): Add VADCIQ_M_S, VADCIQ_M_U, VADCIQ_U, VADCIQ_S, VADCQ_M_S, VADCQ_M_U, VADCQ_S, VADCQ_U, VSBCIQ_M_S, VSBCIQ_M_U, VSBCIQ_S, VSBCIQ_U, VSBCQ_M_S, VSBCQ_M_U, VSBCQ_S, VSBCQ_U. (VADCIQ, VSBCIQ): Merge into ... (VxCIQ): ... this. (VADCIQ_M, VSBCIQ_M): Merge into ... (VxCIQ_M): ... this. (VSBCQ, VADCQ): Merge into ... (VxCQ): ... this. (VSBCQ_M, VADCQ_M): Merge into ... (VxCQ_M): ... this. * config/arm/mve.md (mve_vadciq_<supf>v4si, mve_vsbciq_<supf>v4si): Merge into ... (@mve_<mve_insn>q_<supf>v4si): ... this. (mve_vadciq_m_<supf>v4si, mve_vsbciq_m_<supf>v4si): Merge into ... (@mve_<mve_insn>q_m_<supf>v4si): ... this. (mve_vadcq_<supf>v4si, mve_vsbcq_<supf>v4si): Merge into ... (@mve_<mve_insn>q_<supf>v4si): ... this. (mve_vadcq_m_<supf>v4si, mve_vsbcq_m_<supf>v4si): Merge into ... (@mve_<mve_insn>q_m_<supf>v4si): ... this.
2024-10-18arm: [MVE intrinsics] add vadc_vsbc shapeChristophe Lyon2-0/+37
This patch adds the vadc_vsbc shape description. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (vadc_vsbc): New. * config/arm/arm-mve-builtins-shapes.h (vadc_vsbc): New.
2024-10-18arm: [MVE intrinsics] remove vshlcq useless expandersChristophe Lyon3-81/+0
Since we rewrote the implementation of vshlcq intrinsics, we no longer need these expanders. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-builtins.cc (arm_ternop_unone_none_unone_imm_qualifiers) (arm_ternop_none_none_unone_imm_qualifiers): Delete. * config/arm/arm_mve_builtins.def (vshlcq_m_vec_s) (vshlcq_m_carry_s, vshlcq_m_vec_u, vshlcq_m_carry_u): Delete. * config/arm/mve.md (mve_vshlcq_vec_<supf><mode>): Delete. (mve_vshlcq_carry_<supf><mode>): Delete. (mve_vshlcq_m_vec_<supf><mode>): Delete. (mve_vshlcq_m_carry_<supf><mode>): Delete.
2024-10-18arm: [MVE intrinsics] rework vshlcqChristophe Lyon6-235/+77
Implement vshlc using the new MVE builtins framework. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (class vshlc_impl): New. (vshlc): New. * config/arm/arm-mve-builtins-base.def (vshlcq): New. * config/arm/arm-mve-builtins-base.h (vshlcq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vshlc. * config/arm/arm_mve.h (vshlcq): Delete. (vshlcq_m): Delete. (vshlcq_s8): Delete. (vshlcq_u8): Delete. (vshlcq_s16): Delete. (vshlcq_u16): Delete. (vshlcq_s32): Delete. (vshlcq_u32): Delete. (vshlcq_m_s8): Delete. (vshlcq_m_u8): Delete. (vshlcq_m_s16): Delete. (vshlcq_m_u16): Delete. (vshlcq_m_s32): Delete. (vshlcq_m_u32): Delete. (__arm_vshlcq_s8): Delete. (__arm_vshlcq_u8): Delete. (__arm_vshlcq_s16): Delete. (__arm_vshlcq_u16): Delete. (__arm_vshlcq_s32): Delete. (__arm_vshlcq_u32): Delete. (__arm_vshlcq_m_s8): Delete. (__arm_vshlcq_m_u8): Delete. (__arm_vshlcq_m_s16): Delete. (__arm_vshlcq_m_u16): Delete. (__arm_vshlcq_m_s32): Delete. (__arm_vshlcq_m_u32): Delete. (__arm_vshlcq): Delete. (__arm_vshlcq_m): Delete. * config/arm/mve.md (mve_vshlcq_<supf><mode>): Add '@' prefix. (mve_vshlcq_m_<supf><mode>): Likewise.
2024-10-18arm: [MVE intrinsics] add vshlc shapeChristophe Lyon2-0/+45
This patch adds the vshlc shape description. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (vshlc): New. * config/arm/arm-mve-builtins-shapes.h (vshlc): New.
2024-10-18arm: [MVE intrinsics] remove useless v[id]wdup expandersChristophe Lyon3-90/+0
Like with vddup/vidup, we use code_for_mve_q_wb_u_insn, so we can drop the expanders and their declarations as builtins, now useless. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-builtins.cc (arm_quinop_unone_unone_unone_unone_imm_pred_qualifiers): Delete. * config/arm/arm_mve_builtins.def (viwdupq_wb_u, vdwdupq_wb_u) (viwdupq_m_wb_u, vdwdupq_m_wb_u, viwdupq_m_n_u, vdwdupq_m_n_u) (vdwdupq_n_u, viwdupq_n_u): Delete. * config/arm/mve.md (mve_vdwdupq_n_u<mode>): Delete. (mve_vdwdupq_wb_u<mode>): Delete. (mve_vdwdupq_m_n_u<mode>): Delete. (mve_vdwdupq_m_wb_u<mode>): Delete.
2024-10-18arm: [MVE intrinsics] update v[id]wdup testsChristophe Lyon18-54/+54
Testing v[id]wdup overloads with '1' as argument for uint32_t* does not make sense: this patch adds a new 'uint32_t *a' parameter to foo2 in such tests. The difference with v[id]dup tests (where we removed 'foo2') is that in 'foo1' we test the overload with a variable 'wrap' parameter (b) and we need foo2 to test the overload with an immediate (1). 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ * gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c: Use pointer parameter in foo2. * gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c: Likewise.
2024-10-18arm: [MVE intrinsics] rework vdwdup viwdupChristophe Lyon5-737/+53
Implement vdwdup and viwdup using the new MVE builtins framework. In order to share more code with viddup_impl, the patch swaps operands 1 and 2 in @mve_v[id]wdupq_m_wb_u<mode>_insn, so that the parameter order is similar to what @mve_v[id]dupq_m_wb_u<mode>_insn uses. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (viddup_impl): Add support for wrapping versions. (vdwdupq): New. (viwdupq): New. * config/arm/arm-mve-builtins-base.def (vdwdupq): New. (viwdupq): New. * config/arm/arm-mve-builtins-base.h (vdwdupq): New. (viwdupq): New. * config/arm/arm_mve.h (vdwdupq_m): Delete. (vdwdupq_u8): Delete. (vdwdupq_u32): Delete. (vdwdupq_u16): Delete. (viwdupq_m): Delete. (viwdupq_u8): Delete. (viwdupq_u32): Delete. (viwdupq_u16): Delete. (vdwdupq_x_u8): Delete. (vdwdupq_x_u16): Delete. (vdwdupq_x_u32): Delete. (viwdupq_x_u8): Delete. (viwdupq_x_u16): Delete. (viwdupq_x_u32): Delete. (vdwdupq_m_n_u8): Delete. (vdwdupq_m_n_u32): Delete. (vdwdupq_m_n_u16): Delete. (vdwdupq_m_wb_u8): Delete. (vdwdupq_m_wb_u32): Delete. (vdwdupq_m_wb_u16): Delete. (vdwdupq_n_u8): Delete. (vdwdupq_n_u32): Delete. (vdwdupq_n_u16): Delete. (vdwdupq_wb_u8): Delete. (vdwdupq_wb_u32): Delete. (vdwdupq_wb_u16): Delete. (viwdupq_m_n_u8): Delete. (viwdupq_m_n_u32): Delete. (viwdupq_m_n_u16): Delete. (viwdupq_m_wb_u8): Delete. (viwdupq_m_wb_u32): Delete. (viwdupq_m_wb_u16): Delete. (viwdupq_n_u8): Delete. (viwdupq_n_u32): Delete. (viwdupq_n_u16): Delete. (viwdupq_wb_u8): Delete. (viwdupq_wb_u32): Delete. (viwdupq_wb_u16): Delete. (vdwdupq_x_n_u8): Delete. (vdwdupq_x_n_u16): Delete. (vdwdupq_x_n_u32): Delete. (vdwdupq_x_wb_u8): Delete. (vdwdupq_x_wb_u16): Delete. (vdwdupq_x_wb_u32): Delete. (viwdupq_x_n_u8): Delete. (viwdupq_x_n_u16): Delete. (viwdupq_x_n_u32): Delete. (viwdupq_x_wb_u8): Delete. (viwdupq_x_wb_u16): Delete. (viwdupq_x_wb_u32): Delete. (__arm_vdwdupq_m_n_u8): Delete. (__arm_vdwdupq_m_n_u32): Delete. (__arm_vdwdupq_m_n_u16): Delete. 
(__arm_vdwdupq_m_wb_u8): Delete. (__arm_vdwdupq_m_wb_u32): Delete. (__arm_vdwdupq_m_wb_u16): Delete. (__arm_vdwdupq_n_u8): Delete. (__arm_vdwdupq_n_u32): Delete. (__arm_vdwdupq_n_u16): Delete. (__arm_vdwdupq_wb_u8): Delete. (__arm_vdwdupq_wb_u32): Delete. (__arm_vdwdupq_wb_u16): Delete. (__arm_viwdupq_m_n_u8): Delete. (__arm_viwdupq_m_n_u32): Delete. (__arm_viwdupq_m_n_u16): Delete. (__arm_viwdupq_m_wb_u8): Delete. (__arm_viwdupq_m_wb_u32): Delete. (__arm_viwdupq_m_wb_u16): Delete. (__arm_viwdupq_n_u8): Delete. (__arm_viwdupq_n_u32): Delete. (__arm_viwdupq_n_u16): Delete. (__arm_viwdupq_wb_u8): Delete. (__arm_viwdupq_wb_u32): Delete. (__arm_viwdupq_wb_u16): Delete. (__arm_vdwdupq_x_n_u8): Delete. (__arm_vdwdupq_x_n_u16): Delete. (__arm_vdwdupq_x_n_u32): Delete. (__arm_vdwdupq_x_wb_u8): Delete. (__arm_vdwdupq_x_wb_u16): Delete. (__arm_vdwdupq_x_wb_u32): Delete. (__arm_viwdupq_x_n_u8): Delete. (__arm_viwdupq_x_n_u16): Delete. (__arm_viwdupq_x_n_u32): Delete. (__arm_viwdupq_x_wb_u8): Delete. (__arm_viwdupq_x_wb_u16): Delete. (__arm_viwdupq_x_wb_u32): Delete. (__arm_vdwdupq_m): Delete. (__arm_vdwdupq_u8): Delete. (__arm_vdwdupq_u32): Delete. (__arm_vdwdupq_u16): Delete. (__arm_viwdupq_m): Delete. (__arm_viwdupq_u8): Delete. (__arm_viwdupq_u32): Delete. (__arm_viwdupq_u16): Delete. (__arm_vdwdupq_x_u8): Delete. (__arm_vdwdupq_x_u16): Delete. (__arm_vdwdupq_x_u32): Delete. (__arm_viwdupq_x_u8): Delete. (__arm_viwdupq_x_u16): Delete. (__arm_viwdupq_x_u32): Delete. * config/arm/mve.md (@mve_<mve_insn>q_m_wb_u<mode>_insn): Swap operands 1 and 2.
2024-10-18arm: [MVE intrinsics] add vidwdup shapeChristophe Lyon2-0/+89
This patch adds the vidwdup shape description for vdwdup and viwdup. It is very similar to viddup, but accounts for the additional 'wrap' scalar parameter. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (vidwdup): New. * config/arm/arm-mve-builtins-shapes.h (vidwdup): New.
2024-10-18arm: [MVE intrinsics] factorize vdwdup viwdupChristophe Lyon2-55/+17
Factorize vdwdup and viwdup so that they use the same parameterized names. Like with vddup and vidup, we do not bother with the corresponding expanders, as we stop using them in a subsequent patch. The patch also adds the missing attributes to vdwdupq_wb_u_insn and viwdupq_wb_u_insn patterns. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (mve_insn): Add VIWDUPQ, VDWDUPQ, VIWDUPQ_M, VDWDUPQ_M. (VIDWDUPQ): New iterator. (VIDWDUPQ_M): New iterator. * config/arm/mve.md (mve_vdwdupq_wb_u<mode>_insn) (mve_viwdupq_wb_u<mode>_insn): Merge into ... (@mve_<mve_insn>q_wb_u<mode>_insn): ... this. Add missing mve_unpredicated_insn and mve_move attributes. (mve_vdwdupq_m_wb_u<mode>_insn, mve_viwdupq_m_wb_u<mode>_insn): Merge into ... (@mve_<mve_insn>q_m_wb_u<mode>_insn): ... this.
2024-10-18arm: [MVE intrinsics] fix checks of immediate argumentsChristophe Lyon1-16/+31
As discussed in [1], it is better to use "su64" for immediates in intrinsics signatures in order to provide better diagnostics (erroneous constants are not truncated for instance). This patch thus uses su64 instead of ss32 in binary_lshift_unsigned, binary_rshift_narrow, binary_rshift_narrow_unsigned, ternary_lshift, ternary_rshift. In addition, we fix cases where we called require_integer_immediate whereas we just want to check that the argument is a scalar, and thus use require_scalar_type in binary_acca_int32, binary_acca_int64, unary_int32_acc. Finally, in binary_lshift_unsigned we just want to check that 'imm' is an immediate, not the optional predicates. [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660262.html 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (binary_acca_int32): Fix check of scalar argument. (binary_acca_int64): Likewise. (binary_lshift_unsigned): Likewise. (binary_rshift_narrow): Likewise. (binary_rshift_narrow_unsigned): Likewise. (ternary_lshift): Likewise. (ternary_rshift): Likewise. (unary_int32_acc): Likewise.
2024-10-18arm: [MVE intrinsics] remove v[id]dup expandersChristophe Lyon2-77/+0
We use code_for_mve_q_u_insn, rather than the expanders used by the previous implementation, so we can remove the expanders and their declaration as builtins. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm_mve_builtins.def (vddupq_n_u, vidupq_n_u) (vddupq_m_n_u, vidupq_m_n_u): Delete. * config/arm/mve.md (mve_vidupq_n_u<mode>, mve_vidupq_m_n_u<mode>) (mve_vddupq_n_u<mode>, mve_vddupq_m_n_u<mode>): Delete.
2024-10-18arm: [MVE intrinsics] update v[id]dup testsChristophe Lyon18-282/+18
Testing v[id]dup overloads with '1' as argument for uint32_t* does not make sense: instead of choosing the '_wb' overload, we choose the '_n', but we already do that in the '_n' tests. This patch removes all such bogus foo2 functions. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ * gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_x_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_x_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_x_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_x_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_x_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_x_wb_u8.c: Remove foo2.
2024-10-18arm: [MVE intrinsics] rework vddup vidupChristophe Lyon4-676/+116
Implement vddup and vidup using the new MVE builtins framework. We generate better code because we take advantage of the two outputs produced by the v[id]dup instructions. For instance, before: ldr r3, [r0] sub r2, r3, #8 str r2, [r0] mov r2, r3 vddup.u16 q3, r2, #1 now: ldr r2, [r0] vddup.u16 q3, r2, #1 str r2, [r0] 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (class viddup_impl): New. (vddup): New. (vidup): New. * config/arm/arm-mve-builtins-base.def (vddupq): New. (vidupq): New. * config/arm/arm-mve-builtins-base.h (vddupq): New. (vidupq): New. * config/arm/arm_mve.h (vddupq_m): Delete. (vddupq_u8): Delete. (vddupq_u32): Delete. (vddupq_u16): Delete. (vidupq_m): Delete. (vidupq_u8): Delete. (vidupq_u32): Delete. (vidupq_u16): Delete. (vddupq_x_u8): Delete. (vddupq_x_u16): Delete. (vddupq_x_u32): Delete. (vidupq_x_u8): Delete. (vidupq_x_u16): Delete. (vidupq_x_u32): Delete. (vddupq_m_n_u8): Delete. (vddupq_m_n_u32): Delete. (vddupq_m_n_u16): Delete. (vddupq_m_wb_u8): Delete. (vddupq_m_wb_u16): Delete. (vddupq_m_wb_u32): Delete. (vddupq_n_u8): Delete. (vddupq_n_u32): Delete. (vddupq_n_u16): Delete. (vddupq_wb_u8): Delete. (vddupq_wb_u16): Delete. (vddupq_wb_u32): Delete. (vidupq_m_n_u8): Delete. (vidupq_m_n_u32): Delete. (vidupq_m_n_u16): Delete. (vidupq_m_wb_u8): Delete. (vidupq_m_wb_u16): Delete. (vidupq_m_wb_u32): Delete. (vidupq_n_u8): Delete. (vidupq_n_u32): Delete. (vidupq_n_u16): Delete. (vidupq_wb_u8): Delete. (vidupq_wb_u16): Delete. (vidupq_wb_u32): Delete. (vddupq_x_n_u8): Delete. (vddupq_x_n_u16): Delete. (vddupq_x_n_u32): Delete. (vddupq_x_wb_u8): Delete. (vddupq_x_wb_u16): Delete. (vddupq_x_wb_u32): Delete. (vidupq_x_n_u8): Delete. (vidupq_x_n_u16): Delete. (vidupq_x_n_u32): Delete. (vidupq_x_wb_u8): Delete. (vidupq_x_wb_u16): Delete. (vidupq_x_wb_u32): Delete. (__arm_vddupq_m_n_u8): Delete. (__arm_vddupq_m_n_u32): Delete. (__arm_vddupq_m_n_u16): Delete. (__arm_vddupq_m_wb_u8): Delete. 
(__arm_vddupq_m_wb_u16): Delete. (__arm_vddupq_m_wb_u32): Delete. (__arm_vddupq_n_u8): Delete. (__arm_vddupq_n_u32): Delete. (__arm_vddupq_n_u16): Delete. (__arm_vidupq_m_n_u8): Delete. (__arm_vidupq_m_n_u32): Delete. (__arm_vidupq_m_n_u16): Delete. (__arm_vidupq_n_u8): Delete. (__arm_vidupq_m_wb_u8): Delete. (__arm_vidupq_m_wb_u16): Delete. (__arm_vidupq_m_wb_u32): Delete. (__arm_vidupq_n_u32): Delete. (__arm_vidupq_n_u16): Delete. (__arm_vidupq_wb_u8): Delete. (__arm_vidupq_wb_u16): Delete. (__arm_vidupq_wb_u32): Delete. (__arm_vddupq_wb_u8): Delete. (__arm_vddupq_wb_u16): Delete. (__arm_vddupq_wb_u32): Delete. (__arm_vddupq_x_n_u8): Delete. (__arm_vddupq_x_n_u16): Delete. (__arm_vddupq_x_n_u32): Delete. (__arm_vddupq_x_wb_u8): Delete. (__arm_vddupq_x_wb_u16): Delete. (__arm_vddupq_x_wb_u32): Delete. (__arm_vidupq_x_n_u8): Delete. (__arm_vidupq_x_n_u16): Delete. (__arm_vidupq_x_n_u32): Delete. (__arm_vidupq_x_wb_u8): Delete. (__arm_vidupq_x_wb_u16): Delete. (__arm_vidupq_x_wb_u32): Delete. (__arm_vddupq_m): Delete. (__arm_vddupq_u8): Delete. (__arm_vddupq_u32): Delete. (__arm_vddupq_u16): Delete. (__arm_vidupq_m): Delete. (__arm_vidupq_u8): Delete. (__arm_vidupq_u32): Delete. (__arm_vidupq_u16): Delete. (__arm_vddupq_x_u8): Delete. (__arm_vddupq_x_u16): Delete. (__arm_vddupq_x_u32): Delete. (__arm_vidupq_x_u8): Delete. (__arm_vidupq_x_u16): Delete. (__arm_vidupq_x_u32): Delete.
2024-10-18arm: [MVE intrinsics] add viddup shapeChristophe Lyon5-0/+133
This patch adds the viddup shape description for vidup and vddup. This requires the addition of report_not_one_of and function_checker::require_immediate_one_of to gcc/config/arm/arm-mve-builtins.cc (they are copies of the aarch64 SVE counterparts). This patch also introduces MODE_wb. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (viddup): New. * config/arm/arm-mve-builtins-shapes.h (viddup): New. * config/arm/arm-mve-builtins.cc (report_not_one_of): New. (function_checker::require_immediate_one_of): New. * config/arm/arm-mve-builtins.def (wb): New mode. * config/arm/arm-mve-builtins.h (function_checker): Add require_immediate_one_of.
2024-10-18arm: [MVE intrinsics] factorize vddup vidupChristophe Lyon2-45/+20
Factorize vddup and vidup so that they use the same parameterized names. This patch updates only the (define_insn "@mve_<mve_insn>q_u<mode>_insn") patterns and does not bother with the (define_expand "mve_vidupq_n_u<mode>") ones, because a subsequent patch avoids using them. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (mve_insn): Add VIDUPQ, VDDUPQ, VIDUPQ_M, VDDUPQ_M. (viddupq_op): New. (viddupq_m_op): New. (VIDDUPQ): New. (VIDDUPQ_M): New. * config/arm/mve.md (mve_vddupq_u<mode>_insn) (mve_vidupq_u<mode>_insn): Merge into ... (mve_<mve_insn>q_u<mode>_insn): ... this. (mve_vddupq_m_wb_u<mode>_insn, mve_vidupq_m_wb_u<mode>_insn): Merge into ... (mve_<mve_insn>q_m_wb_u<mode>_insn): ... this.
2024-10-18arm: [MVE intrinsics] rework vctpChristophe Lyon8-66/+79
Implement vctp using the new MVE builtins framework. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (class vctpq_impl): New. (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-builtins-base.def (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-builtins-base.h (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-builtins-shapes.cc (vctp): New. * config/arm/arm-mve-builtins-shapes.h (vctp): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Add support for vctp. * config/arm/arm_mve.h (vctp16q): Delete. (vctp32q): Delete. (vctp64q): Delete. (vctp8q): Delete. (vctp8q_m): Delete. (vctp64q_m): Delete. (vctp32q_m): Delete. (vctp16q_m): Delete. (__arm_vctp16q): Delete. (__arm_vctp32q): Delete. (__arm_vctp64q): Delete. (__arm_vctp8q): Delete. (__arm_vctp8q_m): Delete. (__arm_vctp64q_m): Delete. (__arm_vctp32q_m): Delete. (__arm_vctp16q_m): Delete. * config/arm/mve.md (mve_vctp<MVE_vctp>q<MVE_vpred>): Add '@' prefix. (mve_vctp<MVE_vctp>q_m<MVE_vpred>): Likewise.
2024-10-18arm: [MVE intrinsics] rework vornChristophe Lyon5-431/+57
Implement vorn using the new MVE builtins framework. 2024-07-11 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (vornq): New. * config/arm/arm-mve-builtins-base.def (vornq): New. * config/arm/arm-mve-builtins-base.h (vornq): New. * config/arm/arm-mve-builtins-functions.h (class unspec_based_mve_function_exact_insn_vorn): New. * config/arm/arm_mve.h (vornq): Delete. (vornq_m): Delete. (vornq_x): Delete. (vornq_u8): Delete. (vornq_s8): Delete. (vornq_u16): Delete. (vornq_s16): Delete. (vornq_u32): Delete. (vornq_s32): Delete. (vornq_f16): Delete. (vornq_f32): Delete. (vornq_m_s8): Delete. (vornq_m_s32): Delete. (vornq_m_s16): Delete. (vornq_m_u8): Delete. (vornq_m_u32): Delete. (vornq_m_u16): Delete. (vornq_m_f32): Delete. (vornq_m_f16): Delete. (vornq_x_s8): Delete. (vornq_x_s16): Delete. (vornq_x_s32): Delete. (vornq_x_u8): Delete. (vornq_x_u16): Delete. (vornq_x_u32): Delete. (vornq_x_f16): Delete. (vornq_x_f32): Delete. (__arm_vornq_u8): Delete. (__arm_vornq_s8): Delete. (__arm_vornq_u16): Delete. (__arm_vornq_s16): Delete. (__arm_vornq_u32): Delete. (__arm_vornq_s32): Delete. (__arm_vornq_m_s8): Delete. (__arm_vornq_m_s32): Delete. (__arm_vornq_m_s16): Delete. (__arm_vornq_m_u8): Delete. (__arm_vornq_m_u32): Delete. (__arm_vornq_m_u16): Delete. (__arm_vornq_x_s8): Delete. (__arm_vornq_x_s16): Delete. (__arm_vornq_x_s32): Delete. (__arm_vornq_x_u8): Delete. (__arm_vornq_x_u16): Delete. (__arm_vornq_x_u32): Delete. (__arm_vornq_f16): Delete. (__arm_vornq_f32): Delete. (__arm_vornq_m_f32): Delete. (__arm_vornq_m_f16): Delete. (__arm_vornq_x_f16): Delete. (__arm_vornq_x_f32): Delete. (__arm_vornq): Delete. (__arm_vornq_m): Delete. (__arm_vornq_x): Delete.
2024-10-18arm: [MVE intrinsics] factorize vornChristophe Lyon2-41/+10
Factorize vorn so that it uses parameterized names. 2024-07-11 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (MVE_INT_M_BINARY_LOGIC): Add VORNQ_M_S, VORNQ_M_U. (MVE_FP_M_BINARY_LOGIC): Add VORNQ_M_F. (mve_insn): Add VORNQ_M_S, VORNQ_M_U, VORNQ_M_F. * config/arm/mve.md (mve_vornq_s<mode>): Rename into ... (@mve_vornq_s<mode>): ... this. (mve_vornq_u<mode>): Rename into ... (@mve_vornq_u<mode>): ... this. (mve_vornq_f<mode>): Rename into ... (@mve_vornq_f<mode>): ... this. (mve_vornq_m_<supf><mode>): Merge into vand/vbic pattern. (mve_vornq_m_f<mode>): Likewise.
2024-10-18arm: [MVE intrinsics] rework vbicqChristophe Lyon7-577/+62
Implement vbicq using the new MVE builtins framework. 2024-07-11 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (vbicq): New. * config/arm/arm-mve-builtins-base.def (vbicq): New. * config/arm/arm-mve-builtins-base.h (vbicq): New. * config/arm/arm-mve-builtins-functions.h (class unspec_based_mve_function_exact_insn_vbic): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Add support for vbicq. * config/arm/arm_mve.h (vbicq): Delete. (vbicq_m_n): Delete. (vbicq_m): Delete. (vbicq_x): Delete. (vbicq_u8): Delete. (vbicq_s8): Delete. (vbicq_u16): Delete. (vbicq_s16): Delete. (vbicq_u32): Delete. (vbicq_s32): Delete. (vbicq_n_u16): Delete. (vbicq_f16): Delete. (vbicq_n_s16): Delete. (vbicq_n_u32): Delete. (vbicq_f32): Delete. (vbicq_n_s32): Delete. (vbicq_m_n_s16): Delete. (vbicq_m_n_s32): Delete. (vbicq_m_n_u16): Delete. (vbicq_m_n_u32): Delete. (vbicq_m_s8): Delete. (vbicq_m_s32): Delete. (vbicq_m_s16): Delete. (vbicq_m_u8): Delete. (vbicq_m_u32): Delete. (vbicq_m_u16): Delete. (vbicq_m_f32): Delete. (vbicq_m_f16): Delete. (vbicq_x_s8): Delete. (vbicq_x_s16): Delete. (vbicq_x_s32): Delete. (vbicq_x_u8): Delete. (vbicq_x_u16): Delete. (vbicq_x_u32): Delete. (vbicq_x_f16): Delete. (vbicq_x_f32): Delete. (__arm_vbicq_u8): Delete. (__arm_vbicq_s8): Delete. (__arm_vbicq_u16): Delete. (__arm_vbicq_s16): Delete. (__arm_vbicq_u32): Delete. (__arm_vbicq_s32): Delete. (__arm_vbicq_n_u16): Delete. (__arm_vbicq_n_s16): Delete. (__arm_vbicq_n_u32): Delete. (__arm_vbicq_n_s32): Delete. (__arm_vbicq_m_n_s16): Delete. (__arm_vbicq_m_n_s32): Delete. (__arm_vbicq_m_n_u16): Delete. (__arm_vbicq_m_n_u32): Delete. (__arm_vbicq_m_s8): Delete. (__arm_vbicq_m_s32): Delete. (__arm_vbicq_m_s16): Delete. (__arm_vbicq_m_u8): Delete. (__arm_vbicq_m_u32): Delete. (__arm_vbicq_m_u16): Delete. (__arm_vbicq_x_s8): Delete. (__arm_vbicq_x_s16): Delete. (__arm_vbicq_x_s32): Delete. (__arm_vbicq_x_u8): Delete. 
(__arm_vbicq_x_u16): Delete. (__arm_vbicq_x_u32): Delete. (__arm_vbicq_f16): Delete. (__arm_vbicq_f32): Delete. (__arm_vbicq_m_f32): Delete. (__arm_vbicq_m_f16): Delete. (__arm_vbicq_x_f16): Delete. (__arm_vbicq_x_f32): Delete. (__arm_vbicq): Delete. (__arm_vbicq_m_n): Delete. (__arm_vbicq_m): Delete. (__arm_vbicq_x): Delete. * config/arm/mve.md (mve_vbicq_u<mode>): Rename into ... (@mve_vbicq_u<mode>): ... this. (mve_vbicq_s<mode>): Rename into ... (@mve_vbicq_s<mode>): ... this. (mve_vbicq_f<mode>): Rename into ... (@mve_vbicq_f<mode>): ... this.
2024-10-18arm: [MVE intrinsics] rework vcvtaq vcvtmq vcvtnq vcvtpqChristophe Lyon5-533/+21
Implement vcvtaq vcvtmq vcvtnq vcvtpq using the new MVE builtins framework. 2024-07-11 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins-base.def (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins-base.h: (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins.cc (cvtx): New type. * config/arm/arm_mve.h (vcvtaq_m): Delete. (vcvtmq_m): Delete. (vcvtnq_m): Delete. (vcvtpq_m): Delete. (vcvtaq_s16_f16): Delete. (vcvtaq_s32_f32): Delete. (vcvtnq_s16_f16): Delete. (vcvtnq_s32_f32): Delete. (vcvtpq_s16_f16): Delete. (vcvtpq_s32_f32): Delete. (vcvtmq_s16_f16): Delete. (vcvtmq_s32_f32): Delete. (vcvtpq_u16_f16): Delete. (vcvtpq_u32_f32): Delete. (vcvtnq_u16_f16): Delete. (vcvtnq_u32_f32): Delete. (vcvtmq_u16_f16): Delete. (vcvtmq_u32_f32): Delete. (vcvtaq_u16_f16): Delete. (vcvtaq_u32_f32): Delete. (vcvtaq_m_s16_f16): Delete. (vcvtaq_m_u16_f16): Delete. (vcvtaq_m_s32_f32): Delete. (vcvtaq_m_u32_f32): Delete. (vcvtmq_m_s16_f16): Delete. (vcvtnq_m_s16_f16): Delete. (vcvtpq_m_s16_f16): Delete. (vcvtmq_m_u16_f16): Delete. (vcvtnq_m_u16_f16): Delete. (vcvtpq_m_u16_f16): Delete. (vcvtmq_m_s32_f32): Delete. (vcvtnq_m_s32_f32): Delete. (vcvtpq_m_s32_f32): Delete. (vcvtmq_m_u32_f32): Delete. (vcvtnq_m_u32_f32): Delete. (vcvtpq_m_u32_f32): Delete. (vcvtaq_x_s16_f16): Delete. (vcvtaq_x_s32_f32): Delete. (vcvtaq_x_u16_f16): Delete. (vcvtaq_x_u32_f32): Delete. (vcvtnq_x_s16_f16): Delete. (vcvtnq_x_s32_f32): Delete. (vcvtnq_x_u16_f16): Delete. (vcvtnq_x_u32_f32): Delete. (vcvtpq_x_s16_f16): Delete. (vcvtpq_x_s32_f32): Delete. (vcvtpq_x_u16_f16): Delete. (vcvtpq_x_u32_f32): Delete. (vcvtmq_x_s16_f16): Delete. (vcvtmq_x_s32_f32): Delete. (vcvtmq_x_u16_f16): Delete. (vcvtmq_x_u32_f32): Delete. (__arm_vcvtpq_u16_f16): Delete. (__arm_vcvtpq_u32_f32): Delete. (__arm_vcvtnq_u16_f16): Delete. 
(__arm_vcvtnq_u32_f32): Delete. (__arm_vcvtmq_u16_f16): Delete. (__arm_vcvtmq_u32_f32): Delete. (__arm_vcvtaq_u16_f16): Delete. (__arm_vcvtaq_u32_f32): Delete. (__arm_vcvtaq_s16_f16): Delete. (__arm_vcvtaq_s32_f32): Delete. (__arm_vcvtnq_s16_f16): Delete. (__arm_vcvtnq_s32_f32): Delete. (__arm_vcvtpq_s16_f16): Delete. (__arm_vcvtpq_s32_f32): Delete. (__arm_vcvtmq_s16_f16): Delete. (__arm_vcvtmq_s32_f32): Delete. (__arm_vcvtaq_m_s16_f16): Delete. (__arm_vcvtaq_m_u16_f16): Delete. (__arm_vcvtaq_m_s32_f32): Delete. (__arm_vcvtaq_m_u32_f32): Delete. (__arm_vcvtmq_m_s16_f16): Delete. (__arm_vcvtnq_m_s16_f16): Delete. (__arm_vcvtpq_m_s16_f16): Delete. (__arm_vcvtmq_m_u16_f16): Delete. (__arm_vcvtnq_m_u16_f16): Delete. (__arm_vcvtpq_m_u16_f16): Delete. (__arm_vcvtmq_m_s32_f32): Delete. (__arm_vcvtnq_m_s32_f32): Delete. (__arm_vcvtpq_m_s32_f32): Delete. (__arm_vcvtmq_m_u32_f32): Delete. (__arm_vcvtnq_m_u32_f32): Delete. (__arm_vcvtpq_m_u32_f32): Delete. (__arm_vcvtaq_x_s16_f16): Delete. (__arm_vcvtaq_x_s32_f32): Delete. (__arm_vcvtaq_x_u16_f16): Delete. (__arm_vcvtaq_x_u32_f32): Delete. (__arm_vcvtnq_x_s16_f16): Delete. (__arm_vcvtnq_x_s32_f32): Delete. (__arm_vcvtnq_x_u16_f16): Delete. (__arm_vcvtnq_x_u32_f32): Delete. (__arm_vcvtpq_x_s16_f16): Delete. (__arm_vcvtpq_x_s32_f32): Delete. (__arm_vcvtpq_x_u16_f16): Delete. (__arm_vcvtpq_x_u32_f32): Delete. (__arm_vcvtmq_x_s16_f16): Delete. (__arm_vcvtmq_x_s32_f32): Delete. (__arm_vcvtmq_x_u16_f16): Delete. (__arm_vcvtmq_x_u32_f32): Delete. (__arm_vcvtaq_m): Delete. (__arm_vcvtmq_m): Delete. (__arm_vcvtnq_m): Delete. (__arm_vcvtpq_m): Delete.
2024-10-18arm: [MVE intrinsics] add vcvtx shapeChristophe Lyon2-0/+60
This patch adds the vcvtx shape description for vcvtaq, vcvtmq, vcvtnq, vcvtpq. 2024-07-11 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (vcvtx): New. * config/arm/arm-mve-builtins-shapes.h (vcvtx): New.
2024-10-18arm: [MVE intrinsics] factorize vcvtaq vcvtmq vcvtnq vcvtpqChristophe Lyon2-113/+26
Factorize vcvtaq vcvtmq vcvtnq vcvtpq builtins so that they use the same parameterized names. 2024-07-11 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (mve_insn): Add VCVTAQ_M_S, VCVTAQ_M_U, VCVTAQ_S, VCVTAQ_U, VCVTMQ_M_S, VCVTMQ_M_U, VCVTMQ_S, VCVTMQ_U, VCVTNQ_M_S, VCVTNQ_M_U, VCVTNQ_S, VCVTNQ_U, VCVTPQ_M_S, VCVTPQ_M_U, VCVTPQ_S, VCVTPQ_U. (VCVTAQ, VCVTPQ, VCVTNQ, VCVTMQ, VCVTAQ_M, VCVTMQ_M, VCVTNQ_M) (VCVTPQ_M): Delete. (VCVTxQ, VCVTxQ_M): New. * config/arm/mve.md (mve_vcvtpq_<supf><mode>) (mve_vcvtnq_<supf><mode>, mve_vcvtmq_<supf><mode>) (mve_vcvtaq_<supf><mode>): Merge into ... (@mve_<mve_insn>q_<supf><mode>): ... this. (mve_vcvtaq_m_<supf><mode>, mve_vcvtmq_m_<supf><mode>) (mve_vcvtpq_m_<supf><mode>, mve_vcvtnq_m_<supf><mode>): Merge into ... (@mve_<mve_insn>q_m_<supf><mode>): ... this.
2024-10-18arm: [MVE intrinsics] rework vcvtbq_f16_f32 vcvttq_f16_f32 vcvtbq_f32_f16 Christophe Lyon5-146/+74
vcvttq_f32_f16 Implement vcvtbq_f16_f32, vcvttq_f16_f32, vcvtbq_f32_f16 and vcvttq_f32_f16 using the new MVE builtins framework. 2024-07-11 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (class vcvtxq_impl): New. (vcvtbq, vcvttq): New. * config/arm/arm-mve-builtins-base.def (vcvtbq, vcvttq): New. * config/arm/arm-mve-builtins-base.h (vcvtbq, vcvttq): New. * config/arm/arm-mve-builtins.cc (cvt_f16_f32, cvt_f32_f16): New types. (function_instance::has_inactive_argument): Support vcvtbq and vcvttq. * config/arm/arm_mve.h (vcvttq_f32): Delete. (vcvtbq_f32): Delete. (vcvtbq_m): Delete. (vcvttq_m): Delete. (vcvttq_f32_f16): Delete. (vcvtbq_f32_f16): Delete. (vcvttq_f16_f32): Delete. (vcvtbq_f16_f32): Delete. (vcvtbq_m_f16_f32): Delete. (vcvtbq_m_f32_f16): Delete. (vcvttq_m_f16_f32): Delete. (vcvttq_m_f32_f16): Delete. (vcvtbq_x_f32_f16): Delete. (vcvttq_x_f32_f16): Delete. (__arm_vcvttq_f32_f16): Delete. (__arm_vcvtbq_f32_f16): Delete. (__arm_vcvttq_f16_f32): Delete. (__arm_vcvtbq_f16_f32): Delete. (__arm_vcvtbq_m_f16_f32): Delete. (__arm_vcvtbq_m_f32_f16): Delete. (__arm_vcvttq_m_f16_f32): Delete. (__arm_vcvttq_m_f32_f16): Delete. (__arm_vcvtbq_x_f32_f16): Delete. (__arm_vcvttq_x_f32_f16): Delete. (__arm_vcvttq_f32): Delete. (__arm_vcvtbq_f32): Delete. (__arm_vcvtbq_m): Delete. (__arm_vcvttq_m): Delete.