Age | Commit message | Author | Files | Lines
2023-09-01 | libstdc++: Add -Wno-self-move to two filesystem tests | Jonathan Wakely | 2 | -0/+2
libstdc++-v3/ChangeLog: * testsuite/27_io/filesystem/iterators/91067.cc: Add -Wno-self-move to options. * testsuite/27_io/filesystem/path/assign/copy.cc: Likewise.
2023-09-01 | c++: Move new test to 'opt' sub-directory | Jonathan Wakely | 1 | -0/+0
gcc/testsuite/ChangeLog: * g++.dg/pr110879.C: Moved to... * g++.dg/opt/pr110879.C: ...here.
2023-09-01 | libstdc++: fix memory clobbering in std::vector [PR110879] | Vladimir Palevich | 2 | -95/+121
Fix ordering to prevent clobbering of class members by a call to deallocate in _M_realloc_insert and _M_default_append. Because of recent changes in _M_realloc_insert and _M_default_append, calls to deallocate were ordered after assignment to class members of std::vector (in the guard destructor), which caused said members to be call-clobbered. This prevents further optimization: the compiler is unable to move a memory read out of a hot loop in this case. This patch reorders the call to come before the assignments by putting the guard in its own block. It also adds a new test for this case. I'm not very happy with the new test, but I don't know how to properly test this. PR libstdc++/110879 libstdc++-v3/ChangeLog: * include/bits/vector.tcc (_M_realloc_insert): End guard lifetime just before assignment to class members. (_M_default_append): Likewise. gcc/testsuite/ChangeLog: * g++.dg/pr110879.C: New test. Signed-off-by: Vladimir Palevich <palevichva@gmail.com>
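The reordering described above can be sketched with a toy container. This is only an illustration under stated assumptions: `TinyVec`, `Guard`, `grow_and_push` and the `events` log are hypothetical names, not the libstdc++ code. The sketch shows a guard whose lifetime ends in an inner block, so the deallocation runs before the members are assigned.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Records the order of events so that the ordering can be inspected.
static std::vector<std::string> events;

// Hypothetical stand-in for an RAII guard that frees storage on scope exit.
struct Guard {
    int* storage;
    explicit Guard(int* p) : storage(p) {}
    ~Guard() {
        events.push_back("deallocate");
        delete[] storage;
    }
};

// Hypothetical miniature container, not std::vector itself.
struct TinyVec {
    int* start = nullptr;
    std::size_t size = 0;
    std::size_t capacity = 0;

    void grow_and_push(int value) {
        const std::size_t new_cap = capacity ? capacity * 2 : 1;
        assert(size < new_cap);
        int* new_start = new int[new_cap];
        {
            // The guard first owns the new storage, so it would be freed
            // if copying threw an exception.
            Guard guard(new_start);
            for (std::size_t i = 0; i < size; ++i) new_start[i] = start[i];
            new_start[size] = value;
            // Success: hand the old storage to the guard instead.
            guard.storage = start;
        }   // guard destroyed here, so the deallocate call happens first
        events.push_back("assign members");
        start = new_start;
        ++size;
        capacity = new_cap;
    }

    ~TinyVec() { delete[] start; }
};
```

Because the member stores come after the call that frees memory, nothing the optimizer must treat as clobbered by that call is written beforehand, which is the property the commit message is after.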
2023-09-01 | libstdc++: Use std::string::__resize_and_overwrite in std::filesystem | Jonathan Wakely | 2 | -25/+27
There are a few places in the std::filesystem code that use a string as a buffer for OS APIs to write to. We can use the new extension __resize_and_overwrite to avoid redundant initialization of those buffers. libstdc++-v3/ChangeLog: * src/c++17/fs_ops.cc (fs::absolute) [FILESYSTEM_IS_WINDOWS]: Use __resize_and_overwrite to fill buffer. (fs::read_symlink) [HAVE_READLINK]: Likewise. * src/filesystem/ops-common.h (get_temp_directory_from_env) [FILESYSTEM_IS_WINDOWS]: Likewise.
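As a rough illustration of the buffer-filling idea, the sketch below uses the standard C++23 `std::string::resize_and_overwrite` when it is available and falls back to a plain `resize` otherwise. The internal `__resize_and_overwrite` extension is not called here, and `fake_os_write` and `read_into_string` are hypothetical names standing in for an OS API and its caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for an OS API that writes up to `cap` bytes into
// `buf` and returns the number of bytes written.
inline std::size_t fake_os_write(char* buf, std::size_t cap) {
    static const char text[] = "/tmp/example";
    std::size_t n = sizeof(text) - 1;
    if (n > cap) n = cap;
    assert(n <= cap);
    std::memcpy(buf, text, n);
    return n;
}

// Fills a string of capacity `cap` directly from the "OS" call.
inline std::string read_into_string(std::size_t cap) {
    std::string s;
#if defined(__cpp_lib_string_resize_and_overwrite)
    // C++23: the buffer is not value-initialized before the callback runs.
    s.resize_and_overwrite(cap, [](char* p, std::size_t n) {
        return fake_os_write(p, n);
    });
#else
    // Fallback: resize zero-fills first, which is the redundant work the
    // commit message wants to avoid.
    s.resize(cap);
    s.resize(fake_os_write(&s[0], cap));
#endif
    return s;
}
```

Both branches produce the same string; only the amount of initialization work differs.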
2023-09-01 | libstdc++: Use a loop in atomic_ref::compare_exchange_strong [PR111077] | Jonathan Wakely | 2 | -72/+150
We need to use a loop in std::atomic_ref::compare_exchange_strong in order to properly implement the C++20 requirement that padding bits do not participate when checking the value for equality. The variable being modified by a std::atomic_ref might have an initial value with non-zero padding bits, so when the __atomic_compare_exchange built-in returns false we need to check whether that was only because of non-equal padding bits that are not part of the value representation. If the value bits differ, it's just a failed compare-exchange. If the value bits are the same, we need to retry the __atomic_compare_exchange using the value that was just read by the previous failed call. As noted in the comments, it's possible for that second try to also fail due to another thread storing the same value but with differences in padding. Because it's undefined to access a variable directly while it's held by a std::atomic_ref, and because std::atomic_ref will only ever store values with zeroed padding, we know that padding bits will never go from zero to non-zero during the lifetime of a std::atomic_ref. They can only go from an initial non-zero state to zero. This means the loop will terminate, rather than looping indefinitely as padding bits flicker on and off. In theory users could call __atomic_store etc. directly and write a value with non-zero padding bits, but we don't need to support that. Users doing that should ensure they do not write non-zero padding, to be compatible with our std::atomic_ref's invariants. This isn't a problem for std::atomic<T>::compare_exchange_strong because the initial value (and all later stores to the variable) are performed by the library, so we ensure that stored values always have padding bits cleared. That means we can simply clear the padding bits of the 'expected' value and we will be comparing two values with equal padding bits.
This means we don't need the loop for std::atomic, so update the __atomic_impl::__compare_exchange function to take a bool parameter that says whether it's being used by std::atomic_ref. If not, we can use a simpler, non-looping implementation. libstdc++-v3/ChangeLog: PR libstdc++/111077 * include/bits/atomic_base.h (__atomic_impl::__compare_exchange): Add _AtomicRef non-type template parameter and use a loop if it is true. (__atomic_impl::compare_exchange_weak): Add _AtomicRef NTTP. (__atomic_impl::compare_exchange_strong): Likewise. (atomic_ref::compare_exchange_weak): Use true for NTTP. (atomic_ref::compare_exchange_strong): Use true for NTTP. * testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc: Fix test to not rely on atomic_ref::load() to return an object with padding preserved.
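The retry loop described in this message can be sketched on a plain integer. This is a simulation, not the libstdc++ implementation: it assumes the low 8 bits of a `std::uint32_t` are the value representation and treats the upper 24 bits as padding, and `value_mask` and `cas_ignoring_padding` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumption for this sketch: the low 8 bits are value bits and the high
// 24 bits play the role of padding.
constexpr std::uint32_t value_mask = 0xFFu;

// Returns true if the value bits matched `expected` and `desired` was stored.
// On failure, `expected` is updated with the observed value bits.
inline bool cas_ignoring_padding(std::atomic<std::uint32_t>& obj,
                                 std::uint32_t& expected,
                                 std::uint32_t desired) {
    const std::uint32_t want = expected & value_mask;
    const std::uint32_t store = desired & value_mask;  // padding always cleared
    std::uint32_t observed = want;                     // first try: zero padding
    while (!obj.compare_exchange_strong(observed, store)) {
        if ((observed & value_mask) != want) {
            expected = observed & value_mask;          // genuine mismatch
            return false;
        }
        // Only the padding differed: retry with the bits that were just read.
    }
    return true;
}
```

Since every store writes cleared padding, a retry can only be forced by padding that was non-zero to begin with, which is why the loop is expected to terminate.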
2023-09-01 | c++: Fix up mangling of function/block scope static structured bindings [PR111069] | Jakub Jelinek | 11 | -167/+361
As can be seen on the testcase, we weren't correctly mangling static/thread_local structured bindings (C++20 feature) at function/block scope. The following patch fixes that by using what write_local_name does for those cases (note, structured binding mangling doesn't use the standard path because it needs to pass a list of all the identifiers in the structured binding to the mangling). In addition to that it fixes mangling of various helpers which use write_guarded_name (_ZGV*, _ZTH*, _ZTW*) and kills find_decomp_unqualified_name which for the local names would be too hard to implement and uses write_guarded_name for structured binding related _ZGR* names as well. All the mangled names on the first testcase now match clang++ and my expectations. Because the old mangled names were plain wrong (they mangled the same as structured binding at global scope and resulted in assembly errors if there was more than one static structured binding with the same identifiers in the same (or another) function), I think we don't need to play with another mangling ABI level which turns on/off the old broken way. In addition to that the patch starts to emit abi-tags into the mangle_decomp produced names when needed and emits a -Wabi warning for that as well. To make that work, I had to move cp_maybe_mangle_decomp calls from before cp_finish_decl into the middle of cp_finish_decl after type is deduced and maybe_commonize_var (which also had to be changed not to ignore structured bindings) is called but before anything might need a mangled name for the decl, so a new cp_decomp structure is passed to cp_finish_decl; various other structured binding related functions have been changed to pass pointer to that around instead of passing a tree and unsigned int separately. On decomp9.C, there is a _ZZ3barI1TB3quxEivEDC1o1pEB3qux (g++) vs.
_ZZ3barI1TB3quxEivEDC1o1pE (clang++) mangling difference, but that seems to be a clang++ bug and happens also with normal static block vars, doesn't need structured bindings. 2023-09-01 Jakub Jelinek <jakub@redhat.com> PR c++/111069 gcc/ * common.opt (fabi-version=): Document version 19. * doc/invoke.texi (-fabi-version=): Likewise. gcc/c-family/ * c-opts.cc (c_common_post_options): Change latest_abi_version to 19. gcc/cp/ * cp-tree.h (determine_local_discriminator): Add NAME argument with NULL_TREE default. (struct cp_decomp): New type. (cp_finish_decl): Add DECOMP argument defaulted to nullptr. (cp_maybe_mangle_decomp): Remove declaration. (cp_finish_decomp): Add cp_decomp * argument, remove tree and unsigned args. (cp_convert_range_for): Likewise. * decl.cc (determine_local_discriminator): Add NAME argument, use it if non-NULL, otherwise compute it the old way. (maybe_commonize_var): Don't return early for structured bindings. (cp_finish_decl): Add DECOMP argument, if non-NULL, call cp_maybe_mangle_decomp. (cp_maybe_mangle_decomp): Make it static with a forward declaration. Call determine_local_discriminator. Replace FIRST and COUNT arguments with DECOMP argument. (cp_finish_decomp): Replace FIRST and COUNT arguments with DECOMP argument. * mangle.cc (find_decomp_unqualified_name): Remove. (write_unqualified_name): Don't call find_decomp_unqualified_name. (mangle_decomp): Handle mangling of static function/block scope structured bindings. Don't call decl_mangling_context twice. Call check_abi_tags, call write_abi_tags for abi version >= 19 and emit -Wabi warnings if needed. (write_guarded_var_name): Handle structured bindings. (mangle_ref_init_variable): Use write_guarded_var_name. * parser.cc (cp_parser_range_for): Adjust do_range_for_auto_deduction and cp_convert_range_for callers. (do_range_for_auto_deduction): Replace DECOMP_FIRST_NAME and DECOMP_CNT arguments with DECOMP. Adjust cp_finish_decomp caller. 
(cp_convert_range_for): Replace DECOMP_FIRST_NAME and DECOMP_CNT arguments with DECOMP. Don't call cp_maybe_mangle_decomp, adjust cp_finish_decl and cp_finish_decomp callers. (cp_parser_decomposition_declaration): Don't call cp_maybe_mangle_decomp, adjust cp_finish_decl and cp_finish_decomp callers. (cp_convert_omp_range_for): Adjust do_range_for_auto_deduction and cp_finish_decomp callers. (cp_finish_omp_range_for): Don't call cp_maybe_mangle_decomp, adjust cp_finish_decl and cp_finish_decomp callers. * pt.cc (tsubst_omp_for_iterator): Adjust tsubst_decomp_names caller. (tsubst_decomp_names): Replace FIRST and CNT arguments with DECOMP. (tsubst_expr): Don't call cp_maybe_mangle_decomp, adjust tsubst_decomp_names, cp_finish_decl, cp_finish_decomp and cp_convert_range_for callers. gcc/testsuite/ * g++.dg/cpp2a/decomp8.C: New test. * g++.dg/cpp2a/decomp9.C: New test. * g++.dg/abi/macro0.C: Expect __GXX_ABI_VERSION 1019 rather than 1018.
2023-09-01 | testsuite: Fix vectcond-1.C FAIL on i686-linux [PR19832] | Jakub Jelinek | 1 | -1/+1
This test FAILs on i686-linux with .../gcc/testsuite/g++.dg/opt/vectcond-1.C:8:57: warning: MMX vector return without MMX enabled changes the ABI [-Wpsabi] .../gcc/testsuite/g++.dg/opt/vectcond-1.C:17:12: warning: MMX vector argument without MMX enabled changes the ABI [-Wpsabi] excess warning. Fixed by using -Wno-psabi. 2023-09-01 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/19832 * g++.dg/opt/vectcond-1.C: Add -Wno-psabi to dg-options.
2023-09-01 | testsuite: Fix up pr110915* tests on i686-linux [PR110915] | Jakub Jelinek | 12 | -24/+36
These tests FAIL on i686-linux, with .../gcc/testsuite/gcc.dg/pr110915-1.c:8:1: warning: MMX vector return without MMX enabled changes the ABI [-Wpsabi] .../gcc/testsuite/gcc.dg/pr110915-1.c:7:15: warning: MMX vector argument without MMX enabled changes the ABI [-Wpsabi] excess warnings. I've added -Wno-psabi to quiet that up, plus I think it is undesirable to define macros like vector before including C library headers in case the header would use that identifier in non-obfuscated form somewhere. 2023-09-01 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/110915 * gcc.dg/pr110915-1.c: Add -Wno-psabi to dg-options. Move vector macro definition after limits.h inclusion. * gcc.dg/pr110915-2.c: Likewise. * gcc.dg/pr110915-3.c: Likewise. * gcc.dg/pr110915-4.c: Likewise. * gcc.dg/pr110915-5.c: Likewise. * gcc.dg/pr110915-6.c: Likewise. * gcc.dg/pr110915-7.c: Likewise. * gcc.dg/pr110915-8.c: Likewise. * gcc.dg/pr110915-9.c: Likewise. * gcc.dg/pr110915-10.c: Likewise. * gcc.dg/pr110915-11.c: Likewise. * gcc.dg/pr110915-12.c: Likewise.
2023-09-01 | RISC-V: Add conditional autovec convert(INT<->FP) patterns | Lehua Ding | 19 | -13/+598
gcc/ChangeLog: * config/riscv/autovec-opt.md (*cond_<optab><mode><vconvert>): New combine pattern. (*cond_<float_cvt><vconvert><mode>): Ditto. (*cond_<optab><vnconvert><mode>): Ditto. (*cond_<float_cvt><vnconvert><mode>): Ditto. (*cond_<optab><mode><vnconvert>): Ditto. (*cond_<float_cvt><mode><vnconvert>2): Ditto. * config/riscv/autovec.md (<optab><mode><vconvert>2): Adjust. (<float_cvt><vconvert><mode>2): Adjust. (<optab><vnconvert><mode>2): Adjust. (<float_cvt><vnconvert><mode>2): Adjust. (<optab><mode><vnconvert>2): Adjust. (<float_cvt><mode><vnconvert>2): Adjust. * config/riscv/riscv-v.cc (needs_fp_rounding): Add INT->FP extend. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-1.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-2.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_run-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int_run-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-1.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-2.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float_run-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float_run-2.c: New test.
2023-09-01 | RISC-V: Add conditional autovec convert(FP<->FP) patterns | Lehua Ding | 11 | -11/+207
gcc/ChangeLog: * config/riscv/autovec-opt.md (*cond_extend<v_double_trunc><mode>): New combine pattern. (*cond_trunc<mode><v_double_trunc>): Ditto. * config/riscv/autovec.md: Adjust. * config/riscv/riscv-v.cc (needs_fp_rounding): Add FP extend. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-1.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-2.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float_run-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float_run-2.c: New test.
2023-09-01 | RISC-V: Add conditional autovec convert(INT<->INT) patterns | Lehua Ding | 11 | -24/+311
gcc/ChangeLog: * config/riscv/autovec-opt.md (*cond_<optab><v_double_trunc><mode>): New combine pattern. (*cond_<optab><v_quad_trunc><mode>): Ditto. (*cond_<optab><v_oct_trunc><mode>): Ditto. (*cond_trunc<mode><v_double_trunc>): Ditto. * config/riscv/autovec.md (<optab><v_quad_trunc><mode>2): Adjust. (<optab><v_oct_trunc><mode>2): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/narrow-3.c: Adjust. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-1.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-2.h: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-2.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int_run-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int_run-2.c: New test.
2023-09-01 | RISC-V: Adjust expand_cond_len_{unary,binop,op} api | Lehua Ding | 3 | -22/+34
This patch changes expand_cond_len_{unary,binop}'s argument `rtx_code code` to `unsigned icode` and uses the icode directly to determine whether the rounding_mode operand is required. gcc/ChangeLog: * config/riscv/autovec.md: Adjust. * config/riscv/riscv-protos.h (expand_cond_len_unop): Ditto. (expand_cond_len_binop): Ditto. * config/riscv/riscv-v.cc (needs_fp_rounding): Ditto. (expand_cond_len_op): Ditto. (expand_cond_len_unop): Ditto. (expand_cond_len_binop): Ditto. (expand_cond_len_ternop): Ditto.
2023-09-01 | libstdc++: Use dg-require-filesystem-ts in link test | Jonathan Wakely | 1 | -0/+1
This test expects to be able to link, which fails if there are undefined references to chdir, mkdir etc. in fs_ops.o in the libstdc++.a archive. libstdc++-v3/ChangeLog: * testsuite/27_io/filesystem/path/108636.cc: Add dg-require for filesystem support.
2023-09-01 | libstdc++: Avoid useless dependency on read_symlink from tzdb | Jonathan Wakely | 1 | -0/+4
chrono::tzdb::current_zone uses filesystem::read_symlink, which creates a dependency on the fs_ops.o object in libstdc++.a, which then creates dependencies on several OS functions if --gc-sections isn't used. For more details see PR libstdc++/104167 comment 8 and comment 11. In the cases where that causes linker failures, we probably don't have readlink anyway, so the filesystem::read_symlink call will always fail. Repeat the preprocessor conditions for filesystem::read_symlink in the body of chrono::tzdb::current_zone so that we don't create a dependency on fs_ops.o for a function that will always fail. libstdc++-v3/ChangeLog: * src/c++20/tzdb.cc (tzdb::current_zone): Check configure macros for POSIX readlink before using filesystem::read_symlink.
2023-09-01 | libstdc++: Make --enable-libstdcxx-backtrace=auto default to yes | Jonathan Wakely | 2 | -2/+2
This causes libstdc++_libbacktrace.a to be built by default. This might fail on some targets, in which case we can make the 'auto' choice expand to either 'yes' or 'no' depending on the target. libstdc++-v3/ChangeLog: * acinclude.m4 (GLIBCXX_ENABLE_BACKTRACE): Default to yes. * configure: Regenerate.
2023-09-01 | RISC-V: Enable VECT_COMPARE_COSTS by default | Juzhe-Zhong | 1 | -6/+3
Since we have added the COST framework, we enable VECT_COMPARE_COSTS by default. Also, add 16/32/64 to provide more choices for COST comparison. This patch doesn't change any behavior in the current testsuite since we are using the default COST model. gcc/ChangeLog: * config/riscv/riscv-v.cc (autovectorize_vector_modes): Enable VECT_COMPARE_COSTS by default.
2023-09-01 | RISC-V: Add vec_extract for BI -> QI. | Robin Dapp | 3 | -0/+121
This patch adds a vec_extract expander that extracts a QImode from a vector mask mode. In doing so, it helps recognize a "live operation"/extract last idiom for mask modes. It fixes the ICE in tree-vect-live-6.c by circumventing the fallback code in extract_bit_field_1. The problem there is still latent, though, and needs to be addressed separately. gcc/ChangeLog: * config/riscv/autovec.md (vec_extract<mode>qi): New expander. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/live-2.c: New test. * gcc.target/riscv/rvv/autovec/partial/live_run-2.c: New test.
2023-09-01 | testsuite/vect: Make match patterns more accurate. | Robin Dapp | 14 | -18/+19
On some targets we fail to vectorize with the first type the vectorizer tries but succeed with the second. This patch changes several regex patterns to reflect that behavior. Before, we would look for a single occurrence of e.g. "vect_recog_dot_prod_pattern" but would possibly have two (one for each attempted mode). The new pattern tries to match sequences where we first have a "vect_recog_dot_prod_pattern" and a "succeeded" afterwards while making sure there is no "failed" or "Re-trying" in between. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-outer-4c-big-array.c: Adjust regex pattern. * gcc.dg/vect/vect-reduc-dot-s16a.c: Ditto. * gcc.dg/vect/vect-reduc-dot-s8a.c: Ditto. * gcc.dg/vect/vect-reduc-dot-s8b.c: Ditto. * gcc.dg/vect/vect-reduc-dot-u16a.c: Ditto. * gcc.dg/vect/vect-reduc-dot-u16b.c: Ditto. * gcc.dg/vect/vect-reduc-dot-u8a.c: Ditto. * gcc.dg/vect/vect-reduc-dot-u8b.c: Ditto. * gcc.dg/vect/vect-reduc-pattern-1a.c: Ditto. * gcc.dg/vect/vect-reduc-pattern-1b-big-array.c: Ditto. * gcc.dg/vect/vect-reduc-pattern-1c-big-array.c: Ditto. * gcc.dg/vect/vect-reduc-pattern-2a.c: Ditto. * gcc.dg/vect/vect-reduc-pattern-2b-big-array.c: Ditto. * gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c: Ditto.
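The matching idea (a pattern name, then "succeeded", with no "failed" or "Re-trying" in between) can be sketched with `std::regex`. The real tests use DejaGnu scan directives with Tcl regular expressions, so this is only an analogue in C++; `pattern_then_success` is a hypothetical helper, and it assumes the pattern name contains no regex metacharacters.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True if `pattern_name` is followed by "succeeded" with neither "failed"
// nor "Re-trying" appearing in between.
inline bool pattern_then_success(const std::string& dump,
                                 const std::string& pattern_name) {
    // [\s\S] matches any character, including newlines. The negative
    // lookahead refuses to step over the start of "failed" or "Re-trying".
    const std::regex re(pattern_name +
                        R"((?:(?!failed|Re-trying)[\s\S])*?succeeded)");
    return std::regex_search(dump, re);
}
```

A dump in which the first attempt fails and a second attempt reports the pattern again before succeeding still matches, because the search can start at the second occurrence of the pattern name.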
2023-09-01 | RISC-V: Add dynamic LMUL compile option | Juzhe-Zhong | 2 | -1/+6
We are going to support dynamic LMUL. gcc/ChangeLog: * config/riscv/riscv-opts.h (enum riscv_autovec_lmul_enum): Add dynamic enum. * config/riscv/riscv.opt: Add dynamic compile option.
2023-09-01 | libstdc++: Fix how chrono::parse handles errors for time-of-day values | Jonathan Wakely | 2 | -24/+36
We fail to diagnose an error and extract an incorrect time for cases like "25:59" >> parse("%H:%M", mins). The bad "25" hour value gets ignored (on the basis that we might not care about it if trying to extract something like a weekday or a month name), but then when we get to the end of the function we think we have a valid time from "59" and so the result is 00:59. The problem is that the '__bad_h' value is used for "no hour value read yet" as well as "bad hour value read". If we just set __h = __bad_h and continue, we can't tell later that we read an invalid hour. The fix is to set failbit early when we're trying to extract a time-of-day (e.g. duration or time_point) and we encounter an invalid hour, minute, or second value. We can still delay other error checking to the end. libstdc++-v3/ChangeLog: * include/bits/chrono_io.h (_Parser::operator()): Set failbit early if invalid values are read when _M_need & _TimeOfDay is non-zero. * testsuite/std/time/parse.cc: Check that "25:59" cannot be parsed for "%H:%M".
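The sentinel problem can be shown with a small hand-written parser. This is a sketch only and not the libstdc++ `_Parser`: `two_digits` and `parse_hh_mm` are hypothetical helpers, and the point is that a separate failure flag is set as soon as an invalid field is read, instead of reusing the "not read yet" state.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Reads two decimal digits at `pos`; empty if they are missing or not digits.
inline std::optional<int> two_digits(const std::string& s, std::size_t pos) {
    if (pos + 2 > s.size()) return std::nullopt;
    const unsigned char a = s[pos], b = s[pos + 1];
    if (!std::isdigit(a) || !std::isdigit(b)) return std::nullopt;
    return (a - '0') * 10 + (b - '0');
}

// Parses "HH:MM" into minutes since midnight, or std::nullopt on error.
inline std::optional<int> parse_hh_mm(const std::string& s) {
    bool fail = false;                      // separate from "not read yet"
    const std::optional<int> hour = two_digits(s, 0);
    if (!hour || *hour > 23) fail = true;   // remember the bad hour right away
    if (s.size() != 5 || s[2] != ':') fail = true;
    const std::optional<int> minute = two_digits(s, 3);
    if (!minute || *minute > 59) fail = true;
    if (fail) return std::nullopt;
    return *hour * 60 + *minute;
}
```

With this structure a bad hour can no longer be mistaken for an absent hour, so "25:59" is rejected rather than read as 00:59.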
2023-09-01 | libstdc++: Do not allow chrono::parse to overflow for %C [PR111162] | Jonathan Wakely | 2 | -1/+20
libstdc++-v3/ChangeLog: PR libstdc++/111162 * include/bits/chrono_io.h (_Parser::operator()): Check %C values are in range of year::min() to year::max(). * testsuite/std/time/parse.cc: Check out of range centuries.
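One way to picture such a range check is sketched below. It is an assumption about the shape of the check, not the code in chrono_io.h: `century_in_range` is a hypothetical helper that tests whether the first year of a century (century times 100) lies within the range of `std::chrono::year`, which the C++20 standard specifies as -32767 to 32767. Dividing the limits instead of multiplying the input avoids overflow.

```cpp
#include <cassert>

// Limits of std::chrono::year as specified by the C++20 standard.
constexpr long year_min = -32767;
constexpr long year_max = 32767;

// Hypothetical check: does century * 100 stay within [year_min, year_max]?
// The limits are divided so that the multiplication is never performed.
constexpr bool century_in_range(long century) {
    return century >= year_min / 100 && century <= year_max / 100;
}

static_assert(century_in_range(20), "century 20 starts at year 2000");
```

A caller would reject, or mark as invalid, any parsed century for which the check returns false before computing a year from it.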
2023-09-01 | libstdc++: Simplify __format::_Sink::_M_reset | Jonathan Wakely | 1 | -9/+13
Using an offset as the second argument instead of an iterator makes it easier for callers, as they don't need to create an lvalue span in order to get an iterator from it for the _M_reset call. libstdc++-v3/ChangeLog: * include/std/format (__format::_Sink::_M_reset): Change second argument from iterator to offset.
2023-09-01 | RISC-V: Support FP ADD/SUB/MUL/DIV autovec for VLS mode | Pan Li | 16 | -6/+634
This patch would like to allow the VLS mode autovec for the floating-point binary operations ADD/SUB/MUL/DIV. Given the code example below:

  void test (float *out, float *in1, float *in2)
  {
    for (int i = 0; i < 128; i++)
      out[i] = in1[i] + in2[i];
  }

Before this patch:

  test:
    csrr a4,vlenb
    slli a4,a4,1
    li a5,128
    bleu a5,a4,.L38
    mv a5,a4
  .L38:
    vsetvli zero,a5,e32,m8,ta,ma
    vle32.v v16,0(a1)
    vsetvli a4,zero,e32,m8,ta,ma
    vmv.v.i v8,0
    vsetvli zero,a5,e32,m8,tu,ma
    vle32.v v24,0(a2)
    vfadd.vv v8,v24,v16
    vse32.v v8,0(a0)
    ret

After this patch:

  test:
    li a5,128
    vsetvli zero,a5,e32,m1,ta,ma
    vle32.v v1,0(a2)
    vle32.v v2,0(a1)
    vfadd.vv v1,v1,v2
    vse32.v v1,0(a0)
    ret

Please note this patch also fixes the execution failure of the vect test cases below. * vect-alias-check-10.c * vect-alias-check-11.c * vect-alias-check-12.c * vect-alias-check-14.c Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/autovec-vls.md (<optab><mode>3): New pattern for vls floating-point autovec. * config/riscv/vector-iterators.md: New iterator for floating-point V and VLS. * config/riscv/vector.md: Add VLS to floating-point binop. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/def.h: * gcc.target/riscv/rvv/autovec/vls/floating-point-add-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-add-3.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-div-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-div-2.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-div-3.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-mul-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-mul-2.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-mul-3.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-sub-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-sub-2.c: New test. * gcc.target/riscv/rvv/autovec/vls/floating-point-sub-3.c: New test.
2023-08-31 | MATCH [PR19832]: Optimize some `(a != b) ? a OP b : c` | Andrew Pinski | 3 | -0/+148
This patch adds the following match patterns to optimize these:

  /* (a != b) ? (a - b) : 0 -> (a - b) */
  /* (a != b) ? (a ^ b) : 0 -> (a ^ b) */
  /* (a != b) ? (a & b) : a -> (a & b) */
  /* (a != b) ? (a | b) : a -> (a | b) */
  /* (a != b) ? min(a,b) : a -> min(a,b) */
  /* (a != b) ? max(a,b) : a -> max(a,b) */
  /* (a != b) ? (a * b) : (a * a) -> (a * b) */
  /* (a != b) ? (a + b) : (a + a) -> (a + b) */
  /* (a != b) ? (a + b) : (2 * a) -> (a + b) */

Note that currently only integer types (including vector types) are handled. Floating point types can be added later on. OK? Bootstrapped and tested on x86_64-linux-gnu. The first pattern still shows up in GCC in cse.c's preferable function, which was the original motivation for this patch. PR tree-optimization/19832 gcc/ChangeLog: * match.pd: Add pattern to optimize `(a != b) ? a OP b : c`. gcc/testsuite/ChangeLog: * g++.dg/opt/vectcond-1.C: New test. * gcc.dg/tree-ssa/phi-opt-same-1.c: New test.
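The identities behind these patterns are easy to check by brute force, since each one relies on the "else" value being exactly what the operation yields when a equals b. The sketch below is a sanity check written for this log, not part of the patch; `identities_hold` is a hypothetical helper and the range is kept small so that no signed overflow can occur.

```cpp
#include <algorithm>
#include <cassert>

// Exhaustively checks the listed identities for all pairs in [-lim, lim].
inline bool identities_hold(int lim) {
    for (int a = -lim; a <= lim; ++a) {
        for (int b = -lim; b <= lim; ++b) {
            const bool ne = (a != b);
            if ((ne ? a - b : 0) != a - b) return false;
            if ((ne ? (a ^ b) : 0) != (a ^ b)) return false;
            if ((ne ? (a & b) : a) != (a & b)) return false;
            if ((ne ? (a | b) : a) != (a | b)) return false;
            if ((ne ? std::min(a, b) : a) != std::min(a, b)) return false;
            if ((ne ? std::max(a, b) : a) != std::max(a, b)) return false;
            if ((ne ? a * b : a * a) != a * b) return false;
            if ((ne ? a + b : a + a) != a + b) return false;
            if ((ne ? a + b : 2 * a) != a + b) return false;
        }
    }
    return true;
}
```

Calling `identities_hold(64)` walks every pair in the range and returns true only if the conditional form and the simplified form agree everywhere.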
2023-09-01 | LoongArch: Fix bug in loongarch_emit_stack_tie [PR110484]. | Lulu Cheng | 1 | -1/+3
The bug may result in implicit references to $fp when frame_pointer_needed is false, causing regs_ever_live[$fp] to be true when $fp is not explicitly used, resulting in $fp being used as the target replacement register in the rnreg pass. The bug originates from SPEC2017 541.leela_r (-flto). gcc/ChangeLog: PR target/110484 * config/loongarch/loongarch.cc (loongarch_emit_stack_tie): Use the frame_pointer_needed to determine whether to use the $fp register. Co-authored-by: Guo Jie <guojie@loongson.cn>
2023-09-01 | Daily bump. | GCC Administrator | 8 | -1/+356
2023-08-31 | MATCH: extend min_value/max_value match to vectors | Andrew Pinski | 13 | -8/+400
This simple patch extends the min_value/max_value match to vector integer types. Using uniform_integer_cst_p makes this easy. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. The testcases pr110915-*.c are the same as pr88784-*.c except using vector types instead. PR tree-optimization/110915 gcc/ChangeLog: * match.pd (min_value, max_value): Extend to vector constants. gcc/testsuite/ChangeLog: * gcc.dg/pr110915-1.c: New test. * gcc.dg/pr110915-10.c: New test. * gcc.dg/pr110915-11.c: New test. * gcc.dg/pr110915-12.c: New test. * gcc.dg/pr110915-2.c: New test. * gcc.dg/pr110915-3.c: New test. * gcc.dg/pr110915-4.c: New test. * gcc.dg/pr110915-5.c: New test. * gcc.dg/pr110915-6.c: New test. * gcc.dg/pr110915-7.c: New test. * gcc.dg/pr110915-8.c: New test. * gcc.dg/pr110915-9.c: New test.
2023-08-31 | Darwin: homogenize spelling of macOS | Francois-Xavier Coudert | 17 | -30/+28
gcc/ChangeLog: * config.in: Regenerate. * config/darwin-c.cc: Change spelling to macOS. * config/darwin-driver.cc: Likewise. * config/darwin.h: Likewise. * configure.ac: Likewise. * doc/contrib.texi: Likewise. * doc/extend.texi: Likewise. * doc/invoke.texi: Likewise. * doc/plugins.texi: Likewise. * doc/tm.texi: Regenerate. * doc/tm.texi.in: Change spelling to macOS. * plugin.cc: Likewise. gcc/analyzer/ChangeLog: * kf.cc: Change spelling to macOS. gcc/c-family/ChangeLog: * c.opt: Change spelling to macOS. gcc/fortran/ChangeLog: * gfortran.texi: Likewise. gcc/jit/ChangeLog: * jit-playback.cc: Change spelling to macOS. gcc/objc/ChangeLog: * objc-act.cc: Change spelling to macOS.
2023-08-31 | RISC-V: Support rounding mode for VFNMADD/VFNMACC autovec | Pan Li | 3 | -31/+132
There will be a case like below for intrinsic and autovec combination.

  vfadd RTZ   <- intrinsic static rounding
  vfnmadd     <- autovec/autovec-opt

The autovec generated vfnmadd should take DYN mode, and the frm must be restored before the vfnmadd insn. This patch would like to fix this issue by:

  * Add the frm operand to the autovec/autovec-opt pattern.
  * Set the frm_mode attr to DYN.

Thus, the frm flow when combining autovec and intrinsic should be:

  +------------
  | frrm a5
  | ...
  | fsrmi 4
  | vfadd       <- intrinsic static rounding.
  | ...
  | fsrm a5
  | vfnmadd     <- autovec/autovec-opt
  | ...
  +------------

Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/autovec-opt.md: Add FRM_REGNUM to vfnmadd/vfnmacc. * config/riscv/autovec.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-frm-autovec-4.c: New test.
2023-08-31 | RISC-V: Support rounding mode for VFNMSAC/VFNMSUB autovec | Pan Li | 3 | -25/+124
There will be a case like below for intrinsic and autovec combination.

  vfadd RTZ   <- intrinsic static rounding
  vfnmsub     <- autovec/autovec-opt

The autovec generated vfnmsub should take DYN mode, and the frm must be restored before the vfnmsub insn. This patch would like to fix this issue by:

  * Add the frm operand to the autovec/autovec-opt pattern.
  * Set the frm_mode attr to DYN.

Thus, the frm flow when combining autovec and intrinsic should be:

  +------------
  | frrm a5
  | ...
  | fsrmi 4
  | vfadd       <- intrinsic static rounding.
  | ...
  | fsrm a5
  | vfnmsub     <- autovec/autovec-opt
  | ...
  +------------

Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/autovec-opt.md: Add FRM_REGNUM to vfnmsac/vfnmsub. * config/riscv/autovec.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-frm-autovec-3.c: New test.
2023-08-31 | aarch64: Fix return register handling in untyped_call | Richard Sandiford | 1 | -1/+19
While working on another patch, I hit a problem with the aarch64 expansion of untyped_call. The expander emits the usual: (set (mem ...) (reg resN)) instructions to store the result registers to memory, but it didn't say in RTL where those resN results came from. This eventually led to a failure of gcc.dg/torture/stackalign/builtin-return-2.c, via regrename. This patch turns the untyped call from a plain call to a call_value, to represent that the call returns (or might return) a useful value. The patch also uses a PARALLEL return rtx to represent all the possible return registers. gcc/ * config/aarch64/aarch64.md (untyped_call): Emit a call_value rather than a call. List each possible destination register in the call pattern.
2023-08-31 | rs6000: Update instruction counts to match vec_* calls [PR111228] | Peter Bergner | 8 | -12/+12
Commit r14-3258-ge7a36e4715c716 increased the amount of folding we perform, leading to better code. Update the expected instruction counts to match the changes. 2023-08-31 Peter Bergner <bergner@linux.ibm.com> gcc/testsuite/ PR testsuite/111228 * gcc.target/powerpc/fold-vec-logical-ors-char.c: Update instruction counts to match the number of associated vec_* built-in calls. * gcc.target/powerpc/fold-vec-logical-ors-int.c: Likewise. * gcc.target/powerpc/fold-vec-logical-ors-longlong.c: Likewise. * gcc.target/powerpc/fold-vec-logical-ors-short.c: Likewise. * gcc.target/powerpc/fold-vec-logical-other-char.c: Likewise. * gcc.target/powerpc/fold-vec-logical-other-int.c: Likewise. * gcc.target/powerpc/fold-vec-logical-other-longlong.c: Likewise. * gcc.target/powerpc/fold-vec-logical-other-short.c: Likewise.
2023-08-31 | RISC-V: Support rounding mode for VFMSAC/VFMSUB autovec | Pan Li | 3 | -27/+127
There will be a case like below for intrinsic and autovec combination.

  vfadd RTZ   <- intrinsic static rounding
  vfmsub      <- autovec/autovec-opt

The autovec generated vfmsub should take DYN mode, and the frm must be restored before the vfmsub insn. This patch would like to fix this issue by:

  * Add the frm operand to the autovec/autovec-opt pattern.
  * Set the frm_mode attr to DYN.

Thus, the frm flow when combining autovec and intrinsic should be:

  +------------
  | frrm a5
  | ...
  | fsrmi 4
  | vfadd       <- intrinsic static rounding.
  | ...
  | fsrm a5
  | vfmsub      <- autovec/autovec-opt
  | ...
  +------------

Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/autovec-opt.md: Add FRM_REGNUM to vfmsac/vfmsub. * config/riscv/autovec.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-frm-autovec-2.c: New test.
2023-08-31RISC-V: Support rounding mode for VFMADD/VFMACC autovecPan Li4-23/+125
Intrinsics and autovectorized code can be combined as in the case below:

  vfadd  RTZ   <- intrinsic static rounding
  vfmadd       <- autovec/autovec-opt

The autovec-generated vfmadd should take DYN mode, and the frm must be restored before the vfmadd insn. This patch fixes the issue by:

* Adding the frm operand to the vfmadd/vfmacc autovec/autovec-opt pattern.
* Setting the frm_mode attr to DYN.

Thus, the frm flow when combining autovec and intrinsics should be:

+------------
| frrm  a5
| ...
| fsrmi 4
| vfadd      <- intrinsic static rounding.
| ...
| fsrm  a5
| vfmadd     <- autovec/autovec-opt
| ...
+------------

However, we use an unspec instead of a use to consume the FRM register because of some restrictions in the combine pass. Some code paths of try_combine may require XVECLEN(pat, 0) == 2 for recog_for_combine, and adding a new use would make XVECLEN(pat, 0) == 3 and cause the vfwmacc optimization to fail, for example in the tests widen-complicate-5.c and widen-8.c.

Finally, there are other fma cases; they will be covered in follow-up patches.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
gcc/ChangeLog: * config/riscv/autovec-opt.md: Add FRM_REGNUM to vfmadd/vfmacc. * config/riscv/autovec.md: Ditto. * config/riscv/vector-iterators.md: Add UNSPEC_VFFMA.
gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-frm-autovec-1.c: New test.
2023-08-31middle-end/111253 - partly revert r11-6508-gabb1b6058c09a7Richard Biener2-1/+32
The following keeps dumping the SSA def stmt RHS during diagnostic reporting only for gimple_assign_single_p defs, which means memory loads. This avoids diagnostics containing PHI nodes like

  warning: 'realloc' called on pointer '*_42 = PHI <lcs.14_40(29), lcs.19_48(30)>.t_mem_caches' with nonzero offset 40

and instead restores the previous behavior:

  warning: 'realloc' called on pointer '*<unknown>.t_mem_caches' with nonzero offset 40

PR middle-end/111253
gcc/c-family/ * c-pretty-print.cc (c_pretty_printer::primary_expression): Only dump gimple_assign_single_p SSA def RHS.
gcc/testsuite/ * gcc.dg/Wfree-nonheap-object-7.c: New testcase.
2023-08-31RISC-V: Add vector_scalar_shift_operandPalmer Dabbelt2-3/+8
The vector shift immediates happen to have the same constraints as some of the CSR-related operands, but it's a different usage. This adds a name for them, so I don't get confused again next time. gcc/ChangeLog: * config/riscv/autovec.md (shifts): Use vector_scalar_shift_operand. * config/riscv/predicates.md (vector_scalar_shift_operand): New predicate.
2023-08-31RISC-V: Add Vector cost model framework for RVVJuzhe-Zhong5-1/+134
Hi, currently RVV vectorization only supports picking LMUL according to the compile option --param=riscv-autovec-lmul=, which is not ideal. The compiler should be able to pick the optimal LMUL/vectorization factor for a loop according to the loop_vec_info and an SSA-based register pressure analysis. The current GCC cost model already provides a way to choose the LMUL/vectorization factor by adjusting the cost. This patch only adds the minimal cost model framework, which still applies the default cost model (no vector code changes from before). All regression tests passed with no differences.
gcc/ChangeLog: * config.gcc: Add vector cost model framework for RVV. * config/riscv/riscv.cc (riscv_vectorize_create_costs): Ditto. (TARGET_VECTORIZE_CREATE_COSTS): Ditto. * config/riscv/t-riscv: Ditto. * config/riscv/riscv-vector-costs.cc: New file. * config/riscv/riscv-vector-costs.h: New file.
2023-08-31rs6000: Don't allow AltiVec address in movoo & movxo pattern [PR110411]Jeevitha4-5/+38
There are no instructions that do traditional AltiVec addresses (i.e. with the low four bits of the address masked off) for OOmode and XOmode objects. The solution is to modify the constraints used in the movoo and movxo pattern to disallow these types of addresses, which assists LRA in resolving this issue. Furthermore, the mode size 16 check has been removed in vsx_quad_dform_memory_operand to allow OOmode and XOmode, and quad_address_p already handles less than size 16. 2023-08-31 Jeevitha Palanisamy <jeevitha@linux.ibm.com> gcc/ PR target/110411 * config/rs6000/mma.md (define_insn_and_split movoo): Disallow AltiVec address operands. (define_insn_and_split movxo): Likewise. * config/rs6000/predicates.md (vsx_quad_dform_memory_operand): Remove redundant mode size check. gcc/testsuite/ PR target/110411 * gcc.target/powerpc/pr110411-1.c: New testcase. * gcc.target/powerpc/pr110411-2.c: New testcase.
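As a reminder of what "the low four bits of the address masked off" means, the following sketch computes the address such an access would actually use. The function name `altivec_effective_address` is hypothetical and the code only models the arithmetic; it is not taken from the rs6000 backend.

```cpp
#include <cassert>
#include <cstdint>

// Models a traditional AltiVec-style address: the low four bits are
// cleared, so the access always lands on a 16-byte boundary.
constexpr std::uintptr_t altivec_effective_address(std::uintptr_t ea)
{
    return ea & ~static_cast<std::uintptr_t>(0xF);
}

// Hypothetical helper: true when the masking silently moves the access.
constexpr bool masking_changes_address(std::uintptr_t ea)
{
    return altivec_effective_address(ea) != ea;
}
```

For example, an address of 0x1237 would be treated as 0x1230, which is the kind of implicit adjustment that no OOmode or XOmode instruction performs.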
2023-08-31RISC-V: Change vsetvl tail and mask policy to default policyLehua Ding8-13/+26
This patch changes the vsetvl policy from a fixed policy to the default policy (returned by get_prefer_mask_policy and get_prefer_tail_policy). The "any" policy is currently returned, which allows changing to agnostic or undisturbed. In the future, users may be able to control the default policy through compiler options, for example to keep it agnostic.
gcc/ChangeLog: * config/riscv/riscv-protos.h (IS_AGNOSTIC): Move to here. * config/riscv/riscv-v.cc (gen_no_side_effects_vsetvl_rtx): Change to default policy. * config/riscv/riscv-vector-builtins-bases.cc: Change to default policy. * config/riscv/riscv-vsetvl.h (IS_AGNOSTIC): Delete. * config/riscv/riscv.cc (riscv_print_operand): Use IS_AGNOSTIC to test.
gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/binop_vx_constraint-171.c: Adjust. * gcc.target/riscv/rvv/base/binop_vx_constraint-173.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-24.c: New test.
2023-08-31Fix gcc.dg/tree-ssa/forwprop-42.cRichard Biener1-1/+2
The testcase requires hardware support for V2DImode vectors because otherwise we do not rewrite inserts via BIT_FIELD_REF to BIT_INSERT_EXPR. There's no effective target for this so the following makes the testcase x86 specific, requiring and enabling SSE2. * gcc.dg/tree-ssa/forwprop-42.c: Move ... * gcc.target/i386/pr111228.c: ... here. Enable SSE2.
2023-08-31RISC-V: Refactor and clean emit_{vlmax,nonvlmax}_xxx functionsLehua Ding7-952/+587
This patch refactors the emit_{vlmax,nonvlmax}_xxx functions, which are used to generate RVV insns. There are currently 31 such functions, a few of them duplicates. So many functions are needed because RVV patterns come in many shapes: some have no mask operand, some have no merge operand, some do not need a tail policy operand, and so on. Previously there was the insn_type enum, but its value was only used to indicate how many operands the caller passed in. The rest of the operand information was scattered throughout these functions. For example, emit_vlmax_fp_insn indicates that a rounding mode operand of FRM_DYN should also be passed, and emit_vlmax_merge_insn means that there is no mask operand or mask policy operand.

I introduced a new enum insn_flags to indicate some properties of these RVV patterns. These insn_flags are then used to define the insn_type enum. For example, the definition of WIDEN_TERNARY_OP is:

  WIDEN_TERNARY_OP = HAS_DEST_P | HAS_MASK_P | USE_ALL_TRUES_MASK_P | TDEFAULT_POLICY_P | MDEFAULT_POLICY_P | TERNARY_OP_P,

These flags mean the RVV pattern has no merge operand. This combination of flags applies only to vwmacc instructions.

After defining the desired insn_type, all the emit_{vlmax,nonvlmax}_xxx functions are unified into three functions:

  emit_vlmax_insn (icode, insn_flags, ops);
  emit_nonvlmax_insn (icode, insn_flags, ops, vl);
  emit_vlmax_insn_lra (icode, insn_flags, ops, vl);

The user can then select the appropriate insn_type and the appropriate emit_xxx function to generate RVV patterns as needed.

gcc/ChangeLog: * config/riscv/autovec-opt.md: Adjust. * config/riscv/autovec-vls.md: Ditto. * config/riscv/autovec.md: Ditto. * config/riscv/riscv-protos.h (enum insn_type): Add insn_type. (enum insn_flags): Add insn flags. (emit_vlmax_insn): Adjust. (emit_vlmax_fp_insn): Delete. (emit_vlmax_ternary_insn): Delete. (emit_vlmax_fp_ternary_insn): Delete. (emit_nonvlmax_insn): Adjust.
(emit_vlmax_slide_insn): Delete. (emit_nonvlmax_slide_tu_insn): Delete. (emit_vlmax_merge_insn): Delete. (emit_vlmax_cmp_insn): Delete. (emit_vlmax_cmp_mu_insn): Delete. (emit_vlmax_masked_mu_insn): Delete. (emit_scalar_move_insn): Delete. (emit_nonvlmax_integer_move_insn): Delete. (emit_vlmax_insn_lra): Add. * config/riscv/riscv-v.cc (get_mask_mode_from_insn_flags): New. (emit_vlmax_insn): Adjust. (emit_nonvlmax_insn): Adjust. (emit_vlmax_insn_lra): Add. (emit_vlmax_fp_insn): Delete. (emit_vlmax_ternary_insn): Delete. (emit_vlmax_fp_ternary_insn): Delete. (emit_vlmax_slide_insn): Delete. (emit_nonvlmax_slide_tu_insn): Delete. (emit_nonvlmax_slide_insn): Delete. (emit_vlmax_merge_insn): Delete. (emit_vlmax_cmp_insn): Delete. (emit_vlmax_cmp_mu_insn): Delete. (emit_vlmax_masked_insn): Delete. (emit_nonvlmax_masked_insn): Delete. (emit_vlmax_masked_store_insn): Delete. (emit_nonvlmax_masked_store_insn): Delete. (emit_vlmax_masked_mu_insn): Delete. (emit_vlmax_masked_fp_mu_insn): Delete. (emit_nonvlmax_tu_insn): Delete. (emit_nonvlmax_fp_tu_insn): Delete. (emit_nonvlmax_tumu_insn): Delete. (emit_nonvlmax_fp_tumu_insn): Delete. (emit_scalar_move_insn): Delete. (emit_cpop_insn): Delete. (emit_vlmax_integer_move_insn): Delete. (emit_nonvlmax_integer_move_insn): Delete. (emit_vlmax_gather_insn): Delete. (emit_vlmax_masked_gather_mu_insn): Delete. (emit_vlmax_compress_insn): Delete. (emit_nonvlmax_compress_insn): Delete. (emit_vlmax_reduction_insn): Delete. (emit_vlmax_fp_reduction_insn): Delete. (emit_nonvlmax_fp_reduction_insn): Delete. (expand_vec_series): Adjust. (expand_const_vector): Adjust. (legitimize_move): Adjust. (sew64_scalar_helper): Adjust. (expand_tuple_move): Adjust. (expand_vector_init_insert_elems): Adjust. (expand_vector_init_merge_repeating_sequence): Adjust. (expand_vec_cmp): Adjust. (expand_vec_cmp_float): Adjust. (expand_vec_perm): Adjust. (shuffle_merge_patterns): Adjust. (shuffle_compress_patterns): Adjust. (shuffle_decompress_patterns): Adjust. 
(expand_load_store): Adjust. (expand_cond_len_op): Adjust. (expand_cond_len_unop): Adjust. (expand_cond_len_binop): Adjust. (expand_gather_scatter): Adjust. (expand_cond_len_ternop): Adjust. (expand_reduction): Adjust. (expand_lanes_load_store): Adjust. (expand_fold_extract_last): Adjust. * config/riscv/riscv.cc (vector_zero_call_used_regs): Adjust. * config/riscv/vector.md: Adjust.
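The flag-composition idea in this message can be sketched in isolation as follows. The flag names and the WIDEN_TERNARY_OP definition come from the message itself; the bit positions, the `HAS_MERGE_P` flag, and the `has_flag` helper are assumptions made for illustration and may not match the actual declarations in riscv-protos.h.

```cpp
#include <cassert>
#include <cstdint>

// Flag names are from the commit message; the bit positions are assumed.
enum insn_flags : std::uint32_t {
    HAS_DEST_P           = 1u << 0,
    HAS_MASK_P           = 1u << 1,
    USE_ALL_TRUES_MASK_P = 1u << 2,
    HAS_MERGE_P          = 1u << 3,  // hypothetical: not named in the message
    TDEFAULT_POLICY_P    = 1u << 4,
    MDEFAULT_POLICY_P    = 1u << 5,
    TERNARY_OP_P         = 1u << 6,
};

// An insn_type is just a named combination of insn_flags.
enum insn_type : std::uint32_t {
    WIDEN_TERNARY_OP = HAS_DEST_P | HAS_MASK_P | USE_ALL_TRUES_MASK_P
                       | TDEFAULT_POLICY_P | MDEFAULT_POLICY_P | TERNARY_OP_P,
};

// Hypothetical helper: does the given type include the given property?
constexpr bool has_flag(std::uint32_t type, insn_flags flag)
{
    return (type & flag) != 0;
}
```

With this layout a single emit function can ask, for instance, `has_flag(WIDEN_TERNARY_OP, HAS_MERGE_P)` and skip the merge operand when the answer is false, instead of needing one emit function per operand shape.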
2023-08-31Adjust gcc.target/i386/pr52252-{atom,core}.cRichard Biener2-2/+2
The following adjusts the testcases to force 128bit vectorization to make them more robust when for example adding -march=cascadelake * gcc.target/i386/pr52252-atom.c: Add -mprefer-vector-width=128. * gcc.target/i386/pr52252-core.c: Likewise.
2023-08-31rs6000: call vector load/store with length only on 64-bit Power10Haochen Gui2-4/+23
gcc/ PR target/96762 * config/rs6000/rs6000-string.cc (expand_block_move): Call vector load/store with length only on 64-bit Power10. gcc/testsuite/ PR target/96762 * gcc.target/powerpc/pr96762.c: New.
2023-08-31arc: Honor SWAP option for lsl16 instructionClaudiu Zissulescu2-2/+2
The LSL16 instruction is only available if SWAP (-mswap) option is turned on. gcc/ChangeLog: * config/arc/arc.cc (arc_split_mov_const): Use LSL16 only when SWAP option is enabled. * config/arc/arc.md (ashlsi2_cnt16): Likewise. Signed-off-by: Claudiu Zissulescu <claziss@gmail.com>
2023-08-31arm: Remove unsigned variant of vcaddq_mStamatis Markianos-Wright5-29/+21
The unsigned variants of the vcaddq_m operation are not needed within the compiler, as the assembly output of the signed and unsigned versions of the ops is identical: with a `.i` suffix (as opposed to separate `.s` and `.u` suffixes). Tested with baremetal arm-none-eabi on Arm's fastmodels. gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (vcaddq_rot90, vcaddq_rot270): Use common insn for signed and unsigned front-end definitions. * config/arm/arm_mve_builtins.def (vcaddq_rot90_m_u, vcaddq_rot270_m_u): Make common. (vcaddq_rot90_m_s, vcaddq_rot270_m_s): Remove. * config/arm/iterators.md (mve_insn): Merge signed and unsigned defs. (isu): Likewise. (rot): Likewise. (mve_rot): Likewise. (supf): Likewise. (VxCADDQ_M): Likewise. * config/arm/unspecs.md (unspec): Likewise. * config/arm/mve.md: Fix minor typo.
2023-08-31Refactor vector HF/BF mode iterators and patterns.liuhongt1-130/+108
gcc/ChangeLog: * config/i386/sse.md (<avx512>_blendm<mode>): Merge VF_AVX512HFBFVL into VI12HFBF_AVX512VL. (VF_AVX512HFBF16): Renamed to VHFBF. (VF_AVX512FP16VL): Renamed to VHF_AVX512VL. (VF_AVX512FP16): Removed. (div<mode>3): Adjust VF_AVX512FP16VL to VHF_AVX512VL. (avx512fp16_rcp<mode>2<mask_name>): Ditto. (rsqrt<mode>2): Ditto. (<sse>_rsqrt<mode>2<mask_name>): Ditto. (vcond<mode><code>): Ditto. (vcond<sseintvecmodelower><mode>): Ditto. (<avx512>_fmaddc_<mode>_mask1<round_expand_name>): Ditto. (<avx512>_fmaddc_<mode>_maskz<round_expand_name>): Ditto. (<avx512>_fcmaddc_<mode>_mask1<round_expand_name>): Ditto. (<avx512>_fcmaddc_<mode>_maskz<round_expand_name>): Ditto. (cmla<conj_op><mode>4): Ditto. (fma_<mode>_fadd_fmul): Ditto. (fma_<mode>_fadd_fcmul): Ditto. (fma_<complexopname>_<mode>_fma_zero): Ditto. (fma_<mode>_fmaddc_bcst): Ditto. (fma_<mode>_fcmaddc_bcst): Ditto. (<avx512>_<complexopname>_<mode>_mask<round_name>): Ditto. (cmul<conj_op><mode>3): Ditto. (<avx512>_<complexopname>_<mode><maskc_name><round_name>): Ditto. (vec_unpacks_lo_<mode>): Ditto. (vec_unpacks_hi_<mode>): Ditto. (vec_unpack_<fixprefix>fix_trunc_lo_<mode>): Ditto. (vec_unpack_<fixprefix>fix_trunc_lo_<mode>): Ditto. (*vec_extract<mode>_0): Ditto. (*<avx512>_cmp<mode>3): Extend to V48H_AVX512VL.
2023-08-31RISC-V: Fix vsetvl pass ICELehua Ding2-1/+20
This patch fixes PR111234 (a vsetvl pass ICE) that occurs when fusing a mask-any vlmax vsetvl_vtype_change_only insn with a mu vsetvl insn.
PR target/111234
gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (gen_vsetvl_pat): Remove condition.
gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr111234.c: New test.
2023-08-31Add overflow API for plus minus mult on rangeJiufu Guo5-0/+154
Previous reviews suggested that adding overflow APIs to range-op would be useful. These APIs help to check whether overflow can happen when operating on two ranges with plus, minus, or mult. Previous discussions are here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624067.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624701.html
gcc/ChangeLog: * range-op-mixed.h (operator_plus::overflow_free_p): New declare. (operator_minus::overflow_free_p): New declare. (operator_mult::overflow_free_p): New declare. * range-op.cc (range_op_handler::overflow_free_p): New function. (range_operator::overflow_free_p): New default function. (operator_plus::overflow_free_p): New function. (operator_minus::overflow_free_p): New function. (operator_mult::overflow_free_p): New function. * range-op.h (range_op_handler::overflow_free_p): New declare. (range_operator::overflow_free_p): New declare. * value-range.cc (irange::nonnegative_p): New function. (irange::nonpositive_p): New function. * value-range.h (irange::nonnegative_p): New declare. (irange::nonpositive_p): New declare.
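The general idea of an overflow-free check on ranges can be sketched as follows. This is not the range-op implementation: the `Range32` type and the three functions are hypothetical, the ranges are closed intervals of signed 32-bit values, and the check simply evaluates the extreme results in 64-bit arithmetic.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical closed interval [lo, hi] of signed 32-bit values.
struct Range32 { std::int32_t lo, hi; };

// True if v is representable as a signed 32-bit integer.
static bool fits_int32(std::int64_t v)
{
    return v >= std::numeric_limits<std::int32_t>::min()
        && v <= std::numeric_limits<std::int32_t>::max();
}

// a + b cannot overflow if the smallest and largest possible sums fit.
bool plus_overflow_free(Range32 a, Range32 b)
{
    assert(a.lo <= a.hi && b.lo <= b.hi);
    return fits_int32(std::int64_t{a.lo} + b.lo)
        && fits_int32(std::int64_t{a.hi} + b.hi);
}

// a - b is extreme at (a.lo - b.hi) and (a.hi - b.lo).
bool minus_overflow_free(Range32 a, Range32 b)
{
    assert(a.lo <= a.hi && b.lo <= b.hi);
    return fits_int32(std::int64_t{a.lo} - b.hi)
        && fits_int32(std::int64_t{a.hi} - b.lo);
}

// a * b is extreme at one of the four corner products.
bool mult_overflow_free(Range32 a, Range32 b)
{
    assert(a.lo <= a.hi && b.lo <= b.hi);
    return fits_int32(std::int64_t{a.lo} * b.lo)
        && fits_int32(std::int64_t{a.lo} * b.hi)
        && fits_int32(std::int64_t{a.hi} * b.lo)
        && fits_int32(std::int64_t{a.hi} * b.hi);
}
```

A caller could use such a predicate to decide whether an operation can safely be carried out in the narrower type; the widening to 64 bits is a simplification that works only because the inputs here are 32 bits wide.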
2023-08-31Daily bump.GCC Administrator6-1/+321
2023-08-30analyzer: implement reference count checking for CPython plugin [PR107646]Eric Feng10-44/+550
This patch introduces initial support for reference count checking of PyObjects in relation to the Python/C API for the CPython plugin. Additionally, the core analyzer underwent several modifications to accommodate this feature. These include:

- Introducing support for callbacks at the end of region_model::pop_frame. This is our current point of validation for the reference count of PyObjects.
- An added optional custom stmt_finder parameter to region_model_context::warn. This aids in emitting a diagnostic concerning the reference count, especially when the stmt_finder is NULL, which is currently the case during region_model::pop_frame.

The current diagnostic we emit relating to the reference count appears as follows:

rc3.c:23:10: warning: expected ‘item’ to have reference count: ‘1’ but ob_refcnt field is: ‘2’
   23 |   return list;
      |          ^~~~
  ‘create_py_object’: events 1-4
    |
    |    4 |   PyObject* item = PyLong_FromLong(3);
    |      |                    ^~~~~~~~~~~~~~~~~~
    |      |                    |
    |      |                    (1) when ‘PyLong_FromLong’ succeeds
    |    5 |   PyObject* list = PyList_New(1);
    |      |                    ~~~~~~~~~~~~~
    |      |                    |
    |      |                    (2) when ‘PyList_New’ succeeds
    |......
    |   14 |   PyList_Append(list, item);
    |      |   ~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |   |
    |      |   (3) when ‘PyList_Append’ succeeds, moving buffer
    |......
    |   23 |   return list;
    |      |          ~~~~
    |      |          |
    |      |          (4) here
    |

This is a WIP in several ways:

- Currently, functions returning PyObject * are assumed to always produce a new reference.
- The validation of reference count is only for PyObjects created within a function body. Verifying reference counts for PyObjects passed as parameters is not supported in this patch.

gcc/analyzer/ChangeLog: PR analyzer/107646 * engine.cc (impl_region_model_context::warn): New optional parameter. * exploded-graph.h (class impl_region_model_context): Likewise. * region-model.cc (region_model::pop_frame): New callback feature for region_model::pop_frame. * region-model.h (struct append_regions_cb_data): Likewise. (class region_model): Likewise. (class region_model_context): New optional parameter. (class region_model_context_decorator): Likewise.
gcc/testsuite/ChangeLog: PR analyzer/107646 * gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count checking for PyObjects. * gcc.dg/plugin/cpython-plugin-test-2.c: Moved to... * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and added more tests). * gcc.dg/plugin/cpython-plugin-test-1.c: Moved to... * gcc.dg/plugin/cpython-plugin-test-no-Python-h.c: ...here (and added more tests). * gcc.dg/plugin/plugin.exp: New tests. * gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test. * gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test.
Signed-off-by: Eric Feng <ef2648@columbia.edu>
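The kind of mismatch reported in the diagnostic above can be reproduced with a toy model. Nothing below is the Python/C API or the analyzer's code: `ToyObject`, `toy_incref`, `toy_decref`, `toy_list_append`, and `refcnt_at_return` are hypothetical stand-ins that model only the `ob_refcnt` field, under the assumption that appending to a list takes an additional reference on the item.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a PyObject: only the reference count is modelled.
struct ToyObject {
    long ob_refcnt = 1;                 // a freshly created object owns one reference
    std::vector<ToyObject *> items;     // used only when the object acts as a list
};

inline void toy_incref(ToyObject *o) { ++o->ob_refcnt; }
inline void toy_decref(ToyObject *o) { --o->ob_refcnt; }

// Mimics the append step: the list takes its own reference on the item.
inline void toy_list_append(ToyObject &list, ToyObject *item)
{
    toy_incref(item);
    list.items.push_back(item);
}

// Returns the item's ob_refcnt at the point where the list would be returned.
long refcnt_at_return(bool release_local_reference)
{
    ToyObject item;                     // count 1, like event (1)
    ToyObject list;                     // count 1, like event (2)
    toy_list_append(list, &item);       // count 2, like event (3)
    if (release_local_reference)
        toy_decref(&item);              // drop the local reference before returning
    return item.ob_refcnt;              // like event (4)
}
```

In this model only the list still refers to the item once the function returns, so the expected count is 1; skipping the release leaves it at 2, which mirrors the wording of the warning.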