Age  Commit message  Author  Files  Lines
2024-10-12  Fortran/OpenMP: Warn when mapping polymorphic variables  Tobias Burnus  4  -2/+121
OpenMP (TR13) states for Fortran: * For map: "If a list item has polymorphic type, the behavior is unspecified." * "If the firstprivate clause is on a target construct and a variable is of polymorphic type, the behavior is unspecified." which this commit now warns for. gcc/fortran/ChangeLog: * openmp.cc (resolve_omp_clauses): Diagnose polymorphic mapping. * trans-openmp.cc (gfc_omp_finish_clause): Warn when polymorphic variable is implicitly mapped. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/polymorphic-mapping.f90: New test. * gfortran.dg/gomp/polymorphic-mapping-2.f90: New test.
2024-10-12  bootstrap: Fix genmatch build where system gcc defaults to -fPIE -pie  Jakub Jelinek  1  -0/+5
It seems our buildbot is unhappy about my latest commit, which links genmatch with libcommon.a in order to support gcc_diag diagnostics in libcpp. We have in gcc/configure.ac: if test x$enable_host_shared = xyes; then PICFLAG=-fPIC elif test x$enable_host_pie = xyes; then PICFLAG=-fPIE elif test x$gcc_cv_c_no_fpie = xyes; then PICFLAG=-fno-PIE else PICFLAG= fi if test x$enable_host_pie = xyes; then LD_PICFLAG=-pie elif test x$gcc_cv_no_pie = xyes; then LD_PICFLAG=-no-pie else LD_PICFLAG= fi if test x$enable_host_bind_now = xyes; then LD_PICFLAG="$LD_PICFLAG -Wl,-z,now" fi Now, for object files linked into cc1, cc1plus, xgcc etc. we carefully arrange for them to be compiled with $(PICFLAG) and do the link with $(LD_PICFLAG). For the generator programs, we don't do anything like that; we simply compile their objects without $(PICFLAG) and link without $(LD_PICFLAG). It isn't that big a deal: the generator programs run once or a couple of times during the build and that is it; we don't ship them and don't care much whether they are PIE or not. Except that after my changes to link libcommon.a into build/genmatch, we now link -fno-PIE compiled objects into a binary which is linked with default flags. Our distro compiler just links a normal executable and everything works fine (-fPIE/-pie is added through a spec file snippet and just added in rpm default flags), but it seems the buildbot's system gcc defaults to -fPIE -pie instead, and so building build/genmatch fails. The following patch is a minimal fix for that: just add -no-pie when linking build/genmatch, but don't add -pie. If we wanted to start building even the build/gen* tools with $(PICFLAG) and $(LD_PICFLAG), that would be a much larger change. 2024-10-12 Jakub Jelinek <jakub@redhat.com> * Makefile.in (LINKER_FOR_BUILD): Append -no-pie if it is in $(LD_PICFLAG) when building build/genmatch.
2024-10-12  gcc.target/i386/pr55583.c: Use long long for 64-bit integer  H.J. Lu  1  -3/+3
Since long is 32-bit for x32, use long long for 64-bit integer. * gcc.target/i386/pr55583.c: Use long long for 64-bit integer. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
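The width difference behind this fix can be sketched in C++. This is only an illustration, not the contents of pr55583.c: `bit_width_of` and `widen_mul` are hypothetical helpers. The language guarantees at least 64 bits for `long long` but only at least 32 bits for `long`, which is why a test that needs a 64-bit integer on x32 cannot rely on `long`.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical helper: width in bits of an integer type on the current target.
template <typename T>
constexpr int bit_width_of() { return static_cast<int>(sizeof(T) * CHAR_BIT); }

// long long is at least 64 bits on every conforming target, x32 included.
static_assert(bit_width_of<long long>() >= 64, "long long must hold 64 bits");
// long is only guaranteed 32 bits; on x32 (ILP32 on x86-64) it is exactly 32.
static_assert(bit_width_of<long>() >= 32, "long must hold 32 bits");

// Hypothetical example of code that needs a 64-bit result: the product of two
// 32-bit values. Widening to unsigned long long works on both lp64 and x32.
unsigned long long widen_mul(std::uint32_t a, std::uint32_t b) {
    return static_cast<unsigned long long>(a) * b;
}
```

Had `widen_mul` returned `unsigned long`, the product would be truncated to 32 bits on x32 while still working on lp64, which is the kind of target-dependent behaviour the test change avoids.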
2024-10-12  gcc.target/i386/pr115749.c: Use word_mode integer  H.J. Lu  1  -1/+3
Use word_mode integer with func so that 64-bit integer is used with x32. * gcc.target/i386/pr115749.c (uword): New. (func): Replace unsigned long with uword. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-12  gcc.target/i386/invariant-ternlog-1.c: Also scan (%edx)  H.J. Lu  1  -1/+1
Since x32 uses (%edx), instead of (%rdx), also scan (%edx). * gcc.target/i386/invariant-ternlog-1.c: Also scan (%edx). Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-12  libcpp, genmatch: Use gcc_diag instead of printf for libcpp diagnostics  Jakub Jelinek  120  -1261/+1276
When working on #embed support, or -Wheader-guard or other recent libcpp changes, I've been annoyed by the libcpp diagnostics being visually different from normal gcc diagnostics, especially in the area of quoting stuff in the diagnostic messages. Normal GCC diagnostics are gcc_diag/gcc_tdiag, where one can use %</%>, %qs etc., while libcpp diagnostics were marked as printf, and in libcpp we've been very creative with quoting stuff: either no quotes at all, or "something" quoting, or 'something' quoting, or `something' quoting (but in none of the cases were colors used consistently with the rest of the compiler). Now, libcpp diagnostics are always emitted using a callback, pfile->cb.diagnostic. On the gcc/ side, this callback is initialized with genmatch.cc: cb->diagnostic = diagnostic_cb; c-family/c-opts.cc: cb->diagnostic = c_cpp_diagnostic; fortran/cpp.cc: cb->diagnostic = cb_cpp_diagnostic; where the latter two just use diagnostic_report_diagnostic, so they actually support all the gcc_diag stuff; only the genmatch.cc case didn't. So, the following patch changes genmatch.cc to use pp_format* instead of vfprintf so that it supports the gcc_diag formatting (pretty-print.o unfortunately has various dependencies, so I had to link genmatch with libcommon.a and libbacktrace.a and tweak Makefile.in so that there are no circular dependencies) and marks the libcpp diagnostic routines as gcc_diag rather than printf. That change resulted in hundreds of new -Wformat-diag warnings (most of them useful and resulting IMHO in better diagnostics), so the rest of the patch changes the format strings to make -Wformat-diag happy and adjusts the testsuite for the differences in how the diagnostics are reformatted. I don't know whether any out-of-tree projects use libcpp; this change would make things harder for them because one couldn't use vfprintf in the diagnostic callback anymore, but there is always David's libdiagnostic, which could be used for that purpose IMHO. 
2024-10-12 Jakub Jelinek <jakub@redhat.com> libcpp/ * include/cpplib.h (ATTRIBUTE_CPP_PPDIAG): Define. (struct cpp_callbacks): Use ATTRIBUTE_CPP_PPDIAG instead of ATTRIBUTE_FPTR_PRINTF on diagnostic callback. (cpp_error, cpp_warning, cpp_pedwarning, cpp_warning_syshdr): Use ATTRIBUTE_CPP_PPDIAG (3, 4) instead of ATTRIBUTE_PRINTF_3. (cpp_warning_at, cpp_pedwarning_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5) instead of ATTRIBUTE_PRINTF_4. (cpp_error_with_line, cpp_warning_with_line, cpp_pedwarning_with_line, cpp_warning_with_line_syshdr): Use ATTRIBUTE_CPP_PPDIAG (5, 6) instead of ATTRIBUTE_PRINTF_5. (cpp_error_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5) instead of ATTRIBUTE_PRINTF_4. * Makefile.in (po/$(PACKAGE).pot): Use --language=GCC-source rather than --language=c. * errors.cc (cpp_diagnostic_at, cpp_diagnostic, cpp_diagnostic_with_line): Use ATTRIBUTE_CPP_PPDIAG instead of ATTRIBUTE_FPTR_PRINTF. * charset.cc (cpp_host_to_exec_charset, _cpp_valid_ucn, convert_hex, convert_oct, convert_escape): Fix up -Wformat-diag warnings. (cpp_interpret_string_ranges, count_source_chars): Use ATTRIBUTE_CPP_PPDIAG instead of ATTRIBUTE_FPTR_PRINTF. (narrow_str_to_charconst): Fix up -Wformat-diag warnings. * directives.cc (check_eol_1, directive_diagnostics, lex_macro_node, do_undef, glue_header_name, parse_include, do_include_common, do_include_next, _cpp_parse_embed_params, do_embed, read_flag, do_line, do_linemarker, register_pragma_1, do_pragma_once, do_pragma_push_macro, do_pragma_pop_macro, do_pragma_poison, do_pragma_system_header, do_pragma_warning_or_error, _cpp_do__Pragma, do_else, do_elif, do_endif, parse_answer, do_assert, cpp_define_unused): Likewise. * expr.cc (cpp_classify_number, parse_defined, eval_token, _cpp_parse_expr, reduce, check_promotion): Likewise. * files.cc (_cpp_find_file, finish_base64_embed, _cpp_pop_file_buffer): Likewise. * init.cc (sanity_checks): Likewise. 
* lex.cc (_cpp_process_line_notes, maybe_warn_bidi_on_char, _cpp_warn_invalid_utf8, _cpp_skip_block_comment, warn_about_normalization, forms_identifier_p, maybe_va_opt_error, identifier_diagnostics_on_lex, cpp_maybe_module_directive): Likewise. * macro.cc (class vaopt_state, builtin_has_include_1, builtin_has_include, builtin_has_embed, _cpp_warn_if_unused_macro, _cpp_builtin_macro_text, builtin_macro, stringify_arg, _cpp_arguments_ok, collect_args, enter_macro_context, _cpp_save_parameter, parse_params, create_iso_definition, _cpp_create_definition, check_trad_stringification): Likewise. * pch.cc (cpp_valid_state): Likewise. * traditional.cc (_cpp_scan_out_logical_line, recursive_macro): Likewise. gcc/ * Makefile.in (generated_files): Remove {gimple,generic}-match*. (generated_match_files): New variable. Add a dependency of $(filter-out $(OBJS-libcommon),$(ALL_HOST_OBJS)) files on those. (build/genmatch$(build_exeext)): Depend on and link against libcommon.a and $(LIBBACKTRACE). * genmatch.cc: Include pretty-print.h and input.h. (ggc_internal_cleared_alloc, ggc_free): Remove. (fatal): New function. (line_table): Remove. (linemap_client_expand_location_to_spelling_point): Remove. (diagnostic_cb): Use gcc_diag rather than printf format. Use pp_format_verbatim on a temporary pretty_printer instead of vfprintf. (fatal_at, warning_at): Use gcc_diag rather than printf format. (output_line_directive): Rename location_hash to loc_hash. (parser::eat_ident, parser::parse_operation, parser::parse_expr, parser::parse_pattern, parser::finish_match_operand): Fix up -Wformat-diag warnings. gcc/c-family/ * c-lex.cc (c_common_has_attribute, c_common_lex_availability_macro): Fix up -Wformat-diag warnings. gcc/testsuite/ * c-c++-common/cpp/counter-2.c: Adjust expected diagnostics for libcpp diagnostic formatting changes. * c-c++-common/cpp/embed-3.c: Likewise. * c-c++-common/cpp/embed-4.c: Likewise. * c-c++-common/cpp/embed-16.c: Likewise. * c-c++-common/cpp/embed-18.c: Likewise. 
* c-c++-common/cpp/eof-2.c: Likewise. * c-c++-common/cpp/eof-3.c: Likewise. * c-c++-common/cpp/fmax-include-depth.c: Likewise. * c-c++-common/cpp/has-builtin.c: Likewise. * c-c++-common/cpp/line-2.c: Likewise. * c-c++-common/cpp/line-3.c: Likewise. * c-c++-common/cpp/macro-arg-count-1.c: Likewise. * c-c++-common/cpp/macro-arg-count-2.c: Likewise. * c-c++-common/cpp/macro-ranges.c: Likewise. * c-c++-common/cpp/named-universal-char-escape-4.c: Likewise. * c-c++-common/cpp/named-universal-char-escape-5.c: Likewise. * c-c++-common/cpp/pr88974.c: Likewise. * c-c++-common/cpp/va-opt-error.c: Likewise. * c-c++-common/cpp/va-opt-pedantic.c: Likewise. * c-c++-common/cpp/Wheader-guard-2.c: Likewise. * c-c++-common/cpp/Wheader-guard-3.c: Likewise. * c-c++-common/cpp/Winvalid-utf8-1.c: Likewise. * c-c++-common/cpp/Winvalid-utf8-2.c: Likewise. * c-c++-common/cpp/Winvalid-utf8-3.c: Likewise. * c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c: Likewise. * c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-3.c: Likewise. * c-c++-common/pr68833-3.c: Likewise. * c-c++-common/raw-string-directive-1.c: Likewise. * gcc.dg/analyzer/named-constants-Wunused-macros.c: Likewise. * gcc.dg/binary-constants-4.c: Likewise. * gcc.dg/builtin-redefine.c: Likewise. * gcc.dg/cpp/19951025-1.c: Likewise. * gcc.dg/cpp/c11-warning-1.c: Likewise. * gcc.dg/cpp/c11-warning-2.c: Likewise. * gcc.dg/cpp/c11-warning-3.c: Likewise. * gcc.dg/cpp/c23-elifdef-2.c: Likewise. * gcc.dg/cpp/c23-warning-2.c: Likewise. * gcc.dg/cpp/embed-2.c: Likewise. * gcc.dg/cpp/embed-3.c: Likewise. * gcc.dg/cpp/embed-4.c: Likewise. * gcc.dg/cpp/expr.c: Likewise. * gcc.dg/cpp/gnu11-elifdef-2.c: Likewise. * gcc.dg/cpp/gnu11-elifdef-3.c: Likewise. * gcc.dg/cpp/gnu11-elifdef-4.c: Likewise. * gcc.dg/cpp/gnu11-warning-1.c: Likewise. * gcc.dg/cpp/gnu11-warning-2.c: Likewise. * gcc.dg/cpp/gnu11-warning-3.c: Likewise. * gcc.dg/cpp/gnu23-warning-2.c: Likewise. * gcc.dg/cpp/include6.c: Likewise. 
* gcc.dg/cpp/pr35322.c: Likewise. * gcc.dg/cpp/tr-warn6.c: Likewise. * gcc.dg/cpp/undef2.c: Likewise. * gcc.dg/cpp/warn-comments.c: Likewise. * gcc.dg/cpp/warn-comments-2.c: Likewise. * gcc.dg/cpp/warn-comments-3.c: Likewise. * gcc.dg/cpp/warn-cxx-compat.c: Likewise. * gcc.dg/cpp/warn-cxx-compat-2.c: Likewise. * gcc.dg/cpp/warn-deprecated.c: Likewise. * gcc.dg/cpp/warn-deprecated-2.c: Likewise. * gcc.dg/cpp/warn-long-long.c: Likewise. * gcc.dg/cpp/warn-long-long-2.c: Likewise. * gcc.dg/cpp/warn-normalized-1.c: Likewise. * gcc.dg/cpp/warn-normalized-2.c: Likewise. * gcc.dg/cpp/warn-normalized-3.c: Likewise. * gcc.dg/cpp/warn-normalized-4-bytes.c: Likewise. * gcc.dg/cpp/warn-normalized-4-unicode.c: Likewise. * gcc.dg/cpp/warn-redefined.c: Likewise. * gcc.dg/cpp/warn-redefined-2.c: Likewise. * gcc.dg/cpp/warn-traditional.c: Likewise. * gcc.dg/cpp/warn-traditional-2.c: Likewise. * gcc.dg/cpp/warn-trigraphs-1.c: Likewise. * gcc.dg/cpp/warn-trigraphs-2.c: Likewise. * gcc.dg/cpp/warn-trigraphs-3.c: Likewise. * gcc.dg/cpp/warn-trigraphs-4.c: Likewise. * gcc.dg/cpp/warn-undef.c: Likewise. * gcc.dg/cpp/warn-undef-2.c: Likewise. * gcc.dg/cpp/warn-unused-macros.c: Likewise. * gcc.dg/cpp/warn-unused-macros-2.c: Likewise. * gcc.dg/pch/counter-2.c: Likewise. * g++.dg/cpp0x/udlit-error1.C: Likewise. * g++.dg/cpp23/named-universal-char-escape1.C: Likewise. * g++.dg/cpp23/named-universal-char-escape2.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-1.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-2.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-3.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-4.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-5.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-6.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-7.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-8.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-9.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-10.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-11.C: Likewise. * g++.dg/cpp23/Winvalid-utf8-12.C: Likewise. * g++.dg/cpp/elifdef-3.C: Likewise. 
* g++.dg/cpp/elifdef-5.C: Likewise. * g++.dg/cpp/elifdef-6.C: Likewise. * g++.dg/cpp/elifdef-7.C: Likewise. * g++.dg/cpp/embed-1.C: Likewise. * g++.dg/cpp/embed-2.C: Likewise. * g++.dg/cpp/pedantic-errors.C: Likewise. * g++.dg/cpp/warning-1.C: Likewise. * g++.dg/cpp/warning-2.C: Likewise. * g++.dg/ext/bitint1.C: Likewise. * g++.dg/ext/bitint2.C: Likewise.
2024-10-12  Fortran: Unify gfc_get_location handling; fix expr->ts bug  Tobias Burnus  5  -27/+32
This commit reduces code duplication by moving gfc_get_location from trans.cc to error.cc. The gcc_assert is now used more often and revealed a bug in gfc_match_array_constructor, where the union member expr->ts.u.derived of a derived type was partially overwritten by the assignment expr->ts.u.cl->... because a ts.type == BT_CHARACTER check was missing. gcc/fortran/ChangeLog: * array.cc (gfc_match_array_constructor): Only update the character length if the expression is of character type. * error.cc (gfc_get_location_with_offset): New; split off from ... (gfc_format_decoder): ... here; call it. * gfortran.h (gfc_get_location_with_offset): New prototype. (gfc_get_location): New inline function. * trans.cc (gfc_get_location): Remove function definition. * trans.h (gfc_get_location): Remove declaration.
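The class of bug described here, writing through one union member while another is active, can be sketched with a small tagged union. The types below (`BasicType`, `CharLen`, `DerivedSym`, `TypeSpec`) and the function `set_char_length` are hypothetical, simplified stand-ins and not gfortran's actual declarations; the sketch only shows the guard that the fix adds.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the front end's type descriptors.
enum class BasicType { Character, Derived };
struct CharLen { long length; };
struct DerivedSym { int id; };

struct TypeSpec {
    BasicType type;
    union {
        CharLen *cl;          // meaningful only when type == Character
        DerivedSym *derived;  // meaningful only when type == Derived
    } u;
};

// Update the character length only when the union really holds `cl`.
// Without the type check, ts.u.cl would alias ts.u.derived and the store
// would scribble over the derived-type object it points to.
bool set_char_length(TypeSpec &ts, long length) {
    if (ts.type != BasicType::Character)
        return false;  // leave u.derived and its target untouched
    ts.u.cl->length = length;
    return true;
}
```

For a `TypeSpec` tagged `Derived`, `set_char_length` returns `false` and leaves the pointed-to object unchanged; for one tagged `Character`, the length is updated.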
2024-10-12  testsuite/i386: Add vector sat_sub testcases [PR112600]  Uros Bizjak  2  -0/+50
PR middle-end/112600 gcc/testsuite/ChangeLog: * gcc.target/i386/pr112600-4a.c: New test. * gcc.target/i386/pr112600-4b.c: New test.
2024-10-12  MAINTAINERS: Add myself to write after approval  Feng Xue  1  -0/+1
ChangeLog: * MAINTAINERS: Add myself to write after approval.
2024-10-12  c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]  Simon Martin  8  -33/+135
We currently emit an incorrect -Woverloaded-virtual warning upon the following test case === cut here === struct A { virtual operator int() { return 42; } virtual operator char() = 0; }; struct B : public A { operator char() { return 'A'; } }; === cut here === The problem is that when iterating over ovl_range (fns), warn_hidden gets confused by the conversion operator marker, concludes that seen_non_override is true and therefore emits a warning for all conversion operators in A that do not convert to char, even if -Woverloaded-virtual is 1 (e.g. with -Wall, the case reported). A second set of problems is highlighted when -Woverloaded-virtual is 2. First, with the same test case, since base_fndecls contains all conversion operators in A (except the one to char, that's been removed when iterating over ovl_range (fns)), we emit a spurious warning for the conversion operator to int, even though it's unrelated. Second, in case there are several conversion operators with different cv-qualifiers to the same type in A, we rightfully emit a warning, however the note uses the location of the conversion operator marker instead of the right one; location_of should go over conv_op_marker. This patch fixes all these by explicitly keeping track of (1) base methods that are overridden, as well as (2) base methods that are hidden but not overridden (and by what), and warning about methods that are in (2) but not (1). It also ignores non virtual base methods, per "definition" of -Woverloaded-virtual. PR c++/109918 gcc/cp/ChangeLog: * class.cc (warn_hidden): Keep track of overloaded and of hidden base methods. Mention the actual hiding function in the warning, not the first overload. * error.cc (location_of): Skip over conv_op_marker. gcc/testsuite/ChangeLog: * g++.dg/warn/Woverloaded-virt1.C: Check that no warning is emitted for non virtual base methods. * g++.dg/warn/Woverloaded-virt5.C: New test. * g++.dg/warn/Woverloaded-virt6.C: New test. 
* g++.dg/warn/Woverloaded-virt7.C: New test. * g++.dg/warn/Woverloaded-virt8.C: New test. * g++.dg/warn/Woverloaded-virt9.C: New test.
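A runnable variant of the test case quoted in the commit message shows why the warning was spurious: `B` overrides the conversion to `char` and still inherits the unrelated conversion to `int`, so nothing is actually hidden. The virtual destructor, the `override` keyword and the helpers `as_int` and `as_char` are additions made for this sketch and are not part of the original test case.

```cpp
#include <cassert>

struct A {
    virtual ~A() = default;                 // added for the sketch
    virtual operator int() { return 42; }   // unrelated to B's override
    virtual operator char() = 0;
};

struct B : public A {
    operator char() override { return 'A'; }  // overrides A::operator char
};

// Hypothetical helpers: convert through a base reference so that the
// call is dispatched virtually.
int as_int(A &a) { return static_cast<int>(a); }
char as_char(A &a) { return static_cast<char>(a); }
```

Through an `A&` bound to a `B`, `as_int` still reaches `A::operator int` and `as_char` reaches `B::operator char`.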
2024-10-12  RISC-V: Add testcases for form 1 of vector signed SAT_SUB  Pan Li  10  -0/+393
Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus = (UT)x - (UT)y; \ out[i] = (x ^ y) >= 0 \ ? minus \ : (minus ^ x) >= 0 \ ? minus \ : x < 0 ? MIN : MAX; \ } \ } DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX) The following test passes with this patch: * The rv64gcv full regression test. This is a test-only patch and obvious up to a point; I will commit it directly if there are no comments in the next 48 hours. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test data for run test. * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i8.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i32.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i64.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-12  RISC-V: Implement vector SAT_SUB for signed integer  Pan Li  3  -0/+21
This patch would like to implement the sssub for vector signed integer. Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus = (UT)x - (UT)y; \ out[i] = (x ^ y) >= 0 \ ? minus \ : (minus ^ x) >= 0 \ ? minus \ : x < 0 ? MIN : MAX; \ } \ } DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX) Before this patch: 28 │ vle8.v v1,0(a1) 29 │ vle8.v v2,0(a2) 30 │ sub a3,a3,a5 31 │ add a1,a1,a5 32 │ add a2,a2,a5 33 │ vsra.vi v4,v1,7 34 │ vsub.vv v3,v1,v2 35 │ vxor.vv v2,v1,v2 36 │ vxor.vv v0,v1,v3 37 │ vmslt.vi v2,v2,0 38 │ vmslt.vi v0,v0,0 39 │ vmand.mm v0,v0,v2 40 │ vxor.vv v3,v4,v5,v0.t 41 │ vse8.v v3,0(a0) 42 │ add a0,a0,a5 After this patch: 25 │ vle8.v v1,0(a1) 26 │ vle8.v v2,0(a2) 27 │ sub a3,a3,a5 28 │ add a1,a1,a5 29 │ add a2,a2,a5 30 │ vssub.vv v1,v1,v2 31 │ vse8.v v1,0(a0) 32 │ add a0,a0,a5 The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/ChangeLog: * config/riscv/autovec.md (sssub<mode>3): Add new pattern for signed SAT_SUB. * config/riscv/riscv-protos.h (expand_vec_sssub): Add new func decl to expand sssub to vssub. * config/riscv/riscv-v.cc (expand_vec_sssub): Add new func impl to expand sssub to vssub. Signed-off-by: Pan Li <pan2.li@intel.com>
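The branch structure of form 1 can be checked in scalar code. The sketch below writes out the body of the macro for `int8_t` and compares it against a clamp of the exact difference computed in a wider type. The function names (`sat_sub_fmt_1`, `sat_sub_ref`, `forms_agree`) are hypothetical and not taken from the testsuite. The unsigned-to-signed narrowing is assumed to wrap modulo 256, which C++20 guarantees and GCC implements in earlier standards as well.

```cpp
#include <cassert>
#include <cstdint>

// Scalar body of form 1 for int8_t: wrapping subtraction in the unsigned
// type, then saturate only when the operand signs differ and the wrapped
// result has a different sign from x.
std::int8_t sat_sub_fmt_1(std::int8_t x, std::int8_t y) {
    std::uint8_t wrapped = static_cast<std::uint8_t>(
        static_cast<std::uint8_t>(x) - static_cast<std::uint8_t>(y));
    std::int8_t minus = static_cast<std::int8_t>(wrapped);  // modular narrowing
    int result = (x ^ y) >= 0 ? minus
               : (minus ^ x) >= 0 ? minus
               : x < 0 ? INT8_MIN : INT8_MAX;
    return static_cast<std::int8_t>(result);
}

// Reference: compute the exact difference in int, then clamp.
std::int8_t sat_sub_ref(std::int8_t x, std::int8_t y) {
    int wide = static_cast<int>(x) - static_cast<int>(y);
    if (wide < INT8_MIN) return INT8_MIN;
    if (wide > INT8_MAX) return INT8_MAX;
    return static_cast<std::int8_t>(wide);
}

// Exhaustively compare both versions over all 65536 operand pairs.
bool forms_agree() {
    for (int x = INT8_MIN; x <= INT8_MAX; ++x)
        for (int y = INT8_MIN; y <= INT8_MAX; ++y)
            if (sat_sub_fmt_1(static_cast<std::int8_t>(x), static_cast<std::int8_t>(y)) !=
                sat_sub_ref(static_cast<std::int8_t>(x), static_cast<std::int8_t>(y)))
                return false;
    return true;
}
```

The first test, `(x ^ y) >= 0`, covers operands of equal sign, where the subtraction cannot overflow; the second detects that no overflow happened for operands of different sign.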
2024-10-12  Vect: Try the pattern of vector signed integer SAT_SUB  Pan Li  1  -1/+25
Almost the same as vector unsigned integer SAT_SUB, try to match the signed version during the vector pattern matching. The following test suites pass with this patch: * The rv64gcv full regression test. * The x86 bootstrap test. * The x86 full regression test. gcc/ChangeLog: * tree-vect-patterns.cc (gimple_signed_integer_sat_sub): Add new func decl for signed SAT_SUB. (vect_recog_sat_sub_pattern_transform): Update comments. (vect_recog_sat_sub_pattern): Try the vector signed SAT_SUB pattern. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-12  Match: Support form 1 for vector signed integer SAT_SUB  Pan Li  1  -0/+16
This patch would like to support the form 1 of the vector signed integer SAT_SUB. Aka below example: Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus = (UT)x - (UT)y; \ out[i] = (x ^ y) >= 0 \ ? minus \ : (minus ^ x) >= 0 \ ? minus \ : x < 0 ? MIN : MAX; \ } \ } DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX) Before this patch: 91 │ _108 = .SELECT_VL (ivtmp_106, POLY_INT_CST [16, 16]); 92 │ vect_x_16.11_80 = .MASK_LEN_LOAD (vectp_op_1.9_78, 8B, { -1, ... }, _108, 0); 93 │ _69 = vect_x_16.11_80 >> 7; 94 │ vect_x.12_81 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_x_16.11_80); 95 │ vect_y_18.15_85 = .MASK_LEN_LOAD (vectp_op_2.13_83, 8B, { -1, ... }, _108, 0); 96 │ vect__7.21_91 = vect_x_16.11_80 ^ vect_y_18.15_85; 97 │ mask__44.22_92 = vect__7.21_91 < { 0, ... }; 98 │ vect_y.16_86 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_y_18.15_85); 99 │ vect__6.17_87 = vect_x.12_81 - vect_y.16_86; 100 │ vect_minus_19.18_88 = VIEW_CONVERT_EXPR<vector([16,16]) signed char>(vect__6.17_87); 101 │ vect__8.19_89 = vect_x_16.11_80 ^ vect_minus_19.18_88; 102 │ mask__42.20_90 = vect__8.19_89 < { 0, ... }; 103 │ mask__41.23_93 = mask__42.20_90 & mask__44.22_92; 104 │ _4 = .COND_XOR (mask__41.23_93, _69, { 127, ... }, vect_minus_19.18_88); 105 │ .MASK_LEN_STORE (vectp_out.31_102, 8B, { -1, ... }, _108, 0, _4); 106 │ vectp_op_1.9_79 = vectp_op_1.9_78 + _108; 107 │ vectp_op_2.13_84 = vectp_op_2.13_83 + _108; 108 │ vectp_out.31_103 = vectp_out.31_102 + _108; 109 │ ivtmp_107 = ivtmp_106 - _108; After this patch: 81 │ _102 = .SELECT_VL (ivtmp_100, POLY_INT_CST [16, 16]); 82 │ vect_x_16.11_89 = .MASK_LEN_LOAD (vectp_op_1.9_87, 8B, { -1, ... }, _102, 0); 83 │ vect_y_18.14_93 = .MASK_LEN_LOAD (vectp_op_2.12_91, 8B, { -1, ... 
}, _102, 0); 84 │ vect_patt_38.15_94 = .SAT_SUB (vect_x_16.11_89, vect_y_18.14_93); 85 │ .MASK_LEN_STORE (vectp_out.16_96, 8B, { -1, ... }, _102, 0, vect_patt_38.15_94); 86 │ vectp_op_1.9_88 = vectp_op_1.9_87 + _102; 87 │ vectp_op_2.12_92 = vectp_op_2.12_91 + _102; 88 │ vectp_out.16_97 = vectp_out.16_96 + _102; 89 │ ivtmp_101 = ivtmp_100 - _102; The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * match.pd: Add case 1 matching pattern for vector signed SAT_SUB. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-12  Daily bump.  GCC Administrator  6  -1/+346
2024-10-11  Introduce GFC_STD_UNSIGNED.  Thomas Koenig  3  -10/+16
This patch creates an unsigned "standard" for the gfc_option.allow_std field. One of the main reasons why people want UNSIGNED for Fortran is interfacing with C. This is a preparation for further work on the ISO_C_BINDING constants. We do that via iso-c-binding.def, whose last field names the standard under which the constant is defined, and which is then checked. I could try and invent a different method for this, but I'd rather not. gcc/fortran/ChangeLog: * intrinsic.cc (add_functions): Convert uint and selected_unsigned_kind to GFC_STD_UNSIGNED. (gfc_check_intrinsic_standard): Handle GFC_STD_UNSIGNED. * libgfortran.h (GFC_STD_UNSIGNED): Add. * options.cc (gfc_post_options): Set GFC_STD_UNSIGNED if -funsigned is set.
2024-10-12  gcc.target/i386: Replace long with long long  H.J. Lu  5  -8/+9
Since long is 32-bit for x32, replace long with long long. * gcc.target/i386/bmi2-pr112526.c: Replace long with long long. * gcc.target/i386/pr105854.c: Likewise. * gcc.target/i386/pr112943.c: Likewise. * gcc.target/i386/pr67325.c: Likewise. * gcc.target/i386/pr97971.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-12  g++.target/i386/pr105953.C: Skip for x32  H.J. Lu  1  -1/+1
Since -mabi=ms isn't supported for x32, skip g++.target/i386/pr105953.C for x32. * g++.target/i386/pr105953.C: Skip for x32. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-12  gcc.target/i386/pr115407.c: Only run for lp64  H.J. Lu  1  -1/+1
Since -mcmodel=large is valid only for lp64, run pr115407.c only for lp64. * gcc.target/i386/pr115407.c: Only run for lp64. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-10-11  Fix thinko in previous change  Eric Botcazou  1  -1/+1
gcc/ada/ PR ada/116498 PR ada/117087 * gcc-interface/decl.cc (validate_size): Fix thinko.
2024-10-11  libstdc++: Rearrange std::move_iterator helpers in stl_iterator.h  Jonathan Wakely  1  -32/+31
The __niter_base(move_iterator<I>) overload and __is_move_iterator trait were originally immediately after the definition of move_iterator. The addition of C++20 features after move_iterator meant that those helpers were no longer anywhere near move_iterator. This change puts them back where they used to be, before all the new C++20 additions. libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (__niter_base(move_iterator<I>)) (__is_move_iterator, __miter_base, _GLIBCXX_MAKE_MOVE_ITERATOR) (_GLIBCXX_MAKE_MOVE_IF_NOEXCEPT_ITERATOR): Move earlier in the file.
2024-10-11  PR target/117048 aarch64: Use more canonical and optimization-friendly representation for XAR instruction  Kyrylo Tkachov  2  -4/+63
The pattern for the Advanced SIMD XAR instruction isn't very optimization-friendly at the moment. In the testcase from the PR, once simplify-rtx has done its work it generates the RTL: (set (reg:V2DI 119 [ _14 ]) (rotate:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ]) (reg:V2DI 116 [ *m1_01_8(D) ])) (const_vector:V2DI [ (const_int 32 [0x20]) repeated x2 ]))) which fails to match our XAR pattern because the pattern expects: 1) A ROTATERT instead of the ROTATE. However, according to the RTL ops documentation the preferred form of rotate-by-immediate is ROTATE, which I take to mean it's the canonical form. ROTATE (x, C) <-> ROTATERT (x, MODE_WIDTH - C) so it's better to match just one canonical representation. 2) A CONST_INT shift amount whereas the midend asks for a repeated vector constant. These issues are fixed by introducing a dedicated expander for the aarch64_xarqv2di name, needed by the arm_neon.h intrinsic, that translates the intrinsic-level CONST_INT immediate (the right-rotate amount) into a repeated vector constant subtracted from 64 to give the corresponding left-rotate amount that is fed to the new representation for the XAR define_insn that uses the ROTATE RTL code. This is a similar approach to how we handle the discrepancy between intrinsic-level and RTL-level vector lane numbers for big-endian. With this patch and [1/2] the arithmetic parts of the testcase now simplify to just one XAR instruction. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ PR target/117048 * config/aarch64/aarch64-simd.md (aarch64_xarqv2di): Redefine into a define_expand. (*aarch64_xarqv2di_insn): Define. gcc/testsuite/ PR target/117048 * g++.target/aarch64/pr117048.C: New test.
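The translation the new expander performs, from a right-rotate immediate to the equivalent left-rotate amount, rests on the identity ROTATE (x, C) <-> ROTATERT (x, MODE_WIDTH - C). A scalar sketch for one 64-bit lane follows; the names `rotl64`, `rotr64`, `xar_rotatert`, `xar_rotate` and `amounts_agree` are hypothetical, and masking the amount with 63 is a choice made here so that an immediate of 0 (left amount 64) stays well defined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left / right by c modulo 64, avoiding an undefined shift by 64.
std::uint64_t rotl64(std::uint64_t x, unsigned c) {
    c &= 63;
    return c == 0 ? x : (x << c) | (x >> (64 - c));
}
std::uint64_t rotr64(std::uint64_t x, unsigned c) {
    c &= 63;
    return c == 0 ? x : (x >> c) | (x << (64 - c));
}

// Intrinsic-level view of one lane: XOR, then rotate right by imm.
std::uint64_t xar_rotatert(std::uint64_t a, std::uint64_t b, unsigned imm) {
    return rotr64(a ^ b, imm);
}

// Same lane expressed with a left rotate by (64 - imm), mirroring the
// ROTATE-based representation described in the commit message.
std::uint64_t xar_rotate(std::uint64_t a, std::uint64_t b, unsigned imm) {
    return rotl64(a ^ b, 64 - imm);
}

// Check that both forms agree for every 6-bit immediate.
bool amounts_agree(std::uint64_t a, std::uint64_t b) {
    for (unsigned imm = 0; imm < 64; ++imm)
        if (xar_rotatert(a, b, imm) != xar_rotate(a, b, imm))
            return false;
    return true;
}
```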
2024-10-11  PR 117048: simplify-rtx: Extend (x << C1) | (X >> C2) --> ROTATE transformation to vector operands  Kyrylo Tkachov  1  -6/+10
In the testcase from patch [2/2] we want to match a vector rotate operation from an IOR of left and right shifts by immediate. simplify-rtx has code for just that, but it looks like it's prepared to handle only scalar operands. In practice most of the code works for vector modes as well, except that the shift amounts are checked to be CONST_INT rather than the vector constants that we have here. This is easily extended by using unwrap_const_vec_duplicate to extract the repeating constant shift amount. With this change combine now tries matching the simpler and expected: (set (reg:V2DI 119 [ _14 ]) (rotate:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ]) (reg:V2DI 116 [ *m1_01_8(D) ])) (const_vector:V2DI [ (const_int 32 [0x20]) repeated x2 ]))) instead of the previous: (set (reg:V2DI 119 [ _14 ]) (ior:V2DI (ashift:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ]) (reg:V2DI 116 [ *m1_01_8(D) ])) (const_vector:V2DI [ (const_int 32 [0x20]) repeated x2 ])) (lshiftrt:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ]) (reg:V2DI 116 [ *m1_01_8(D) ])) (const_vector:V2DI [ (const_int 32 [0x20]) repeated x2 ])))) To actually fix the PR the aarch64 backend needs some adjustment as well, which is done in patch [2/2], which adds the testcase as well. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> PR target/117048 * simplify-rtx.cc (simplify_context::simplify_binary_operation_1): Handle vector constants in (x << C1) | (x >> C2) -> ROTATE simplification.
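The identity being extended here, an IOR of a left shift by C1 and a logical right shift by C2 with C1 + C2 equal to the element width, can be checked per lane on a two-element stand-in for V2DI. `V2DI`, `shift_ior`, `rotate_left_ref` and `all_amounts_agree` are hypothetical names for this sketch; the reference rotate moves bits one at a time so that it does not simply restate the shift form.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Stand-in for a vector of two 64-bit lanes.
using V2DI = std::array<std::uint64_t, 2>;

// (x << c1) | (x >> (64 - c1)) applied to each lane; valid for 0 < c1 < 64.
V2DI shift_ior(const V2DI &x, unsigned c1) {
    V2DI r{};
    for (std::size_t i = 0; i < x.size(); ++i)
        r[i] = (x[i] << c1) | (x[i] >> (64 - c1));
    return r;
}

// Independent reference: rotate left by moving each set bit individually.
std::uint64_t rotate_left_ref(std::uint64_t x, unsigned c) {
    std::uint64_t r = 0;
    for (unsigned bit = 0; bit < 64; ++bit)
        if ((x >> bit) & 1u)
            r |= std::uint64_t{1} << ((bit + c) % 64);
    return r;
}

// Compare both forms for every repeated shift amount from 1 to 63.
bool all_amounts_agree(const V2DI &x) {
    for (unsigned c = 1; c < 64; ++c) {
        V2DI s = shift_ior(x, c);
        for (std::size_t i = 0; i < x.size(); ++i)
            if (s[i] != rotate_left_ref(x[i], c))
                return false;
    }
    return true;
}
```

The amount 32 used in the RTL above is the special case where both shift counts coincide.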
2024-10-11  Fortran: Dead-function removal in error.cc (shrinking by 40%)  Tobias Burnus  1  -717/+0
This patch removes a large number of unused static functions from error.cc, which previously were used for diagnostic but have been replaced by the common diagnostic code. gcc/fortran/ChangeLog: * error.cc (error_char, error_string, error_uinteger, error_integer, error_hwuint, error_hwint, gfc_widechar_display_length, gfc_wide_display_length, error_printf, show_locus, show_loci): Remove unused static functions. (IBUF_LEN, MAX_ARGS): Remove now unused #define.
2024-10-11  match.pd: Fold logarithmic identities.  Jennifer Schmitz  2  -0/+81
This patch implements 4 rules for logarithmic identities in match.pd under -funsafe-math-optimizations: 1) logN(1.0/a) -> -logN(a). This avoids the division instruction. 2) logN(C/a) -> logN(C) - logN(a), where C is a real constant. Same as 1). 3) logN(a) + logN(b) -> logN(a*b). This reduces the number of calls to log function. 4) logN(a) - logN(b) -> logN(a/b). Same as 3). Tests were added for float, double, and long double. The patch was bootstrapped and regtested on aarch64-linux-gnu and x86_64-linux-gnu, no regression. Additionally, SPEC 2017 fprate was run. While the transform does not seem to be triggered, we also see no non-noise impact on performance. OK for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ PR tree-optimization/116826 PR tree-optimization/86710 * match.pd: Fold logN(1.0/a) -> -logN(a), logN(C/a) -> logN(C) - logN(a), logN(a) + logN(b) -> logN(a*b), and logN(a) - logN(b) -> logN(a/b). gcc/testsuite/ PR tree-optimization/116826 PR tree-optimization/86710 * gcc.dg/tree-ssa/log_ident.c: New test.
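The four identities can be checked numerically, and the same sketch hints at why they are only applied under -funsafe-math-optimizations: the two sides agree up to rounding for moderate inputs, but rule 3 can overflow in the product where the sum of logarithms stays finite. The helpers `close`, `identities_hold` and `product_form_overflows` are hypothetical, the tolerance is an arbitrary choice for the sketch, and the code is assumed to be compiled with default IEEE double semantics (no -ffast-math).

```cpp
#include <cassert>
#include <cmath>

// Relative comparison with a tolerance chosen for this sketch.
bool close(double x, double y, double tol = 1e-12) {
    double scale = std::fmax(1.0, std::fmax(std::fabs(x), std::fabs(y)));
    return std::fabs(x - y) <= tol * scale;
}

// Left- and right-hand sides of rules 1) to 4) for positive a, b, c.
bool identities_hold(double a, double b, double c) {
    return close(std::log(1.0 / a), -std::log(a))
        && close(std::log(c / a), std::log(c) - std::log(a))
        && close(std::log(a) + std::log(b), std::log(a * b))
        && close(std::log(a) - std::log(b), std::log(a / b));
}

// Rule 3 is not value-preserving in general: a*b may overflow to infinity
// even though log(a) + log(b) is an ordinary finite number.
bool product_form_overflows() {
    volatile double big = 1e200;  // volatile keeps the arithmetic at run time
    double sum = std::log(big) + std::log(big);
    double folded = std::log(big * big);
    return std::isfinite(sum) && std::isinf(folded);
}
```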
2024-10-11  libstdc++: Use appropriate feature test macro for std::byte  Jonathan Wakely  1  -1/+1
libstdc++-v3/ChangeLog: * include/bits/cpp_type_traits.h (__is_byte<byte>): Guard with __glibcxx_byte macro instead of checking __cplusplus.
2024-10-11  libstdc++: Fix localized %c formatting for <chrono> [PR117085]  Jonathan Wakely  4  -6/+27
When formatting a time point with %c we call std::vformat_to using the formatting locale's D_T_FMT string, but we weren't adding the L option to the format string. This meant we always interpreted D_T_FMT in the C locale, instead of using the formatting locale as obviously intended when %c is used. libstdc++-v3/ChangeLog: PR libstdc++/117085 * include/bits/chrono_io.h (__formatter_chrono::_M_c): Add L option to format string. * testsuite/std/time/format.cc: Move to... * testsuite/std/time/format/format.cc: ...here. * testsuite/std/time/format_localized.cc: Move to... * testsuite/std/time/format/localized.cc: ...here. * testsuite/std/time/format/pr117085.cc: New test.
2024-10-11libstdc++: Add missing whitespace in dg-do directivesJonathan Wakely2-2/+2
libstdc++-v3/ChangeLog: * testsuite/22_locale/time_get/get/char/5.cc: Fix dg-do directive. * testsuite/22_locale/time_get/get/wchar_t/5.cc: Likewise.
2024-10-11tree-optimization/117080 - Add SLP_TREE_MEMORY_ACCESS_TYPERichard Biener4-50/+91
It turns out target costing code looks at STMT_VINFO_MEMORY_ACCESS_TYPE to identify operations from (emulated) gathers for example. This doesn't work for SLP loads since we do not set STMT_VINFO_MEMORY_ACCESS_TYPE there as the vectorization strategy might differ between different stmt uses. It seems we got away with setting it for stores though. The following adds a memory_access_type field to slp_tree and sets it from load and store vectorization code. All the costing doesn't record the SLP node (that was only done selectively for some corner case). The costing is really in need of a big overhaul, the following just massages the two relevant ops to fix gcc.dg/target/pr88531-2[bc].c FAILs when switching on SLP for non-grouped stores. In particular currently we either have an SLP node or a stmt_info in the cost hook but not both. So the following mitigates this, postponing a rewrite of costing to next stage1. Other targets look possibly affected as well but are left to respective maintainers to update. PR tree-optimization/117080 * tree-vectorizer.h (_slp_tree::memory_access_type): Add. (SLP_TREE_MEMORY_ACCESS_TYPE): New. (record_stmt_cost): Add another overload. * tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize memory_access_type. * tree-vect-stmts.cc (vectorizable_store): Set SLP_TREE_MEMORY_ACCESS_TYPE. (vectorizable_load): Likewise. Also record the SLP node when costing emulated gather offset decompose and vector composition. * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Also recognize SLP emulated gather/scatter.
2024-10-11aarch64: Add codegen support for SVE2 faminmaxSaurabh Jha4-0/+147
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_COND_SMAX of the absolute values of the two operands. 2. famin can be expressed as taking UNSPEC_COND_SMIN of the absolute values of the two operands. This fusion of operators is only possible when the -march=armv9-a+faminmax+sve flags are passed. We also need to pass the -ffast-math flag; this is what enables the compiler to use UNSPEC_COND_SMAX and UNSPEC_COND_SMIN. This code generation is only available at -O2 or -O3 as that is when auto-vectorization is enabled. gcc/ChangeLog: * config/aarch64/aarch64-sve2.md (*aarch64_pred_faminmax_fused): Instruction pattern for faminmax codegen. * config/aarch64/iterators.md: Iterator and attribute for faminmax codegen. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/faminmax_1.c: New test. * gcc.target/aarch64/sve/faminmax_2.c: New test.
2024-10-11aarch64: Add SVE2 faminmax intrinsicsSaurabh Jha11-1/+2615
The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension are implemented as the following builtin functions: * sva[max|min]_[m|x|z] * sva[max|min]_[f16|f32|f64]_[m|x|z] * sva[max|min]_n_[f16|f32|f64]_[m|x|z] gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc (svamax): Absolute maximum declaration. (svamin): Absolute minimum declaration. * config/aarch64/aarch64-sve-builtins-base.def (REQUIRED_EXTENSIONS): Add faminmax intrinsics behind a flag. (svamax): Absolute maximum declaration. (svamin): Absolute minimum declaration. * config/aarch64/aarch64-sve-builtins-base.h: Declaring function bases for the new intrinsics. * config/aarch64/aarch64.h (TARGET_SVE_FAMINMAX): New flag for SVE2 faminmax. * config/aarch64/iterators.md: New unspecs, iterators, and attrs for the new intrinsics. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve2/acle/asm/amax_f16.c: New test. * gcc.target/aarch64/sve2/acle/asm/amax_f32.c: New test. * gcc.target/aarch64/sve2/acle/asm/amax_f64.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f16.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f32.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f64.c: New test.
2024-10-11middle-end/117086 - fixup vec_cond simplificationsRichard Biener2-21/+36
The following adds missing checks for a vector type result type to simplifications that end up creating a vec_cond. PR middle-end/117086 * match.pd ((op (vec_cond ...) ..) -> (vec_cond ...)): Add missing checks for VECTOR_TYPE_P (type). * gcc.dg/torture/pr117086.c: New testcase.
2024-10-11RISC-V: Add testcases for form 8 of scalar signed SAT_TRUNCPan Li13-0/+271
Form 8: #define DEF_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_8 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN > x || x >= (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } The tests below pass with this patch. * The rv64gcv full regression test. This is a test-only patch and obvious up to a point; I will commit it directly if there are no comments in the next 48 hours. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_s_trunc-8-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-8-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-8-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-8-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-8-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-8-i64-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-8-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-8-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-8-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-8-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-8-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-run-8-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-11RISC-V: Add testcases for form 7 of scalar signed SAT_TRUNCPan Li13-0/+271
Form 7: #define DEF_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_7 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN >= x || x >= (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } The tests below pass with this patch. * The rv64gcv full regression test. This is a test-only patch and obvious up to a point; I will commit it directly if there are no comments in the next 48 hours. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_s_trunc-7-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-7-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-7-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-7-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-7-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-7-i64-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-7-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-7-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-7-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-7-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-7-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-run-7-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-11RISC-V: Add testcases for form 6 of scalar signed SAT_TRUNCPan Li13-0/+271
Form 6: #define DEF_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_6 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN >= x || x > (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } The tests below pass with this patch. * The rv64gcv full regression test. This is a test-only patch and obvious up to a point; I will commit it directly if there are no comments in the next 48 hours. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_s_trunc-6-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-6-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-6-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-6-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-6-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-6-i64-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-6-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-6-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-6-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-6-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-6-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-run-6-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-11RISC-V: Add testcases for form 5 of scalar signed SAT_TRUNCPan Li13-0/+271
Form 5: #define DEF_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_5 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN > x || x > (WT)NT_MAX \ ? x < 0 ? NT_MIN : NT_MAX \ : trunc; \ } The tests below pass with this patch. * The rv64gcv full regression test. This is a test-only patch and obvious up to a point; I will commit it directly if there are no comments in the next 48 hours. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_s_trunc-5-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-5-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-5-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-5-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-5-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-5-i64-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-5-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-5-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-5-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-5-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-5-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-run-5-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
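Forms 5 through 8 differ only in whether each boundary comparison is strict. At the boundary values themselves the truncated result already equals NT_MIN or NT_MAX, so all four should compute the same saturating truncation. A minimal C sketch that checks this exhaustively for int16_t to int8_t; the DEF_FORM macro, clamp_ref, and count_form_mismatches are hypothetical names introduced for illustration and are not part of sat_arith.h:

```c
#include <stdint.h>

/* Hypothetical helper macro: one int16_t -> int8_t function per form,
   differing only in the two comparison operators.  */
#define DEF_FORM(N, LO_OP, HI_OP)                                   \
  static int8_t form_##N (int16_t x)                                \
  {                                                                 \
    int8_t trunc = (int8_t) x;                                      \
    return (int16_t) INT8_MIN LO_OP x || x HI_OP (int16_t) INT8_MAX \
      ? x < 0 ? INT8_MIN : INT8_MAX                                 \
      : trunc;                                                      \
  }

DEF_FORM (5, >, >)
DEF_FORM (6, >=, >)
DEF_FORM (7, >=, >=)
DEF_FORM (8, >, >=)

/* Reference: clamp to the int8_t range.  */
static int8_t clamp_ref (int16_t x)
{
  if (x < INT8_MIN)
    return INT8_MIN;
  if (x > INT8_MAX)
    return INT8_MAX;
  return (int8_t) x;
}

/* Number of int16_t inputs on which any form disagrees with the clamp.  */
int count_form_mismatches (void)
{
  int bad = 0;
  for (int v = INT16_MIN; v <= INT16_MAX; v++)
    {
      int16_t x = (int16_t) v;
      int8_t want = clamp_ref (x);
      if (form_5 (x) != want || form_6 (x) != want
          || form_7 (x) != want || form_8 (x) != want)
        bad++;
    }
  return bad;
}
```

A return value of 0 from count_form_mismatches means the four spellings are interchangeable for this type pair, which is what allows a single SAT_TRUNC operation to stand in for all of them.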
2024-10-11RISC-V: Add testcases for form 4 of scalar signed SAT_TRUNCPan Li13-0/+271
Form 4: #define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN <= x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_s_trunc-4-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-4-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-4-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-4-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-4-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-4-i64-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-4-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-4-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-4-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-4-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-4-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-run-4-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-11Match: Support form 4 for scalar signed integer SAT_TRUNCPan Li1-0/+1
This patch would like to support the form 4 of the scalar signed integer SAT_TRUNC. Aka below example: Form 4: #define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN <= x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } DEF_SAT_S_TRUNC_FMT_4(int8_t, int16_t, INT8_MIN, INT8_MAX) Before this patch: 4 │ __attribute__((noinline)) 5 │ int8_t sat_s_trunc_int16_t_to_int8_t_fmt_4 (int16_t x) 6 │ { 7 │ int8_t trunc; 8 │ unsigned short x.0_1; 9 │ unsigned short _2; 10 │ int8_t _3; 11 │ _Bool _7; 12 │ signed char _8; 13 │ signed char _9; 14 │ signed char _10; 15 │ 16 │ ;; basic block 2, loop depth 0 17 │ ;; pred: ENTRY 18 │ x.0_1 = (unsigned short) x_4(D); 19 │ _2 = x.0_1 + 128; 20 │ if (_2 > 254) 21 │ goto <bb 4>; [50.00%] 22 │ else 23 │ goto <bb 3>; [50.00%] 24 │ ;; succ: 4 25 │ ;; 3 26 │ 27 │ ;; basic block 3, loop depth 0 28 │ ;; pred: 2 29 │ trunc_5 = (int8_t) x_4(D); 30 │ goto <bb 5>; [100.00%] 31 │ ;; succ: 5 32 │ 33 │ ;; basic block 4, loop depth 0 34 │ ;; pred: 2 35 │ _7 = x_4(D) < 0; 36 │ _8 = (signed char) _7; 37 │ _9 = -_8; 38 │ _10 = _9 ^ 127; 39 │ ;; succ: 5 40 │ 41 │ ;; basic block 5, loop depth 0 42 │ ;; pred: 3 43 │ ;; 4 44 │ # _3 = PHI <trunc_5(3), _10(4)> 45 │ return _3; 46 │ ;; succ: EXIT 47 │ 48 │ } After this patch: 4 │ __attribute__((noinline)) 5 │ int8_t sat_s_trunc_int16_t_to_int8_t_fmt_4 (int16_t x) 6 │ { 7 │ int8_t _3; 8 │ 9 │ ;; basic block 2, loop depth 0 10 │ ;; pred: ENTRY 11 │ _3 = .SAT_TRUNC (x_4(D)); [tail call] 12 │ return _3; 13 │ ;; succ: EXIT 14 │ 15 │ } The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * match.pd: Add case 4 matching pattern for signed SAT_TRUNC. Signed-off-by: Pan Li <pan2.li@intel.com>
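The "before" dump shows two tricks that the new match.pd pattern has to see through: a single unsigned comparison standing in for both signed range checks, and a branch-free computation of the saturated value. A minimal C sketch, transcribed by hand from the dump for int16_t to int8_t; the function names are hypothetical:

```c
#include <stdint.h>

/* Source-level Form 4, written out for int16_t -> int8_t.  */
int8_t form_4 (int16_t x)
{
  int8_t trunc = (int8_t) x;
  return (int16_t) INT8_MIN <= x && x < (int16_t) INT8_MAX
    ? trunc
    : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Hand transcription of the "before" dump.  */
int8_t form_4_lowered (int16_t x)
{
  /* bb 2: adding 128 maps the in-range values -128..126 onto 0..254,
     so one unsigned compare replaces both signed range checks.  */
  uint16_t biased = (uint16_t) ((uint16_t) x + 128u);
  if (biased > 254)
    {
      /* bb 4: -(x < 0) is 0 or -1; XOR with 127 gives 127 or -128.  */
      int8_t neg = (int8_t) (x < 0);
      return (int8_t) (-neg ^ 127);
    }
  /* bb 3: in range, plain truncation.  */
  return (int8_t) x;
}

/* Number of int16_t inputs on which the two versions disagree.  */
int count_lowering_mismatches (void)
{
  int bad = 0;
  for (int v = INT16_MIN; v <= INT16_MAX; v++)
    if (form_4 ((int16_t) v) != form_4_lowered ((int16_t) v))
      bad++;
  return bad;
}
```

The input 127 takes the saturating path here even though it fits in int8_t; the result is still 127, so the strict upper comparison in Form 4 does not change the outcome.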
2024-10-11RISC-V: Add testcases for form 3 of scalar signed SAT_TRUNCPan Li13-0/+271
Form 3: #define DEF_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_3 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN < x && x <= (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_s_trunc-3-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-3-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-3-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-3-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-3-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-3-i64-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-3-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-3-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-3-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-3-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-3-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-run-3-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-11Match: Support form 3 for scalar signed integer SAT_TRUNCPan Li1-0/+3
This patch would like to support the form 3 of the scalar signed integer SAT_TRUNC. Aka below example: Form 3: #define DEF_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_3 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN < x && x <= (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } DEF_SAT_S_TRUNC_FMT_3(int8_t, int16_t, INT8_MIN, INT8_MAX) Before this patch: 4 │ __attribute__((noinline)) 5 │ int8_t sat_s_sub_int8_t_fmt_3 (int8_t x, int8_t y) 6 │ { 7 │ signed char _1; 8 │ signed char _2; 9 │ int8_t _3; 10 │ __complex__ signed char _6; 11 │ _Bool _8; 12 │ signed char _9; 13 │ signed char _10; 14 │ signed char _11; 15 │ 16 │ ;; basic block 2, loop depth 0 17 │ ;; pred: ENTRY 18 │ _6 = .SUB_OVERFLOW (x_4(D), y_5(D)); 19 │ _2 = IMAGPART_EXPR <_6>; 20 │ if (_2 != 0) 21 │ goto <bb 4>; [50.00%] 22 │ else 23 │ goto <bb 3>; [50.00%] 24 │ ;; succ: 4 25 │ ;; 3 26 │ 27 │ ;; basic block 3, loop depth 0 28 │ ;; pred: 2 29 │ _1 = REALPART_EXPR <_6>; 30 │ goto <bb 5>; [100.00%] 31 │ ;; succ: 5 32 │ 33 │ ;; basic block 4, loop depth 0 34 │ ;; pred: 2 35 │ _8 = x_4(D) < 0; 36 │ _9 = (signed char) _8; 37 │ _10 = -_9; 38 │ _11 = _10 ^ 127; 39 │ ;; succ: 5 40 │ 41 │ ;; basic block 5, loop depth 0 42 │ ;; pred: 3 43 │ ;; 4 44 │ # _3 = PHI <_1(3), _11(4)> 45 │ return _3; 46 │ ;; succ: EXIT 47 │ 48 │ } After this patch: 4 │ __attribute__((noinline)) 5 │ int8_t sat_s_trunc_int16_t_to_int8_t_fmt_3 (int16_t x) 6 │ { 7 │ int8_t _3; 8 │ 9 │ ;; basic block 2, loop depth 0 10 │ ;; pred: ENTRY 11 │ _3 = .SAT_TRUNC (x_4(D)); [tail call] 12 │ return _3; 13 │ ;; succ: EXIT 14 │ 15 │ } The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * match.pd: Add case 3 matching pattern for signed SAT_TRUNC. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-11RISC-V: Add testcases for form 2 of scalar signed SAT_TRUNCPan Li13-0/+271
Form 2: #define DEF_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_2 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN < x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } The tests below pass with this patch. * The rv64gcv full regression test. This is a test-only patch and obvious up to a point; I will commit it directly if there are no comments in the next 48 hours. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_s_trunc-2-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-2-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-2-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-2-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-2-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-2-i64-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-2-i16-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-2-i32-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-2-i32-to-i8.c: New test. * gcc.target/riscv/sat_s_trunc-run-2-i64-to-i16.c: New test. * gcc.target/riscv/sat_s_trunc-run-2-i64-to-i32.c: New test. * gcc.target/riscv/sat_s_trunc-run-2-i64-to-i8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-11Match: Support form 2 for scalar signed integer SAT_TRUNCPan Li1-8/+13
This patch would like to support the form 2 of the scalar signed integer SAT_TRUNC. Aka below example: Form 2: #define DEF_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \ NT __attribute__((noinline)) \ sat_s_trunc_##WT##_to_##NT##_fmt_2 (WT x) \ { \ NT trunc = (NT)x; \ return (WT)NT_MIN < x && x < (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } DEF_SAT_S_TRUNC_FMT_2(int8_t, int16_t, INT8_MIN, INT8_MAX) Before this patch: 4 │ __attribute__((noinline)) 5 │ int8_t sat_s_trunc_int16_t_to_int8_t_fmt_2 (int16_t x) 6 │ { 7 │ int8_t trunc; 8 │ unsigned short x.0_1; 9 │ unsigned short _2; 10 │ int8_t _3; 11 │ _Bool _7; 12 │ signed char _8; 13 │ signed char _9; 14 │ signed char _10; 15 │ 16 │ ;; basic block 2, loop depth 0 17 │ ;; pred: ENTRY 18 │ x.0_1 = (unsigned short) x_4(D); 19 │ _2 = x.0_1 + 127; 20 │ if (_2 > 253) 21 │ goto <bb 4>; [50.00%] 22 │ else 23 │ goto <bb 3>; [50.00%] 24 │ ;; succ: 4 25 │ ;; 3 26 │ 27 │ ;; basic block 3, loop depth 0 28 │ ;; pred: 2 29 │ trunc_5 = (int8_t) x_4(D); 30 │ goto <bb 5>; [100.00%] 31 │ ;; succ: 5 32 │ 33 │ ;; basic block 4, loop depth 0 34 │ ;; pred: 2 35 │ _7 = x_4(D) < 0; 36 │ _8 = (signed char) _7; 37 │ _9 = -_8; 38 │ _10 = _9 ^ 127; 39 │ ;; succ: 5 40 │ 41 │ ;; basic block 5, loop depth 0 42 │ ;; pred: 3 43 │ ;; 4 44 │ # _3 = PHI <trunc_5(3), _10(4)> 45 │ return _3; 46 │ ;; succ: EXIT 47 │ 48 │ } After this patch: 4 │ __attribute__((noinline)) 5 │ int8_t sat_s_trunc_int16_t_to_int8_t_fmt_2 (int16_t x) 6 │ { 7 │ int8_t _3; 8 │ 9 │ ;; basic block 2, loop depth 0 10 │ ;; pred: ENTRY 11 │ _3 = .SAT_TRUNC (x_4(D)); [tail call] 12 │ return _3; 13 │ ;; succ: EXIT 14 │ 15 │ } The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * match.pd: Add case 2 matching pattern for signed SAT_TRUNC. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-11i386: Fix up spaceship expanders for -mtune=i[45]86 [PR117053]Jakub Jelinek2-38/+113
The adjusted and new spaceship expanders ICE with -mtune=i486 or -mtune=i586. The problem is that in that case TARGET_ZERO_EXTEND_WITH_AND is true and zero_extendqisi2 isn't allowed in that case, and we can't use the replacement AND, because that clobbers flags and we want to use them again. The following patch fixes that by using in those cases roughly what we want to expand it to after peephole2 optimizations, i.e. xor before the comparison, *setcc_qi_slp and sbbl $0 (or for signed int case xoring of 2 regs, two *setcc_qi_slp, subl). For *setcc_qi_slp, it uses the setcc_si_slp hacks with UNSPEC that were in use for the floating point jp case (so such code is IMHO undesirable for the !TARGET_ZERO_EXTEND_WITH_AND case as we want to give combiner more liberty in that case). 2024-10-11 Jakub Jelinek <jakub@redhat.com> PR target/117053 * config/i386/i386-expand.cc (ix86_expand_fp_spaceship): Handle TARGET_ZERO_EXTEND_WITH_AND differently. (ix86_expand_int_spaceship): Likewise. * g++.target/i386/pr116896-3.C: New test.
2024-10-11tree-optimization/117050 - fix ICE with non-grouped .MASK_LOAD SLPRichard Biener2-1/+20
The following temporarily reverts the support of permuted .MASK_LOAD for the case of non-grouped accesses. PR tree-optimization/117050 * tree-vect-slp.cc (vect_build_slp_tree_2): Do not support permutes of non-grouped .MASK_LOAD. * gcc.dg/vect/pr117050.c: New testcase.
2024-10-11libstdc++: Fix some test failures with -fno-char8_tJonathan Wakely2-2/+9
libstdc++-v3/ChangeLog: * testsuite/20_util/duration/io.cc [!__cpp_lib_char8_t]: Define char8_t as a typedef for unsigned char. * testsuite/std/format/parse_ctx_neg.cc: Skip for -fno-char8_t.
2024-10-11Fix possible wrong-code with masked store-lanesRichard Biener1-10/+20
When we're doing masked store-lanes one mask element applies to all loads of one struct element. This requires uniform masks for all of the SLP lanes, something we already compute into STMT_VINFO_SLP_VECT_ONLY but fail to check when doing SLP store-lanes. The following corrects this. The following also adjusts the store-lane heuristic to properly check for masked or non-masked optab support. * tree-vect-slp.cc (vect_slp_prefer_store_lanes_p): Allow passing in of vectype, pass in whether the stores are masked and query the correct optab. (vect_build_slp_instance): Guard store-lanes query with ! STMT_VINFO_SLP_VECT_ONLY, guaranteeing a uniform mask.
2024-10-11i386: Fix some patterns's mem attribute.Hu, Lin11-10/+12
Hi all, this is another patch to change some patterns' type attr from ssemov to ssemov2. Some ssemov patterns' mem attr should be load when their operand 2 is a memory operand. Bootstrapped and regtested on x86-64-linux-pc, OK for trunk? BRs, Lin gcc/ChangeLog: * config/i386/sse.md (sse_movhlps): Change type attr from ssemov to ssemov2. (sse_loadhps): Ditto. (*vec_concat<mode>): Ditto. (vec_setv2df_0): Ditto. (sse_loadlps): Change attr from ssemov to ssemov2 except for 2, 3. (sse2_loadhps): Change attr from ssemov to ssemov2 except for 0, 1. (sse2_loadlpd): Change attr from ssemov to ssemov2 except for 0, 1, 2. (sse2_movsd_<mode>): Change attr from ssemov to ssemov2 except for 5. (vec_concatv2df): Change attr from ssemov to ssemov2 except for 0, 1, 2. (*vec_concat<mode>): Change attr from ssemov to ssemov2 for 3, 4. (vec_concatv2di): Change attr from ssemov to ssemov2 except for 0, 1, 2, 3, 4, 5.
2024-10-11Daily bump.GCC Administrator5-1/+153
2024-10-10aarch64: Alter pr116258.c test to correct for big endian.Richard Ball1-1/+2
The test at pr116258.c fails on big-endian targets because it checks that the index of a floating point multiply is 0, which is correct only for little endian. gcc/testsuite/ChangeLog: PR tree-optimization/116258 * gcc.target/aarch64/pr116258.c: Alter test to add big-endian support.
2024-10-10Fix PR116650: check all regs in regrename targetsMichael Matz1-6/+19
(this came up for m68k vs. LRA, but is a generic problem) Regrename wants to use new registers for certain def-use chains. For validity of replacements it needs to check that the selected candidates are unused up to then. That's done in check_new_reg_p. But if it so happens that the new register needs more hardregs than the old register (which happens if the target allows inter-bank moves and the mode is something like a DFmode that needs to be placed into a SImode reg-pair), then check_new_reg_p only checks the first of those registers for free-ness. This is caused by that function looking up the number of necessary hardregs only in terms of the old hardreg number. It of course needs to do that in terms of the new candidate regnumber. The symptom is that regrename sometimes clobbers the higher numbered registers of such a regrename target pair. This patch fixes that problem. (In the particular case of the bug report it was LRA that left over an inter-bank move instruction that triggers regrename, ultimately causing the mis-compile. Reload didn't do that, but in general we of course can't rely on such moves not happening if the target allows them.) This also shows a general confusion in that function and the target hook interface here: for (i = nregs - 1; i >= 0; --i) ... || ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i)) it uses nregs in a way that requires it to be the same between old and new register. The problem is that the target hook only gets register numbers, when it instead should get a mode and register numbers and would be called only for the first but not for subsequent registers. I've looked at a number of definitions of that target hook and I think that this is currently harmless in the sense that it would merely rule out some potential reg-renames that would in fact be okay to do. So I'm not changing the target hook interface here and hence that problem remains unfixed.
PR rtl-optimization/116650 * regrename.cc (check_new_reg_p): Calculate nregs in terms of the new candidate register.
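The regrename bug can be modelled without any GCC internals. The following C sketch assumes a made-up target with two register banks of different widths; toy_nregs, toy_check_buggy, and toy_check_fixed are hypothetical names and only mimic the shape of check_new_reg_p, not its actual code:

```c
#include <stdbool.h>

enum { NUM_REGS = 16 };

/* Hypothetical target: registers 0..7 are 4 bytes wide and registers
   8..15 are 8 bytes wide, so an 8-byte value needs two of the former
   but only one of the latter.  */
static int toy_nregs (int regno, int size_bytes)
{
  int reg_bytes = regno < 8 ? 4 : 8;
  return (size_bytes + reg_bytes - 1) / reg_bytes;
}

/* Models the old behavior: the register count is derived from the OLD
   register, so a wider candidate range is only partly checked.  */
bool toy_check_buggy (const bool in_use[NUM_REGS], int old_reg,
                      int new_reg, int size_bytes)
{
  int nregs = toy_nregs (old_reg, size_bytes);
  if (new_reg + nregs > NUM_REGS)
    return false;
  for (int i = nregs - 1; i >= 0; --i)
    if (in_use[new_reg + i])
      return false;
  return true;
}

/* Models the fix: the register count is derived from the NEW candidate
   register; old_reg is kept only so both functions share a signature.  */
bool toy_check_fixed (const bool in_use[NUM_REGS], int old_reg,
                      int new_reg, int size_bytes)
{
  (void) old_reg;
  int nregs = toy_nregs (new_reg, size_bytes);
  if (new_reg + nregs > NUM_REGS)
    return false;
  for (int i = nregs - 1; i >= 0; --i)
    if (in_use[new_reg + i])
      return false;
  return true;
}
```

With register 3 live, renaming an 8-byte value from register 8 to register 2 is accepted by the buggy check, which looks at register 2 only, and rejected by the fixed one, which also looks at register 3. That is the clobbering of the higher-numbered register described in the commit message.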