aboutsummaryrefslogtreecommitdiff
path: root/gcc
AgeCommit message (Collapse)AuthorFilesLines
2024-10-19Fortran: Fix translatability of diagnostic stringsTobias Burnus5-26/+26
gcc/fortran/ChangeLog: * check.cc (is_c_interoperable): Use _(...) around to mark strings as translatable. * data.cc (gfc_assign_data_value): Move string literal to gfc_error to make it translatable. * resolve.cc (resolve_fl_variable, resolve_equivalence): Use G_(...) around string literals. * scanner.cc (skip_fixed_omp_sentinel): Replace '...' by %<...%>. * trans-openmp.cc (gfc_split_omp_clauses, gfc_trans_omp_declare_variant): Likewise.
2024-10-19Fortran: Add range-based diagnosticTobias Burnus30-306/+343
GCC's diagnostic engine gained a while ago support for ranges, i.e. instead of pointing at a single character '^', it can also have a '~~~~^~~~~~' range. This patch adds support for this and adds 9 users for it, which covers the most common cases. A single '^' can be still useful. Some location data in gfortran is rather bad - often the matching pattern includes whitespace such that the before or after location points to the beginning/end of the whitespace, which can be far of especially when comments and/or continuation lines are involed. Otherwise, often a '^' still sufficient, albeit wrong location data only becomes obvious once starting to use ranges. The 'locus' is extended to support two ways to store the data; hereby gfc_current_locus always contains the old format (at least during parsing) and gfc_current_locus shall not be used in trans*.cc. The latter permits a nice cleanup to just use input_location. Otherwise, the new format is only used when switching to ranges. The only reason to convert from location_t to locus occurs in trans*.cc for the gfc_error (etc.) diagnostic and for gfc_trans_runtime_check; there are 5 currently 5 such cases. For gfc_* diagnostic, we could think of another letter besides %L or a modifier like '%lL', if deemed useful. In any case, the new format is just: locus->u.location = linemap_position_for_loc_and_offset (line_table, loc->u.lb->location, loc->nextc - loc->u.lb->line); locus->nextc = (gfc_char_t *) -1; /* Marker for new format. */ i.e. using the existing location_t location in in the linebuffer (which points to column 0) and add as offset the actually used column number. As location_t handles ranges, we just use it also to store them via: location = make_location (caret, begin, end) There are a few convenience macros/functions but that's all. Alongside, a few minor fixes were done: linemap_location_before_p replaces a line-number based comparison, which does not handle multiple statements in the same line that ';' allows for. gcc/fortran/ChangeLog: * data.cc (gfc_assign_data_value): Use linemap_location_before_p and GFC_LOCUS_IS_SET. * decl.cc (gfc_verify_c_interop_param): Make better translatable. (build_sym, variable_decl, gfc_match_formal_arglist, gfc_match_subroutine): Add range-based locations, use it in diagnostic and gobble whitespace for better locations. * error.cc (gfc_get_location_with_offset): Handle new format. (gfc_get_location_range): New. * expr.cc (gfc_check_assign): Use GFC_LOCUS_IS_SET. * frontend-passes.cc (check_locus_code, check_locus_expr): Likewise. (runtime_error_ne): Use GFC_LOCUS_IS_SET. * gfortran.h (locus): Change lb to union with lb and location. (GFC_LOCUS_IS_SET): Define. (gfc_get_location_range): New prototype. (gfc_new_symbol, gfc_get_symbol, gfc_get_sym_tree, gfc_get_ha_symbol, gfc_get_ha_sym_tree): Take optional locus argument. * io.cc (io_constraint): Use GFC_LOCUS_IS_SET. * match.cc (gfc_match_sym_tree): Use range locus. * openmp.cc (gfc_match_omp_variable_list, gfc_match_omp_doacross_sink): Likewise. * parse.cc (next_free): Update for locus struct change. * primary.cc (gfc_match_varspec): Likewise. (match_variable): Use range locus. * resolve.cc (find_array_spec): Use GFC_LOCUS_IS_SET. * scanner.cc (gfc_at_eof, gfc_at_bol, gfc_start_source_files, gfc_advance_line, gfc_define_undef_line, skip_fixed_comments, gfc_gobble_whitespace, include_stmt, gfc_new_file): Update for locus struct change. * symbol.cc (gfc_new_symbol, gfc_get_sym_tree, gfc_get_symbol, gfc_get_ha_sym_tree, gfc_get_ha_symbol): Take optional locus. * trans-array.cc (gfc_trans_array_constructor_value): Use %L not %C. (gfc_trans_g77_array, gfc_trans_dummy_array_bias, gfc_trans_class_array, gfc_trans_deferred_array): Replace gfc_{save,set,restore}_backend_locus by directly using input_location. * trans-common.cc (build_equiv_decl, get_init_field): Likewise. * trans-decl.cc (gfc_get_extern_function_decl, build_function_decl, build_entry_thunks, gfc_null_and_pass_deferred_len, gfc_trans_deferred_vars, gfc_trans_use_stmts, finish_oacc_declare, gfc_generate_block_data): Likewise. * trans-expr.cc (gfc_copy_class_to_class, gfc_conv_expr): Changes to avoid gfc_current_locus. * trans-io.cc (set_error_locus): Likewise. * trans-openmp.cc (gfc_trans_omp_workshare): Use input_locus directly. * trans-stmt.cc (gfc_trans_if_1): Likewise and use GFC_LOCUS_IS_SET. * trans-types.cc (gfc_get_union_type, gfc_get_derived_type): Likewise. * trans.cc (gfc_locus_from_location): New. (trans_runtime_error_vararg, gfc_trans_runtime_check): Use location_t for file + line data. (gfc_current_backend_file, gfc_save_backend_locus, gfc_set_backend_locus, gfc_restore_backend_locus): Remove. (trans_code): Use input_location directly, don't set gfc_current_locus. * trans.h (gfc_save_backend_locus, gfc_set_backend_locus, gfc_restore_backend_locus): Remove prototypes. (gfc_locus_from_location): Add prototype. gcc/testsuite/ChangeLog: * gfortran.dg/bounds_check_25.f90: Update expected column in the diagnostic. * gfortran.dg/goacc/pr92793-1.f90: Likewise. * gfortran.dg/gomp/allocate-14.f90: Likewise. * gfortran.dg/gomp/polymorphic-mapping.f90: Likewise. * gfortran.dg/gomp/reduction5.f90: Likewise. * gfortran.dg/gomp/reduction6.f90: Likewise.
2024-10-19Fix an ICE with UNSIGNED in match_sym_complex_part.Thomas Koenig2-0/+9
gcc/fortran/ChangeLog: PR fortran/117225 * primary.cc (match_sym_complex_part): An UNSIGNED in a complex part is an error. gcc/testsuite/ChangeLog: PR fortran/117225 * gfortran.dg/unsigned_38.f90: New test.
2024-10-18runtime/testdata: fix for C23 nullptr keywordIan Lance Taylor1-1/+1
Backport https://go.dev/cl/620955 from main repo. Original description: src/runtime/testdata/testprogcgo/threadprof.go contains C code with a variable called nullptr. This conflicts with the nullptr keyword in the C23 revision of the C standard (showing up as gccgo test build failures when updating GCC to use C23 by default when building C code). Rename that variable to nullpointer to avoid the clash with the keyword (any other name that's not a keyword would work just as well). Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/621059
2024-10-18diagnostics: remove forward decl of json::value from diagnostic.hDavid Malcolm1-1/+0
I believe this hasn't been necessary since r15-1413-gd3878c85f331c7. gcc/ChangeLog: * diagnostic.h (json::value): Remove forward decl. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-10-18diagnostics: add debug dump functionsDavid Malcolm11-14/+160
This commit expands on r15-3973-g4c7a58ac2617e2, which added debug "dump" member functiosn to pretty_printer and output_buffer. This followup adds "dump" member functions to diagnostic_context and diagnostic_format, extends the existing dump functions and adds indentation to make it much easier to see the various relationships between context, format, printer, etc. Hence you can now do: (gdb) call global_dc->dump () and get a useful summary of what the diagnostic subsystem is doing; for example: (gdb) call global_dc->dump() diagnostic_context: counts: output format: sarif_output_format printer: m_show_color: false m_url_format: bel m_buffer: m_formatted_obstack current object: length 0: m_chunk_obstack current object: length 0: pp_formatted_chunks: depth 0 0: TEXT("Function ")] 1: BEGIN_QUOTE, TEXT("program"), END_QUOTE] 2: TEXT(" requires an argument list at ")] 3: TEXT("(1)")] showing the counts of all diagnostic kind that are non-zero (none yet), that we have a sarif output format, and the printer is part-way through formatting a string. gcc/ChangeLog: * diagnostic-format-json.cc (json_output_format::dump): New. * diagnostic-format-sarif.cc (sarif_output_format::dump): New. (sarif_file_output_format::dump): New. * diagnostic-format-text.cc (diagnostic_text_output_format::dump): New. * diagnostic-format-text.h (diagnostic_text_output_format::dump): New decl. * diagnostic-format.h (diagnostic_output_format::dump): New decls. * diagnostic.cc (diagnostic_context::dump): New. (diagnostic_output_format::dump): New. * diagnostic.h (diagnostic_context::dump): New decls. * pretty-print-format-impl.h (pp_formatted_chunks::dump): Add "indent" param. * pretty-print.cc (bytes_per_hexdump_line): New constant. (print_hexdump_line): New. (print_hexdump): New. (output_buffer::dump): Add "indent" param and use it. Add hexdump of current object in m_formatted_obstack and m_chunk_obstack. (pp_formatted_chunks::dump): Add "indent" param and use it. (pretty_printer::dump): Likewise. Add dumping of m_show_color and m_url_format. * pretty-print.h (output_buffer::dump): Add "indent" param. (pretty_printer::dump): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic_plugin_xhtml_format.c (xhtml_output_format::dump): New. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-10-19c: Fix -std=gnu23 -Wtraditional for () in function definitionsJoseph Myers4-2/+29
We don't yet have clear agreement on removing -Wtraditional (although it seems there is little to no use for most of the warnings therein), so fix the bug in its interaction with -std=gnu23 to continue progress on making -std=gnu23 the default while -Wtraditional remains under discussion. The warning for ISO C function definitions with -Wtraditional properly covers (void), but also wrongly warned for () in C23 mode as that has the same semantics as (void) in that case. Keep track in c_arg_info of when () was converted to (void) for C23 so that -Wtraditional can avoid warning in that case (with an appropriate comment on the definition of the new field to make clear it can be removed along with -Wtraditional). Bootstrapped with no regressions for x86_64-pc-linux-gnu. gcc/c/ * c-tree.h (c_arg_info): Add c23_empty_parens. * c-decl.cc (grokparms): Set c23_empty_parens. (build_arg_info): Clear c23_empty_parens. (store_parm_decls_newstyle): Do not give -Wtraditional warning for ISO C function definition if c23_empty_parens. gcc/testsuite/ * gcc.dg/wtr-gnu17-1.c, gcc.dg/wtr-gnu23-1.c: New tests.
2024-10-19Daily bump.GCC Administrator8-1/+2440
2024-10-18gcc/: Merge definitions of array_type_nelts_topAlejandro Colomar6-29/+14
There were two identical definitions, and none of them are available where they are needed for implementing a number-of-elements-of operator. Merge them, and provide the single definition in gcc/tree.{h,cc}, where it's available for that operator, which will be added in a following commit. gcc/ChangeLog: * tree.h (array_type_nelts_top) * tree.cc (array_type_nelts_top): Define function (moved from gcc/cp/). gcc/cp/ChangeLog: * cp-tree.h (array_type_nelts_top) * tree.cc (array_type_nelts_top): Remove function (move to gcc/). gcc/rust/ChangeLog: * backend/rust-tree.h (array_type_nelts_top) * backend/rust-tree.cc (array_type_nelts_top): Remove function. Signed-off-by: Alejandro Colomar <alx@kernel.org>
2024-10-18gcc/: Rename array_type_nelts => array_type_nelts_minus_oneAlejandro Colomar14-28/+30
The old name was misleading. While at it, also rename some temporary variables that are used with this function, for consistency. Link: <https://inbox.sourceware.org/gcc-patches/9fffd80-dca-2c7e-14b-6c9b509a7215@redhat.com/T/#m2f661c67c8f7b2c405c8c7fc3152dd85dc729120> gcc/ChangeLog: * tree.cc (array_type_nelts, array_type_nelts_minus_one) * tree.h (array_type_nelts, array_type_nelts_minus_one) * expr.cc (count_type_elements) * config/aarch64/aarch64.cc (pure_scalable_type_info::analyze_array) * config/i386/i386.cc (ix86_canonical_va_list_type): Rename array_type_nelts => array_type_nelts_minus_one The old name was misleading. gcc/c/ChangeLog: * c-decl.cc (one_element_array_type_p, get_parm_array_spec) * c-fold.cc (c_fold_array_ref): Rename array_type_nelts => array_type_nelts_minus_one gcc/cp/ChangeLog: * decl.cc (reshape_init_array) * init.cc (build_zero_init_1) (build_value_init_noctor) (build_vec_init) (build_delete) * lambda.cc (add_capture) * tree.cc (array_type_nelts_top): Rename array_type_nelts => array_type_nelts_minus_one gcc/fortran/ChangeLog: * trans-array.cc (structure_alloc_comps) * trans-openmp.cc (gfc_walk_alloc_comps) (gfc_omp_clause_linear_ctor): Rename array_type_nelts => array_type_nelts_minus_one gcc/rust/ChangeLog: * backend/rust-tree.cc (array_type_nelts_top): Rename array_type_nelts => array_type_nelts_minus_one Suggested-by: Richard Biener <richard.guenther@gmail.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
2024-10-18hppa: Fix up pa.opt.urlsJohn David Anglin1-0/+2
2024-10-18 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: * config/pa/pa.opt.urls: Fix for -mlra.
2024-10-18Handle GFC_STD_UNSIGNED like a standard in error messages.Thomas Koenig2-0/+6
gcc/fortran/ChangeLog: * error.cc (notify_std_msg): Handle GFC_STD_UNSIGNED. gcc/testsuite/ChangeLog: * gfortran.dg/unsigned_37.f90: New test.
2024-10-18hppa: Add LRA supportJohn David Anglin3-38/+66
LRA is not enabled as default since there are some new test fails remaining to resolve. 2024-10-18 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: PR target/113933 * config/pa/pa.cc (pa_use_lra_p): Declare. (TARGET_LRA_P): Change define to pa_use_lra_p. (pa_use_lra_p): New function. (legitimize_pic_address): Also check lra_in_progress. (pa_emit_move_sequence): Likewise. (pa_legitimate_constant_p): Likewise. (pa_legitimate_address_p): Likewise. (pa_secondary_reload): For floating-point loads and stores, return NO_REGS for REG and SUBREG operands. Return GENERAL_REGS for some shift register spills. * config/pa/pa.opt: Add mlra option. * config/pa/predicates.md (integer_store_memory_operand): Also check lra_in_progress. (floating_point_store_memory_operand): Likewise. (reg_before_reload_operand): Likewise.
2024-10-18[PATCH 3/7] RISC-V: Fix vector memcpy smaller LMUL generationCraig Blackmore4-5/+92
If riscv_vector::expand_block_move is generating a straight-line memcpy using a predicated store, it tries to use a smaller LMUL to reduce register pressure if it still allows an entire transfer. This happens in the inner loop of riscv_vector::expand_block_move, however, the vmode chosen by this loop gets overwritten later in the function, so I have added the missing break from the outer loop. I have also addressed a couple of issues with the conditions of the if statement within the inner loop. The first condition did not make sense to me: ``` TARGET_MIN_VLEN * lmul <= nunits * BITS_PER_UNIT ``` I think this was supposed to be checking that the length fits within the given LMUL, so I have changed it to do that. The second condition: ``` /* Avoid loosing the option of using vsetivli . */ && (nunits <= 31 * lmul || nunits > 31 * 8) ``` seems to imply that lmul affects the range of AVL immediate that vsetivli can take but I don't think that is correct. Anyway, I don't think this condition is necessary because if we find a suitable mode we should stick with it, regardless of whether it allowed vsetivli, rather than continuing to try larger lmul which would increase register pressure or smaller potential_ew which would increase AVL. I have removed this condition. gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Fix condition for using smaller LMUL. Break outer loop if a suitable vmode has been found. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr112929-1.c: Expect smaller lmul. * gcc.target/riscv/rvv/vsetvl/pr112988-1.c: Likewise. * gcc.target/riscv/rvv/base/cpymem-3.c: New test.
2024-10-18[PATCH 2/7] RISC-V: Fix uninitialized reg in memcpyCraig Blackmore1-2/+1
gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Replace `end` with `length_rtx` in gen_rtx_NE.
2024-10-18[PATCH 1/7] RISC-V: Fix indentation in riscv_vector::expand_block_move [NFC]Craig Blackmore1-16/+16
gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Fix indentation.
2024-10-18i386: Fix the order of operands in andn<MMXMODEI:mode>3 [PR117192]Uros Bizjak2-3/+19
Fix the order of operands in andn<MMXMODEI:mode>3 expander to comply with the specification, where bitwise-complement applies to operand 2. PR target/117192 gcc/ChangeLog: * config/i386/mmx.md (andn<MMXMODEI:mode>3): Swap operand indexes 1 and 2 to comply with andn specification. gcc/testsuite/ChangeLog: * gcc.target/i386/pr117192.c: New test.
2024-10-18SVE intrinsics: Add fold_active_lanes_to method to refactor svmul and svdiv.Jennifer Schmitz17-94/+387
As suggested in https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663275.html, this patch adds the method gimple_folder::fold_active_lanes_to (tree X). This method folds active lanes to X and sets inactive lanes according to the predication, returning a new gimple statement. That makes folding of SVE intrinsics easier and reduces code duplication in the svxxx_impl::fold implementations. Using this new method, svdiv_impl::fold and svmul_impl::fold were refactored. Additionally, the method was used for two optimizations: 1) Fold svdiv to the dividend, if the divisor is all ones and 2) for svmul, if one of the operands is all ones, fold to the other operand. Both optimizations were previously applied to _x and _m predication on the RTL level, but not for _z, where svdiv/svmul were still being used. For both optimization, codegen was improved by this patch, for example by skipping sel instructions with all-same operands and replacing sel instructions by mov instructions. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold): Refactor using fold_active_lanes_to and fold to dividend, is the divisor is all ones. (svmul_impl::fold): Refactor using fold_active_lanes_to and fold to the other operand, if one of the operands is all ones. * config/aarch64/aarch64-sve-builtins.h: Declare gimple_folder::fold_active_lanes_to (tree). * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::fold_actives_lanes_to): Add new method to fold actives lanes to given argument and setting inactives lanes according to the predication. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/div_s32.c: Adjust expected outcome. * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise. * gcc.target/aarch64/sve/fold_div_zero.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_s16.c: New test. * gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise. * gcc.target/aarch64/sve/mul_const_run.c: Likewise.
2024-10-18[5/n] remove trapv-*.c special-casing of gcc.dg/vect/ filesRichard Biener2-8/+4
The following makes -ftrapv explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named trapv-* * gcc.dg/vect/trapv-vect-reduc-4.c: Add dg-additional-options -ftrapv.
2024-10-18[4/n] remove wrapv-*.c special-casing of gcc.dg/vect/ filesRichard Biener6-14/+12
The following makes -fwrapv explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named wrapv-* * gcc.dg/vect/wrapv-vect-7.c: Add dg-additional-options -fwrapv. * gcc.dg/vect/wrapv-vect-reduc-2char.c: Likewise. * gcc.dg/vect/wrapv-vect-reduc-2short.c: Likewise. * gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c: Likewise. * gcc.dg/vect/wrapv-vect-reduc-pattern-2c.c: Likewise.
2024-10-18[3/n] remove fast-math-*.c special-casing of gcc.dg/vect/ filesRichard Biener53-22/+57
The following makes -ffast-math explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named fast-math-* * gcc.dg/vect/fast-math-bb-slp-call-1.c: Add dg-additional-options -ffast-math. * gcc.dg/vect/fast-math-bb-slp-call-2.c: Likewise. * gcc.dg/vect/fast-math-bb-slp-call-3.c: Likewise. * gcc.dg/vect/fast-math-ifcvt-1.c: Likewise. * gcc.dg/vect/fast-math-pr35982.c: Likewise. * gcc.dg/vect/fast-math-pr43074.c: Likewise. * gcc.dg/vect/fast-math-pr44152.c: Likewise. * gcc.dg/vect/fast-math-pr55281.c: Likewise. * gcc.dg/vect/fast-math-slp-27.c: Likewise. * gcc.dg/vect/fast-math-slp-38.c: Likewise. * gcc.dg/vect/fast-math-vect-call-1.c: Likewise. * gcc.dg/vect/fast-math-vect-call-2.c: Likewise. * gcc.dg/vect/fast-math-vect-complex-3.c: Likewise. * gcc.dg/vect/fast-math-vect-outer-7.c: Likewise. * gcc.dg/vect/fast-math-vect-pow-1.c: Likewise. * gcc.dg/vect/fast-math-vect-pow-2.c: Likewise. * gcc.dg/vect/fast-math-vect-pr25911.c: Likewise. * gcc.dg/vect/fast-math-vect-pr29925.c: Likewise. * gcc.dg/vect/fast-math-vect-reduc-5.c: Likewise. * gcc.dg/vect/fast-math-vect-reduc-7.c: Likewise. * gcc.dg/vect/fast-math-vect-reduc-8.c: Likewise. * gcc.dg/vect/fast-math-vect-reduc-9.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise. * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mla-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mls-half-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-double.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-float.c: Likewise. * gcc.dg/vect/complex/fast-math-complex-mul-half-float.c: Likewise.
2024-10-18[2/n] remove no-vfa-*.c special-casing of gcc.dg/vect/ filesRichard Biener18-9/+20
The following makes --param vect-max-version-for-alias-checks=0 explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named no-vfa-* * gcc.dg/vect/no-vfa-pr29145.c: Add dg-additional-options --param vect-max-version-for-alias-checks=0. * gcc.dg/vect/no-vfa-vect-101.c: Likewise. * gcc.dg/vect/no-vfa-vect-102.c: Likewise. * gcc.dg/vect/no-vfa-vect-102a.c: Likewise. * gcc.dg/vect/no-vfa-vect-37.c: Likewise. * gcc.dg/vect/no-vfa-vect-43.c: Likewise. * gcc.dg/vect/no-vfa-vect-45.c: Likewise. * gcc.dg/vect/no-vfa-vect-49.c: Likewise. * gcc.dg/vect/no-vfa-vect-51.c: Likewise. * gcc.dg/vect/no-vfa-vect-53.c: Likewise. * gcc.dg/vect/no-vfa-vect-57.c: Likewise. * gcc.dg/vect/no-vfa-vect-61.c: Likewise. * gcc.dg/vect/no-vfa-vect-79.c: Likewise. * gcc.dg/vect/no-vfa-vect-depend-1.c: Likewise. * gcc.dg/vect/no-vfa-vect-depend-2.c: Likewise. * gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise. * gcc.dg/vect/no-vfa-vect-dv-2.c: Likewise.
2024-10-18Adjust assert in vect_build_slp_tree_2Richard Biener1-5/+1
The assert in SLP discovery when we handle masked operations is confusingly wide - all gather variants should be catched by the earlier STMT_VINFO_GATHER_SCATTER_P. * tree-vect-slp.cc (vect_build_slp_tree_2): Only expect IFN_MASK_LOAD for masked loads that are not STMT_VINFO_GATHER_SCATTER_P.
2024-10-18testsuite: Add necessary dejagnu directives to pr115815_0.cMartin Jambor1-0/+4
I have received an email from the Linaro infrastructure that the test gcc.dg/lto/pr115815_0.c which I added is failing on arm-eabi and I realized that not only it is missing dg-require-effective-target global_constructor but actually any dejagnu directives at all, which means it is unnecessarily running both at -O0 and -O2 and there is an unnecesary run test too. All fixed by this patch. I have not actually verified that the failure goes away on arm-eabi but have very high hopes it will. I have verified that the test still checks for the bug and also that it passes by running: make -k check-gcc RUNTESTFLAGS="lto.exp=*pr115815*" gcc/testsuite/ChangeLog: 2024-10-14 Martin Jambor <mjambor@suse.cz> * gcc.dg/lto/pr115815_0.c: Add dejagu directives.
2024-10-18middle-end: Fix GSI for gcond root [PR117140]Tamar Christina2-1/+95
When finding the gsi to use for code of the root statements we should use the one of the original statement rather than the gcond which may be inside a pattern. Without this the emitted instructions may be discarded later. gcc/ChangeLog: PR tree-optimization/117140 * tree-vect-slp.cc (vectorize_slp_instance_root_stmt): Use gsi from original statement. gcc/testsuite/ChangeLog: PR tree-optimization/117140 * gcc.dg/vect/vect-early-break_129-pr117140.c: New test.
2024-10-18middle-end: Fix VEC_PERM_EXPR lowering since relaxation of vector sizesTamar Christina2-3/+20
In GCC 14 VEC_PERM_EXPR was relaxed to be able to permute to a 2x larger vector than the size of the input vectors. However various passes and transformations were not updated to account for this. I have patches in these area that I will be upstreaming with individual patches that expose them. This one is that vectlower tries to lower based on the size of the input vectors rather than the size of the output. As a consequence it creates an invalid vector of half the size. Luckily we ICE because the resulting nunits doesn't match the vector size. gcc/ChangeLog: * tree-vect-generic.cc (lower_vec_perm): Use output vector size instead of input vector when determining output nunits. gcc/testsuite/ChangeLog: * gcc.dg/vec-perm-lower.c: New test.
2024-10-18AArch64: use movi d0, #0 to clear SVE registers instead of mov z0.d, #0Tamar Christina1-2/+5
This patch changes SVE to use Adv. SIMD movi 0 to clear SVE registers when not in SVE streaming mode. As the Neoverse Software Optimization guides indicate SVE mov #0 is not a zero cost move. When In streaming mode we continue to use SVE's mov to clear the registers. Tests have already been updated. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_output_sve_mov_immediate): Use fmov for SVE zeros.
2024-10-18AArch64: support encoding integer immediates using floating point movesTamar Christina2-128/+241
This patch extends our immediate SIMD generation cases to support generating integer immediates using floating point operation if the integer immediate maps to an exact FP value. As an example: uint32x4_t f1() { return vdupq_n_u32(0x3f800000); } currently generates: f1: adrp x0, .LC0 ldr q0, [x0, #:lo12:.LC0] ret i.e. a load, but with this change: f1: fmov v0.4s, 1.0e+0 ret Such immediates are common in e.g. our Math routines in glibc because they are created to extract or mark part of an FP immediate as masks. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_sve_valid_immediate, aarch64_simd_valid_immediate): Refactor accepting modes and values. (aarch64_float_const_representable_p): Refactor and extract FP checks into ... (aarch64_real_float_const_representable_p): ...This and fix fail fallback from real_to_integer. (aarch64_advsimd_valid_immediate): Use it. gcc/testsuite/ChangeLog: * gcc.target/aarch64/const_create_using_fmov.c: New test.
2024-10-18AArch64: update testsuite to account for new zero movesTamar Christina52-412/+410
The patch series will adjust how zeros are created. In principal it doesn't matter the exact lane size a zero gets created on but this makes the tests a bit fragile. This preparation patch will update the testsuite to accept multiple variants of ways to create vector zeros to accept both the current syntax and the one being transitioned to in the series. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ldp_stp_18.c: Update zero regexpr. * gcc.target/aarch64/memset-corner-cases.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_bf16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_f16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_f32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_f64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_s16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_s32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_s64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_s8.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_u16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_u32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_u64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/revd_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acge_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acge_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acge_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acgt_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acgt_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acgt_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acle_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acle_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/acle_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/aclt_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/aclt_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/aclt_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/bic_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/bic_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/cmpuo_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/cmpuo_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/cmpuo_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/dup_u8.c: Likewise. * gcc.target/aarch64/sve/const_fold_div_1.c: Likewise. * gcc.target/aarch64/sve/const_fold_mul_1.c: Likewise. * gcc.target/aarch64/sve/dup_imm_1.c: Likewise. * gcc.target/aarch64/sve/fdup_1.c: Likewise. * gcc.target/aarch64/sve/fold_div_zero.c: Likewise. * gcc.target/aarch64/sve/fold_mul_zero.c: Likewise. * gcc.target/aarch64/sve/pcs/args_2.c: Likewise. * gcc.target/aarch64/sve/pcs/args_3.c: Likewise. * gcc.target/aarch64/sve/pcs/args_4.c: Likewise. * gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
2024-10-18arm: [MVE intrinsics] use long_type_suffix / half_type_suffix helpersChristophe Lyon1-46/+68
In several places we are looking for a type twice or half as large as the type suffix: this patch introduces helper functions to avoid code duplication. long_type_suffix is similar to the SVE counterpart, but adds an 'expected_tclass' parameter. half_type_suffix is similar to it, but does not exist in SVE. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (long_type_suffix): New. (half_type_suffix): New. (struct binary_move_narrow_def): Use new helper. (struct binary_move_narrow_unsigned_def): Likewise. (struct binary_rshift_narrow_def): Likewise. (struct binary_rshift_narrow_unsigned_def): Likewise. (struct binary_widen_def): Likewise. (struct binary_widen_n_def): Likewise. (struct binary_widen_opt_n_def): Likewise. (struct unary_widen_def): Likewise.
2024-10-18arm: [MVE intrinsics] rework vsbcq vsbciqChristophe Lyon4-188/+42
Implement vsbcq vsbciq using the new MVE builtins framework. We re-use most of the code introduced by the previous patches. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): Add support for vsbciq and vsbcq. (vadciq, vadcq): Add new parameter. (vsbciq): New. (vsbcq): New. * config/arm/arm-mve-builtins-base.def (vsbciq): New. (vsbcq): New. * config/arm/arm-mve-builtins-base.h (vsbciq): New. (vsbcq): New. * config/arm/arm_mve.h (vsbciq): Delete. (vsbciq_m): Delete. (vsbcq): Delete. (vsbcq_m): Delete. (vsbciq_s32): Delete. (vsbciq_u32): Delete. (vsbciq_m_s32): Delete. (vsbciq_m_u32): Delete. (vsbcq_s32): Delete. (vsbcq_u32): Delete. (vsbcq_m_s32): Delete. (vsbcq_m_u32): Delete. (__arm_vsbciq_s32): Delete. (__arm_vsbciq_u32): Delete. (__arm_vsbciq_m_s32): Delete. (__arm_vsbciq_m_u32): Delete. (__arm_vsbcq_s32): Delete. (__arm_vsbcq_u32): Delete. (__arm_vsbcq_m_s32): Delete. (__arm_vsbcq_m_u32): Delete. (__arm_vsbciq): Delete. (__arm_vsbciq_m): Delete. (__arm_vsbcq): Delete. (__arm_vsbcq_m): Delete.
2024-10-18arm: [MVE intrinsics] rework vadcqChristophe Lyon4-94/+56
Implement vadcq using the new MVE builtins framework. We re-use most of the code introduced by the previous patch to support vadciq: we just need to initialize carry from the input parameter. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (vadcq_vsbc): Add support for vadcq. * config/arm/arm-mve-builtins-base.def (vadcq): New. * config/arm/arm-mve-builtins-base.h (vadcq): New. * config/arm/arm_mve.h (vadcq): Delete. (vadcq_m): Delete. (vadcq_s32): Delete. (vadcq_u32): Delete. (vadcq_m_s32): Delete. (vadcq_m_u32): Delete. (__arm_vadcq_s32): Delete. (__arm_vadcq_u32): Delete. (__arm_vadcq_m_s32): Delete. (__arm_vadcq_m_u32): Delete. (__arm_vadcq): Delete. (__arm_vadcq_m): Delete.
2024-10-18arm: [MVE intrinsics] rework vadciqChristophe Lyon4-89/+95
Implement vadciq using the new MVE builtins framework. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): New. (vadciq): New. * config/arm/arm-mve-builtins-base.def (vadciq): New. * config/arm/arm-mve-builtins-base.h (vadciq): New. * config/arm/arm_mve.h (vadciq): Delete. (vadciq_m): Delete. (vadciq_s32): Delete. (vadciq_u32): Delete. (vadciq_m_s32): Delete. (vadciq_m_u32): Delete. (__arm_vadciq_s32): Delete. (__arm_vadciq_u32): Delete. (__arm_vadciq_m_s32): Delete. (__arm_vadciq_m_u32): Delete. (__arm_vadciq): Delete. (__arm_vadciq_m): Delete.
2024-10-18arm: [MVE intrinsics] factorize vadc vadci vsbc vsbciChristophe Lyon2-109/+42
Factorize vadc/vsbc and vadci/vsbci so that they use the same parameterized names. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (mve_insn): Add VADCIQ_M_S, VADCIQ_M_U, VADCIQ_U, VADCIQ_S, VADCQ_M_S, VADCQ_M_U, VADCQ_S, VADCQ_U, VSBCIQ_M_S, VSBCIQ_M_U, VSBCIQ_S, VSBCIQ_U, VSBCQ_M_S, VSBCQ_M_U, VSBCQ_S, VSBCQ_U. (VADCIQ, VSBCIQ): Merge into ... (VxCIQ): ... this. (VADCIQ_M, VSBCIQ_M): Merge into ... (VxCIQ_M): ... this. (VSBCQ, VADCQ): Merge into ... (VxCQ): ... this. (VSBCQ_M, VADCQ_M): Merge into ... (VxCQ_M): ... this. * config/arm/mve.md (mve_vadciq_<supf>v4si, mve_vsbciq_<supf>v4si): Merge into ... (@mve_<mve_insn>q_<supf>v4si): ... this. (mve_vadciq_m_<supf>v4si, mve_vsbciq_m_<supf>v4si): Merge into ... (@mve_<mve_insn>q_m_<supf>v4si): ... this. (mve_vadcq_<supf>v4si, mve_vsbcq_<supf>v4si): Merge into ... (@mve_<mve_insn>q_<supf>v4si): ... this. (mve_vadcq_m_<supf>v4si, mve_vsbcq_m_<supf>v4si): Merge into ... (@mve_<mve_insn>q_m_<supf>v4si): ... this.
2024-10-18arm: [MVE intrinsics] add vadc_vsbc shapeChristophe Lyon2-0/+37
This patch adds the vadc_vsbc shape description. 2024-08-28 Christophe Lyon <chrirstophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (vadc_vsbc): New. * config/arm/arm-mve-builtins-shapes.h (vadc_vsbc): New.
2024-10-18arm: [MVE intrinsics] remove vshlcq useless expandersChristophe Lyon3-81/+0
Since we rewrote the implementation of vshlcq intrinsics, we no longer need these expanders. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-builtins.cc (arm_ternop_unone_none_unone_imm_qualifiers) (-arm_ternop_none_none_unone_imm_qualifiers): Delete. * config/arm/arm_mve_builtins.def (vshlcq_m_vec_s) (vshlcq_m_carry_s, vshlcq_m_vec_u, vshlcq_m_carry_u): Delete. * config/arm/mve.md (mve_vshlcq_vec_<supf><mode>): Delete. (mve_vshlcq_carry_<supf><mode>): Delete. (mve_vshlcq_m_vec_<supf><mode>): Delete. (mve_vshlcq_m_carry_<supf><mode>): Delete.
2024-10-18arm: [MVE intrinsics] rework vshlcqChristophe Lyon6-235/+77
Implement vshlc using the new MVE builtins framework. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (class vshlc_impl): New. (vshlc): New. * config/arm/arm-mve-builtins-base.def (vshlcq): New. * config/arm/arm-mve-builtins-base.h (vshlcq): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Handle vshlc. * config/arm/arm_mve.h (vshlcq): Delete. (vshlcq_m): Delete. (vshlcq_s8): Delete. (vshlcq_u8): Delete. (vshlcq_s16): Delete. (vshlcq_u16): Delete. (vshlcq_s32): Delete. (vshlcq_u32): Delete. (vshlcq_m_s8): Delete. (vshlcq_m_u8): Delete. (vshlcq_m_s16): Delete. (vshlcq_m_u16): Delete. (vshlcq_m_s32): Delete. (vshlcq_m_u32): Delete. (__arm_vshlcq_s8): Delete. (__arm_vshlcq_u8): Delete. (__arm_vshlcq_s16): Delete. (__arm_vshlcq_u16): Delete. (__arm_vshlcq_s32): Delete. (__arm_vshlcq_u32): Delete. (__arm_vshlcq_m_s8): Delete. (__arm_vshlcq_m_u8): Delete. (__arm_vshlcq_m_s16): Delete. (__arm_vshlcq_m_u16): Delete. (__arm_vshlcq_m_s32): Delete. (__arm_vshlcq_m_u32): Delete. (__arm_vshlcq): Delete. (__arm_vshlcq_m): Delete. * config/arm/mve.md (mve_vshlcq_<supf><mode>): Add '@' prefix. (mve_vshlcq_m_<supf><mode>): Likewise.
2024-10-18arm: [MVE intrinsics] add vshlc shapeChristophe Lyon2-0/+45
This patch adds the vshlc shape description. 2024-08-28 Christophe Lyon <chrirstophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (vshlc): New. * config/arm/arm-mve-builtins-shapes.h (vshlc): New.
2024-10-18arm: [MVE intrinsics] remove useless v[id]wdup expandersChristophe Lyon3-90/+0
Like with vddup/vidup, we use code_for_mve_q_wb_u_insn, so we can drop the expanders and their declarations as builtins, now useless. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-builtins.cc (arm_quinop_unone_unone_unone_unone_imm_pred_qualifiers): Delete. * config/arm/arm_mve_builtins.def (viwdupq_wb_u, vdwdupq_wb_u) (viwdupq_m_wb_u, vdwdupq_m_wb_u, viwdupq_m_n_u, vdwdupq_m_n_u) (vdwdupq_n_u, viwdupq_n_u): Delete. * config/arm/mve.md (mve_vdwdupq_n_u<mode>): Delete. (mve_vdwdupq_wb_u<mode>): Delete. (mve_vdwdupq_m_n_u<mode>): Delete. (mve_vdwdupq_m_wb_u<mode>): Delete.
2024-10-18arm: [MVE intrinsics] update v[id]wdup testsChristophe Lyon18-54/+54
Testing v[id]wdup overloads with '1' as argument for uint32_t* does not make sense: this patch adds a new 'unit32_t *a' parameter to foo2 in such tests. The difference with v[id]dup tests (where we removed 'foo2') is that in 'foo1' we test the overload with a variable 'wrap' parameter (b) and we need foo2 to test the overload with an immediate (1). 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ * gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c: Use pointer parameter in foo2. * gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c: Likewise.
2024-10-18arm: [MVE intrinsics] rework vdwdup viwdupChristophe Lyon5-737/+53
Implement vdwdup and viwdup using the new MVE builtins framework. In order to share more code with viddup_impl, the patch swaps operands 1 and 2 in @mve_v[id]wdupq_m_wb_u<mode>_insn, so that the parameter order is similar to what @mve_v[id]dupq_m_wb_u<mode>_insn uses. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (viddup_impl): Add support for wrapping versions. (vdwdupq): New. (viwdupq): New. * config/arm/arm-mve-builtins-base.def (vdwdupq): New. (viwdupq): New. * config/arm/arm-mve-builtins-base.h (vdwdupq): New. (viwdupq): New. * config/arm/arm_mve.h (vdwdupq_m): Delete. (vdwdupq_u8): Delete. (vdwdupq_u32): Delete. (vdwdupq_u16): Delete. (viwdupq_m): Delete. (viwdupq_u8): Delete. (viwdupq_u32): Delete. (viwdupq_u16): Delete. (vdwdupq_x_u8): Delete. (vdwdupq_x_u16): Delete. (vdwdupq_x_u32): Delete. (viwdupq_x_u8): Delete. (viwdupq_x_u16): Delete. (viwdupq_x_u32): Delete. (vdwdupq_m_n_u8): Delete. (vdwdupq_m_n_u32): Delete. (vdwdupq_m_n_u16): Delete. (vdwdupq_m_wb_u8): Delete. (vdwdupq_m_wb_u32): Delete. (vdwdupq_m_wb_u16): Delete. (vdwdupq_n_u8): Delete. (vdwdupq_n_u32): Delete. (vdwdupq_n_u16): Delete. (vdwdupq_wb_u8): Delete. (vdwdupq_wb_u32): Delete. (vdwdupq_wb_u16): Delete. (viwdupq_m_n_u8): Delete. (viwdupq_m_n_u32): Delete. (viwdupq_m_n_u16): Delete. (viwdupq_m_wb_u8): Delete. (viwdupq_m_wb_u32): Delete. (viwdupq_m_wb_u16): Delete. (viwdupq_n_u8): Delete. (viwdupq_n_u32): Delete. (viwdupq_n_u16): Delete. (viwdupq_wb_u8): Delete. (viwdupq_wb_u32): Delete. (viwdupq_wb_u16): Delete. (vdwdupq_x_n_u8): Delete. (vdwdupq_x_n_u16): Delete. (vdwdupq_x_n_u32): Delete. (vdwdupq_x_wb_u8): Delete. (vdwdupq_x_wb_u16): Delete. (vdwdupq_x_wb_u32): Delete. (viwdupq_x_n_u8): Delete. (viwdupq_x_n_u16): Delete. (viwdupq_x_n_u32): Delete. (viwdupq_x_wb_u8): Delete. (viwdupq_x_wb_u16): Delete. (viwdupq_x_wb_u32): Delete. (__arm_vdwdupq_m_n_u8): Delete. (__arm_vdwdupq_m_n_u32): Delete. (__arm_vdwdupq_m_n_u16): Delete. (__arm_vdwdupq_m_wb_u8): Delete. (__arm_vdwdupq_m_wb_u32): Delete. (__arm_vdwdupq_m_wb_u16): Delete. (__arm_vdwdupq_n_u8): Delete. (__arm_vdwdupq_n_u32): Delete. (__arm_vdwdupq_n_u16): Delete. (__arm_vdwdupq_wb_u8): Delete. (__arm_vdwdupq_wb_u32): Delete. (__arm_vdwdupq_wb_u16): Delete. (__arm_viwdupq_m_n_u8): Delete. (__arm_viwdupq_m_n_u32): Delete. (__arm_viwdupq_m_n_u16): Delete. (__arm_viwdupq_m_wb_u8): Delete. (__arm_viwdupq_m_wb_u32): Delete. (__arm_viwdupq_m_wb_u16): Delete. (__arm_viwdupq_n_u8): Delete. (__arm_viwdupq_n_u32): Delete. (__arm_viwdupq_n_u16): Delete. (__arm_viwdupq_wb_u8): Delete. (__arm_viwdupq_wb_u32): Delete. (__arm_viwdupq_wb_u16): Delete. (__arm_vdwdupq_x_n_u8): Delete. (__arm_vdwdupq_x_n_u16): Delete. (__arm_vdwdupq_x_n_u32): Delete. (__arm_vdwdupq_x_wb_u8): Delete. (__arm_vdwdupq_x_wb_u16): Delete. (__arm_vdwdupq_x_wb_u32): Delete. (__arm_viwdupq_x_n_u8): Delete. (__arm_viwdupq_x_n_u16): Delete. (__arm_viwdupq_x_n_u32): Delete. (__arm_viwdupq_x_wb_u8): Delete. (__arm_viwdupq_x_wb_u16): Delete. (__arm_viwdupq_x_wb_u32): Delete. (__arm_vdwdupq_m): Delete. (__arm_vdwdupq_u8): Delete. (__arm_vdwdupq_u32): Delete. (__arm_vdwdupq_u16): Delete. (__arm_viwdupq_m): Delete. (__arm_viwdupq_u8): Delete. (__arm_viwdupq_u32): Delete. (__arm_viwdupq_u16): Delete. (__arm_vdwdupq_x_u8): Delete. (__arm_vdwdupq_x_u16): Delete. (__arm_vdwdupq_x_u32): Delete. (__arm_viwdupq_x_u8): Delete. (__arm_viwdupq_x_u16): Delete. (__arm_viwdupq_x_u32): Delete. * config/arm/mve.md (@mve_<mve_insn>q_m_wb_u<mode>_insn): Swap operands 1 and 2.
2024-10-18arm: [MVE intrinsics] add vidwdup shapeChristophe Lyon2-0/+89
This patch adds the vidwdup shape description for vdwdup and viwdup. It is very similar to viddup, but accounts for the additional 'wrap' scalar parameter. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (vidwdup): New. * config/arm/arm-mve-builtins-shapes.h (vidwdup): New.
2024-10-18arm: [MVE intrinsics] factorize vdwdup viwdupChristophe Lyon2-55/+17
Factorize vdwdup and viwdup so that they use the same parameterized names. Like with vddup and vidup, we do not bother with the corresponding expanders, as we stop using them in a subsequent patch. The patch also adds the missing attributes to vdwdupq_wb_u_insn and viwdupq_wb_u_insn patterns. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (mve_insn): Add VIWDUPQ, VDWDUPQ, VIWDUPQ_M, VDWDUPQ_M. (VIDWDUPQ): New iterator. (VIDWDUPQ_M): New iterator. * config/arm/mve.md (mve_vdwdupq_wb_u<mode>_insn) (mve_viwdupq_wb_u<mode>_insn): Merge into ... (@mve_<mve_insn>q_wb_u<mode>_insn): ... this. Add missing mve_unpredicated_insn and mve_move attributes. (mve_vdwdupq_m_wb_u<mode>_insn, mve_viwdupq_m_wb_u<mode>_insn): Merge into ... (@mve_<mve_insn>q_m_wb_u<mode>_insn): ... this.
2024-10-18arm: [MVE intrinsics] fix checks of immediate argumentsChristophe Lyon1-16/+31
As discussed in [1], it is better to use "su64" for immediates in intrinsics signatures in order to provide better diagnostics (erroneous constants are not truncated for instance). This patch thus uses su64 instead of ss32 in binary_lshift_unsigned, binary_rshift_narrow, binary_rshift_narrow_unsigned, ternary_lshift, ternary_rshift. In addition, we fix cases where we called require_integer_immediate whereas we just want to check that the argument is a scalar, and thus use require_scalar_type in binary_acca_int32, binary_acca_int64, unary_int32_acc. Finally, in binary_lshift_unsigned we just want to check that 'imm' is an immediate, not the optional predicates. [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660262.html 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (binary_acca_int32): Fix check of scalar argument. (binary_acca_int64): Likewise. (binary_lshift_unsigned): Likewise. (binary_rshift_narrow): Likewise. (binary_rshift_narrow_unsigned): Likewise. (ternary_lshift): Likewise. (ternary_rshift): Likewise. (unary_int32_acc): Likewise.
2024-10-18arm: [MVE intrinsics] remove v[id]dup expandersChristophe Lyon2-77/+0
We use code_for_mve_q_u_insn, rather than the expanders used by the previous implementation, so we can remove the expanders and their declaration as builtins. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm_mve_builtins.def (vddupq_n_u, vidupq_n_u) (vddupq_m_n_u, vidupq_m_n_u): Delete. * config/arm/mve.md (mve_vidupq_n_u<mode>, mve_vidupq_m_n_u<mode>) (mve_vddupq_n_u<mode>, mve_vddupq_m_n_u<mode>): Delete.
2024-10-18arm: [MVE intrinsics] update v[id]dup testsChristophe Lyon18-282/+18
Testing v[id]dup overloads with '1' as argument for uint32_t* does not make sense: instead of choosing the '_wb' overload, we choose the '_n', but we already do that in the '_n' tests. This patch removes all such bogus foo2 functions. 2024-08-28 Christophe Lyon <christophe.lyon@linaro.org> gcc/testsuite/ * gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_x_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_x_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vddupq_x_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_x_wb_u16.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_x_wb_u32.c: Remove foo2. * gcc.target/arm/mve/intrinsics/vidupq_x_wb_u8.c: Remove foo2.
2024-10-18arm: [MVE intrinsics] rework vddup vidupChristophe Lyon4-676/+116
Implement vddup and vidup using the new MVE builtins framework. We generate better code because we take advantage of the two outputs produced by the v[id]dup instructions. For instance, before: ldr r3, [r0] sub r2, r3, #8 str r2, [r0] mov r2, r3 vddup.u16 q3, r2, #1 now: ldr r2, [r0] vddup.u16 q3, r2, #1 str r2, [r0] 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-base.cc (class viddup_impl): New. (vddup): New. (vidup): New. * config/arm/arm-mve-builtins-base.def (vddupq): New. (vidupq): New. * config/arm/arm-mve-builtins-base.h (vddupq): New. (vidupq): New. * config/arm/arm_mve.h (vddupq_m): Delete. (vddupq_u8): Delete. (vddupq_u32): Delete. (vddupq_u16): Delete. (vidupq_m): Delete. (vidupq_u8): Delete. (vidupq_u32): Delete. (vidupq_u16): Delete. (vddupq_x_u8): Delete. (vddupq_x_u16): Delete. (vddupq_x_u32): Delete. (vidupq_x_u8): Delete. (vidupq_x_u16): Delete. (vidupq_x_u32): Delete. (vddupq_m_n_u8): Delete. (vddupq_m_n_u32): Delete. (vddupq_m_n_u16): Delete. (vddupq_m_wb_u8): Delete. (vddupq_m_wb_u16): Delete. (vddupq_m_wb_u32): Delete. (vddupq_n_u8): Delete. (vddupq_n_u32): Delete. (vddupq_n_u16): Delete. (vddupq_wb_u8): Delete. (vddupq_wb_u16): Delete. (vddupq_wb_u32): Delete. (vidupq_m_n_u8): Delete. (vidupq_m_n_u32): Delete. (vidupq_m_n_u16): Delete. (vidupq_m_wb_u8): Delete. (vidupq_m_wb_u16): Delete. (vidupq_m_wb_u32): Delete. (vidupq_n_u8): Delete. (vidupq_n_u32): Delete. (vidupq_n_u16): Delete. (vidupq_wb_u8): Delete. (vidupq_wb_u16): Delete. (vidupq_wb_u32): Delete. (vddupq_x_n_u8): Delete. (vddupq_x_n_u16): Delete. (vddupq_x_n_u32): Delete. (vddupq_x_wb_u8): Delete. (vddupq_x_wb_u16): Delete. (vddupq_x_wb_u32): Delete. (vidupq_x_n_u8): Delete. (vidupq_x_n_u16): Delete. (vidupq_x_n_u32): Delete. (vidupq_x_wb_u8): Delete. (vidupq_x_wb_u16): Delete. (vidupq_x_wb_u32): Delete. (__arm_vddupq_m_n_u8): Delete. (__arm_vddupq_m_n_u32): Delete. (__arm_vddupq_m_n_u16): Delete. (__arm_vddupq_m_wb_u8): Delete. (__arm_vddupq_m_wb_u16): Delete. (__arm_vddupq_m_wb_u32): Delete. (__arm_vddupq_n_u8): Delete. (__arm_vddupq_n_u32): Delete. (__arm_vddupq_n_u16): Delete. (__arm_vidupq_m_n_u8): Delete. (__arm_vidupq_m_n_u32): Delete. (__arm_vidupq_m_n_u16): Delete. (__arm_vidupq_n_u8): Delete. (__arm_vidupq_m_wb_u8): Delete. (__arm_vidupq_m_wb_u16): Delete. (__arm_vidupq_m_wb_u32): Delete. (__arm_vidupq_n_u32): Delete. (__arm_vidupq_n_u16): Delete. (__arm_vidupq_wb_u8): Delete. (__arm_vidupq_wb_u16): Delete. (__arm_vidupq_wb_u32): Delete. (__arm_vddupq_wb_u8): Delete. (__arm_vddupq_wb_u16): Delete. (__arm_vddupq_wb_u32): Delete. (__arm_vddupq_x_n_u8): Delete. (__arm_vddupq_x_n_u16): Delete. (__arm_vddupq_x_n_u32): Delete. (__arm_vddupq_x_wb_u8): Delete. (__arm_vddupq_x_wb_u16): Delete. (__arm_vddupq_x_wb_u32): Delete. (__arm_vidupq_x_n_u8): Delete. (__arm_vidupq_x_n_u16): Delete. (__arm_vidupq_x_n_u32): Delete. (__arm_vidupq_x_wb_u8): Delete. (__arm_vidupq_x_wb_u16): Delete. (__arm_vidupq_x_wb_u32): Delete. (__arm_vddupq_m): Delete. (__arm_vddupq_u8): Delete. (__arm_vddupq_u32): Delete. (__arm_vddupq_u16): Delete. (__arm_vidupq_m): Delete. (__arm_vidupq_u8): Delete. (__arm_vidupq_u32): Delete. (__arm_vidupq_u16): Delete. (__arm_vddupq_x_u8): Delete. (__arm_vddupq_x_u16): Delete. (__arm_vddupq_x_u32): Delete. (__arm_vidupq_x_u8): Delete. (__arm_vidupq_x_u16): Delete. (__arm_vidupq_x_u32): Delete.
2024-10-18arm: [MVE intrinsics] add viddup shapeChristophe Lyon5-0/+133
This patch adds the viddup shape description for vidup and vddup. This requires the addition of report_not_one_of and function_checker::require_immediate_one_of to gcc/config/arm/arm-mve-builtins.cc (they are copies of the aarch64 SVE counterpart). This patch also introduces MODE_wb. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/arm-mve-builtins-shapes.cc (viddup): New. * config/arm/arm-mve-builtins-shapes.h (viddup): New. * config/arm/arm-mve-builtins.cc (report_not_one_of): New. (function_checker::require_immediate_one_of): New. * config/arm/arm-mve-builtins.def (wb): New mode. * config/arm/arm-mve-builtins.h (function_checker) Add require_immediate_one_of.
2024-10-18arm: [MVE intrinsics] factorize vddup vidupChristophe Lyon2-45/+20
Factorize vddup and vidup so that they use the same parameterized names. This patch updates only the (define_insn "@mve_<mve_insn>q_u<mode>_insn") patterns and does not bother with the (define_expand "mve_vidupq_n_u<mode>") ones, because a subsequent patch avoids using them. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (mve_insn): Add VIDUPQ, VDDUPQ, VIDUPQ_M, VDDUPQ_M. (viddupq_op): New. (viddupq_m_op): New. (VIDDUPQ): New. (VIDDUPQ_M): New. * config/arm/mve.md (mve_vddupq_u<mode>_insn) (mve_vidupq_u<mode>_insn): Merge into ... (mve_<mve_insn>q_u<mode>_insn): ... this. (mve_vddupq_m_wb_u<mode>_insn, mve_vidupq_m_wb_u<mode>_insn): Merge into ... (mve_<mve_insn>q_m_wb_u<mode>_insn): ... this.
2024-10-18arm: [MVE intrinsics] rework vctpChristophe Lyon8-66/+79
Implement vctp using the new MVE builtins framework. 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (class vctpq_impl): New. (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-builtins-base.def (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-builtins-base.h (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-builtins-shapes.cc (vctp): New. * config/arm/arm-mve-builtins-shapes.h (vctp): New. * config/arm/arm-mve-builtins.cc (function_instance::has_inactive_argument): Add support for vctp. * config/arm/arm_mve.h (vctp16q): Delete. (vctp32q): Delete. (vctp64q): Delete. (vctp8q): Delete. (vctp8q_m): Delete. (vctp64q_m): Delete. (vctp32q_m): Delete. (vctp16q_m): Delete. (__arm_vctp16q): Delete. (__arm_vctp32q): Delete. (__arm_vctp64q): Delete. (__arm_vctp8q): Delete. (__arm_vctp8q_m): Delete. (__arm_vctp64q_m): Delete. (__arm_vctp32q_m): Delete. (__arm_vctp16q_m): Delete. * config/arm/mve.md (mve_vctp<MVE_vctp>q<MVE_vpred>): Add '@' prefix. (mve_vctp<MVE_vctp>q_m<MVE_vpred>): Likewise.