Age Commit message Author Files Lines
2024-08-27MIPS: Include missing mips16.S in libgcc/lib1funcs.SYunQiang Su1-1/+1
mips16.S has been missing since commit 29b74545531f6afbee9fc38c267524326dbfbedf Date: Thu Jun 1 10:14:24 2023 +0800 MIPS: Add speculation_barrier support Without mips16.S included, some symbols are missing for mips16, and so some software fails to build. libgcc/ChangeLog: * config/mips/lib1funcs.S: Includes mips16.S.
2024-08-27combine.cc (make_more_copies): Copy attributes from the original pseudo, PR115883Hans-Peter Nilsson1-0/+6
The first of the late-combine passes propagates some of the copies made during the (in-time-)combine pass in make_more_copies into the users of the "original" pseudo registers and removes the "old" pseudos. That effectively removes attributes such as REG_POINTER, which matter to LRA. The quoted PR is for an ICE-manifesting bug that was exposed by the late-combine pass and went back to hiding with this patch until commit r15-2937-g3673b7054ec2, the fix for PR116236, when it was actually fixed. To wit, this patch is only incidentally related to that bug. In other words, the REG_POINTER attribute should not be required for LRA to work correctly. This patch merely corrects state for those propagated register-uses to ante late-combine. For reasons not investigated, this fixes a failing test "FAIL: gcc.dg/guality/pr54200.c -Og -DPREVENT_OPTIMIZATION line 20 z == 3" for x86_64-linux-gnu. PR middle-end/115883 * combine.cc (make_more_copies): Copy attributes from the original pseudo to the new copy.
2024-08-27c++/coros: do not assume coros don't nest [PR113457]Arsen Arsenović3-6/+216
In the testcase presented in the PR, during template expansion, a tsubst of an operand causes a lambda coroutine to be processed, causing it to get an initial suspend and final suspend. The code for assigning awaitable var names (get_awaitable_var) assumed that the sequence Is -> Is -> Fs -> Fs is impossible (i.e. that one could only 'open' one coroutine before closing it at a time), and reset the counter used for unique numbering each time a final suspend occurred. This assumption is false in a few cases, usually when lambdas are involved. Instead of storing this counter in a static-storage variable, we can store it in coroutine_info. This struct is local to each function, so we don't need to worry about "cross-contamination" nor resetting. PR c++/113457 gcc/cp/ChangeLog: * coroutines.cc (struct coroutine_info): Add integer field awaitable_number. This is a counter used for assigning unique names to awaitable temporaries. (get_awaitable_var): Use awaitable_number from coroutine_info instead of the static int awn. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr113457-1.C: New test. * g++.dg/coroutines/pr113457.C: New test.
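The numbering problem described in this commit can be illustrated outside GCC with a minimal sketch. Everything here is hypothetical: `coroutine_info_sketch`, `next_name_static`, `next_name_per_coro`, and the `"Aw"` name prefix are illustration-only names, not GCC's code. Only the idea of moving the counter from static storage into per-function state is taken from the commit message.

```cpp
#include <string>

// Hypothetical stand-in for per-coroutine state; not GCC's coroutine_info.
struct coroutine_info_sketch {
  int awaitable_number = 0;  // counter owned by exactly one coroutine
};

// Old scheme (sketch): one shared counter, reset at every final suspend.
static int awn = 0;

std::string next_name_static(bool is_final_suspend) {
  std::string name = "Aw" + std::to_string(awn++);
  if (is_final_suspend)
    awn = 0;  // wrong if an outer coroutine is still being processed
  return name;
}

// New scheme (sketch): each coroutine numbers its own awaitables,
// so nesting cannot cause reuse and no reset is needed.
std::string next_name_per_coro(coroutine_info_sketch &info) {
  return "Aw" + std::to_string(info.awaitable_number++);
}
```

With the shared counter, the sequence Is -> Is -> Fs -> Fs hands the outer coroutine the same name for its initial and final suspend, because the inner final suspend resets the counter in between. With one counter per coroutine, the outer names stay distinct.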
2024-08-26coroutines: diagnose usage of alloca in coroutinesArsen Arsenović2-0/+33
We do not support it currently, and the resulting memory can only be used inside a single resumption, so best not confuse the user with it. PR c++/115858 - Incompatibility of coroutines and alloca() gcc/ChangeLog: * coroutine-passes.cc (execute_early_expand_coro_ifns): Emit a sorry if a statement is an alloca call. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr115858.C: New test.
2024-08-26diagnostics: move output formats from diagnostic.{c,h} to their own filesDavid Malcolm12-257/+359
In particular, move the classic text output code to a diagnostic-text.cc (analogous to -json.cc and -sarif.cc). No functional change intended. gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add diagnostic-format-text.o. * diagnostic-format-json.cc: Include "diagnostic-format.h". * diagnostic-format-sarif.cc: Likewise. * diagnostic-format-text.cc: New file, using material from diagnostics.cc. * diagnostic-global-context.cc: Include "diagnostic-format.h". * diagnostic-format-text.h: New file, using material from diagnostics.h. * diagnostic-format.h: New file, using material from diagnostics.h. * diagnostic.cc: Include "diagnostic-format.h" and "diagnostic-format-text.h". (diagnostic_text_output_format::~diagnostic_text_output_format): Move to diagnostic-format-text.cc. (diagnostic_text_output_format::on_report_diagnostic): Likewise. (diagnostic_text_output_format::on_diagram): Likewise. (diagnostic_text_output_format::print_any_cwe): Likewise. (diagnostic_text_output_format::print_any_rules): Likewise. (diagnostic_text_output_format::print_option_information): Likewise. * diagnostic.h (class diagnostic_output_format): Move to diagnostic-format.h. (class diagnostic_text_output_format): Move to diagnostic-format-text.h. (diagnostic_output_format_init): Move to diagnostic-format.h. (diagnostic_output_format_init_json_stderr): Likewise. (diagnostic_output_format_init_json_file): Likewise. (diagnostic_output_format_init_sarif_stderr): Likewise. (diagnostic_output_format_init_sarif_file): Likewise. (diagnostic_output_format_init_sarif_stream): Likewise. * gcc.cc: Include "diagnostic-format.h". * opts.cc: Include "diagnostic-format.h". gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic_group_plugin.c: Include "diagnostic-format-text.h". Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26diagnostics: consolidate on_{begin,end}_diagnostic into on_report_diagnosticDavid Malcolm4-176/+171
Previously diagnostic_context::report_diagnostic had, after the call to pp_format (phases 1 and 2 of formatting the message): m_output_format->on_begin_diagnostic (*diagnostic); pp_output_formatted_text (this->printer, m_urlifier); if (m_show_cwe) print_any_cwe (*diagnostic); if (m_show_rules) print_any_rules (*diagnostic); if (m_show_option_requested) print_option_information (*diagnostic, orig_diag_kind); m_output_format->on_end_diagnostic (*diagnostic, orig_diag_kind); This patch replaces all of the above with a single call to m_output_format->on_report_diagnostic (*diagnostic, orig_diag_kind); moving responsibility for phase 3 of formatting and printing the result from diagnostic_context to the output format. This simplifies diagnostic_context::report_diagnostic and allows us to move the code that prints CWEs, rules, and option information in textual form from diagnostic_context to diagnostic_text_output_format, where it belongs. No functional change intended. gcc/ChangeLog: * diagnostic-format-json.cc (json_output_format::on_begin_diagnostic): Delete. (json_output_format::on_end_diagnostic): Rename to... (json_output_format::on_report_diagnostic): ...this and add call to pp_output_formatted_text. (diagnostic_output_format_init_json): Drop unnecessary calls to disable textual printing of CWEs, rules, and options. * diagnostic-format-sarif.cc (sarif_builder::end_diagnostic): Rename to... (sarif_builder::on_report_diagnostic): ...this and add call to pp_output_formatted_text. (sarif_output_format::on_begin_diagnostic): Delete. (sarif_output_format::on_end_diagnostic): Rename to... (sarif_output_format::on_report_diagnostic): ...this and update call to m_builder accordingly. (diagnostic_output_format_init_sarif): Drop unnecessary calls to disable textual printing of CWEs, rules, and options. * diagnostic.cc (diagnostic_context::print_any_cwe): Convert to... (diagnostic_text_output_format::print_any_cwe): ...this. 
(diagnostic_context::print_any_rules): Convert to... (diagnostic_text_output_format::print_any_rules): ...this. (diagnostic_context::print_option_information): Convert to... (diagnostic_text_output_format::print_option_information): ...this. (diagnostic_context::report_diagnostic): Replace calls to the output format's on_begin_diagnostic, to pp_output_formatted_text, printing CWE, rules, option info, and the call to the format's on_end_diagnostic with a call to the format's on_report_diagnostic. (diagnostic_text_output_format::on_begin_diagnostic): Delete. (diagnostic_text_output_format::on_end_diagnostic): Delete. (diagnostic_text_output_format::on_report_diagnostic): New vfunc, which effectively does the on_begin_diagnostic, the call to pp_output_formatted_text, the calls for printing CWE, rules, option info, and the call to the diagnostic_finalizer. * diagnostic.h (diagnostic_output_format::on_begin_diagnostic): Delete. (diagnostic_output_format::on_end_diagnostic): Delete. (diagnostic_output_format::on_report_diagnostic): New. (diagnostic_text_output_format::on_begin_diagnostic): Delete. (diagnostic_text_output_format::on_end_diagnostic): Delete. (diagnostic_text_output_format::on_report_diagnostic): New. (class diagnostic_context): Add friend class diagnostic_text_output_format. (diagnostic_context::get_urlifier): New accessor. (diagnostic_context::print_any_cwe): Move decl... (diagnostic_text_output_format::print_any_cwe): ...to here. (diagnostic_context::print_any_rules): Move decl... (diagnostic_text_output_format::print_any_rules): ...to here. (diagnostic_context::print_option_information): Move decl... (diagnostic_text_output_format::print_option_information): ...to here. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
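The shape of this refactoring can be sketched with a much-simplified model. The classes below are hypothetical stand-ins (`diagnostic_sketch`, `output_format_sketch`, and so on) and do not reproduce GCC's real signatures; the point is only that the context makes a single virtual call and each format decides how to render CWE and option information.

```cpp
#include <string>

// Simplified, hypothetical model of a diagnostic; not GCC's diagnostic_info.
struct diagnostic_sketch {
  std::string message;
  int cwe = 0;         // 0 means "no CWE attached"
  std::string option;  // empty means "no controlling option"
};

// After the patch, each output format owns the whole final formatting step.
class output_format_sketch {
public:
  virtual ~output_format_sketch() = default;
  virtual std::string on_report_diagnostic(const diagnostic_sketch &d) = 0;
};

// The text format appends CWE and option information itself.
class text_format_sketch : public output_format_sketch {
public:
  std::string on_report_diagnostic(const diagnostic_sketch &d) override {
    std::string out = d.message;
    if (d.cwe != 0)
      out += " [CWE-" + std::to_string(d.cwe) + "]";
    if (!d.option.empty())
      out += " [" + d.option + "]";
    return out;
  }
};

// A machine-readable format ignores the textual decorations entirely.
// No string escaping is done here; this is an illustration only.
class json_format_sketch : public output_format_sketch {
public:
  std::string on_report_diagnostic(const diagnostic_sketch &d) override {
    return "{\"message\": \"" + d.message + "\"}";
  }
};

// The context no longer interleaves begin/print/end calls.
std::string report_diagnostic(output_format_sketch &fmt,
                              const diagnostic_sketch &d) {
  return fmt.on_report_diagnostic(d);
}
```

In this model the JSON format needs no flags to suppress textual CWE or option output, which mirrors the "drop unnecessary calls" entries in the ChangeLog.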
2024-08-26testsuite: add event IDs to multithreaded event plugin testDavid Malcolm4-20/+38
Add test coverage of "%@" in event messages in a multithreaded execution path. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-paths-multithreaded-inline-events.c: Update expected output. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py: Likewise. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-separate-events.c: Likewise. * gcc.dg/plugin/diagnostic_plugin_test_paths.c (test_diagnostic_path::add_event_2): Return the id of the added event. (test_diagnostic_path::add_event_2_with_event_id): New. (example_4): Add event IDs to the deadlock messages indicating where the locks were acquired. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26testsuite: generalize support for Python tests for SARIF outputDavid Malcolm8-16/+208
In r15-2354-g4d1f71d49e396c I added the ability to use Python to write tests of SARIF output via a new "run-sarif-pytest" based on "run-gcov-pytest", with a sarif.py support script in testsuite/gcc.dg/sarif-output. This followup patch: (a) removes the limitation of such tests needing to be in testsuite/gcc.dg/sarif-output by moving sarif.py to testsuite/lib and adding logic to add that directory to PYTHONPATH when invoking pytest. (b) uses this to replace fragile regexp-based tests in gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c with Python logic that verifies the structure within the generated JSON, and to add test coverage for SARIF output relating to GCC plugins. gcc/ChangeLog: * diagnostic-format-sarif.cc: Add comments noting that we don't yet capture any diagnostic_metadata::rules associated with a diagnostic. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-metadata-sarif.c: New test, based on diagnostic-test-metadata.c. * gcc.dg/plugin/diagnostic-test-metadata-sarif.py: New script. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c: Replace scan-sarif-file directives with run-sarif-pytest, to run... * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py: ...this new test. * gcc.dg/plugin/plugin.exp (plugin_test_list): Add diagnostic-test-metadata-sarif.c. * gcc.dg/sarif-output/sarif.py: Move to... * lib/sarif.py: ...here. * lib/scansarif.exp (run-sarif-pytest): Prepend "lib" to PYTHONPATH before running python scripts. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26pretty-print: fixes to selftestsDavid Malcolm1-4/+35
Add selftest coverage for %{ and %} in pretty-print.cc No functional change intended. gcc/ChangeLog: * pretty-print.cc (selftest::test_urls): Make static. (selftest::test_urls_from_braces): New. (selftest::test_null_urls): Make static. (selftest::test_urlification): Likewise. (selftest::pretty_print_cc_tests): Call test_urls_from_braces. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26json.h: fix typo in commentDavid Malcolm1-1/+1
gcc/ChangeLog: * json.h: Fix typo in comment about missing INCLUDE_MEMORY. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26c++: Check template parameters in member class template specialization [PR115716]Simon Martin3-0/+40
We currently ICE upon the following invalid code, because we don't check that the template parameters in a member class template specialization are correct. === cut here === template <typename T> struct x { template <typename U> struct y { typedef T result2; }; }; template<> template<typename U, typename> struct x<int>::y { typedef double result2; }; int main() { x<int>::y<int>::result2 xxx2; } === cut here === This patch fixes the PR by calling redeclare_class_template. PR c++/115716 gcc/cp/ChangeLog: * pt.cc (maybe_process_partial_specialization): Call redeclare_class_template. gcc/testsuite/ChangeLog: * g++.dg/template/spec42.C: New test. * g++.dg/template/spec43.C: New test.
2024-08-26Remove an unneeded include that was added by mistake.Andi Kleen1-1/+0
gcc/ChangeLog: * tree-if-conv.cc: Remove unneeded include from last change.
2024-08-26Fix bootstrap errors due to enabling -gvariable-location-viewsBernd Edlinger3-3/+3
This recent change triggered various bootstrap errors, mostly on x86 targets, because line info advance address entries were output in the wrong section table. The switch to the wrong line table happened in dwarf2out_set_ignored_loc. It must use the same section as the earlier called dwarf2out_switch_text_section. But ft32-elf was also affected, because the assembler choked on something as simple as ".2byte .LM2-.LM1". Fortunately it is able to use native location views; the configure test was just not executed because the ft32 "nop" instruction was missing. gcc/ChangeLog: PR debug/116470 * configure.ac: Add the "nop" instruction for cpu type ft32. * configure: Regenerate. * dwarf2out.cc (dwarf2out_set_ignored_loc): Use the correct line info section.
2024-08-26libcpp: deduplicate definition of padding sizeAlexander Monakov4-13/+11
Tie together the two functions that ensure tail padding with search_line_ssse3 via the CPP_BUFFER_PADDING macro. libcpp/ChangeLog: * internal.h (CPP_BUFFER_PADDING): New macro; use it ... * charset.cc (_cpp_convert_input): ...here, and ... * files.cc (read_file_guts): ...here, and ... * lex.cc (search_line_ssse3): ...here.
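Why a single shared padding constant helps can be shown with a small sketch. The names `BUFFER_PADDING_SKETCH`, `make_padded`, and `find_newline` are hypothetical, and the value 16 is an assumption chosen for illustration; the commit message does not state the real value of CPP_BUFFER_PADDING.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical value; the commit message does not give the real one.
#define BUFFER_PADDING_SKETCH 16

// Producer: always leaves BUFFER_PADDING_SKETCH zeroed bytes after the data.
std::vector<unsigned char> make_padded(const char *text) {
  std::size_t len = std::strlen(text);
  std::vector<unsigned char> buf(len + BUFFER_PADDING_SKETCH, 0);
  std::memcpy(buf.data(), text, len);
  return buf;
}

// Consumer: scans in fixed-size chunks, so it may read up to
// BUFFER_PADDING_SKETCH - 1 bytes past the logical end of the data.
// Returns the index of the first '\n', or len if there is none.
std::size_t find_newline(const std::vector<unsigned char> &buf,
                         std::size_t len) {
  for (std::size_t base = 0; base < len; base += BUFFER_PADDING_SKETCH)
    for (std::size_t i = 0; i < BUFFER_PADDING_SKETCH; ++i)
      if (buf[base + i] == '\n')
        return base + i;
  return len;
}
```

Because the producer and the chunked consumer use the same macro, changing the chunk width cannot silently make the over-read exceed the allocated padding.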
2024-08-26tree-optimization/116460 - improve forwprop compile-timeRichard Biener1-6/+7
The following improves forwprop block reachability, an issue I noticed when debugging PR116460 and one that is also noted in the comment. It avoids processing blocks in natural loops determined unreachable, thereby making the issue in PR116460 latent. PR tree-optimization/116460 * tree-ssa-forwprop.cc (pass_forwprop::execute): Do not process blocks in unreachable natural loops.
2024-08-26Delay edge removal in forwpropRichard Biener1-9/+25
SSA forwprop has switch simplification code that removes edges and, as a side effect, releases dominator info. For a followup we want to retain that info, so the following delays removing edges until the end of the pass. As usual we have to deal with parts of the edge vanishing due to EH/abnormal pruning, so record edges as basic-block index pairs and remove them only when they are still there. * tree-ssa-forwprop.cc (simplify_gimple_switch_label_vec): Delay removing edges and releasing dominator info, instead record into edges_to_remove vector. (simplify_gimple_switch): Pass through vector of edges to remove. (pass_forwprop::execute): Likewise. Remove queued edges.
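The deferred-removal pattern described in this commit can be sketched with a toy graph. `cfg_sketch`, `edge_sketch`, and `remove_queued_edges` are hypothetical names; GCC's real CFG data structures are different. The sketch only shows why edges are queued as block-index pairs and re-checked before removal.

```cpp
#include <set>
#include <utility>
#include <vector>

// An edge is identified by (source block index, destination block index),
// so a queued entry stays meaningful even if the edge object is gone.
using edge_sketch = std::pair<int, int>;

// Toy control-flow graph: just a set of edges.
struct cfg_sketch {
  std::set<edge_sketch> edges;
};

// Run at the end of the pass: remove each queued edge only if it still
// exists. Returns the number of edges actually removed.
int remove_queued_edges(cfg_sketch &cfg,
                        const std::vector<edge_sketch> &edges_to_remove) {
  int removed = 0;
  for (const edge_sketch &e : edges_to_remove)
    removed += static_cast<int>(cfg.edges.erase(e));  // erase returns 0 or 1
  return removed;
}
```

An edge that other cleanup already deleted is simply skipped, which is the "remove them only when they are still there" behaviour the message describes.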
2024-08-26vect: Fix STMT_VINFO_DEF_TYPE check for odd/even widen mult [PR116348]Xi Ruoyao2-2/+15
After fixing PR116142 some code started to trigger an ICE with -O3 -march=znver4. Per Richard Biener who actually made this fix: "supportable_widening_operation fails at transform time - that's likely because vectorizable_reduction "puns" defs to internal_def" so the check should use STMT_VINFO_REDUC_DEF instead of checking if STMT_VINFO_DEF_TYPE is vect_reduction_def. gcc/ChangeLog: PR tree-optimization/116348 * tree-vect-stmts.cc (supportable_widening_operation): Use STMT_VINFO_REDUC_DEF (x) instead of STMT_VINFO_DEF_TYPE (x) == vect_reduction_def. gcc/testsuite/ChangeLog: PR tree-optimization/116348 * gcc.c-torture/compile/pr116438.c: New test. Co-authored-by: Richard Biener <rguenther@suse.de>
2024-08-26Match: Add int type fits check for .SAT_ADD imm operandPan Li58-9/+443
This patch adds a strict check for the imm operand of .SAT_ADD matching. Previously there was no type checking for the imm operand, which could result in unexpected IL being caught by the .SAT_ADD pattern. We use int_fits_type_p here to make sure the imm operand is an integer that fits the result type of the .SAT_ADD. For example: Fits uint8_t: uint8_t a; uint8_t sum = .SAT_ADD (a, 12); uint8_t sum = .SAT_ADD (a, 12u); uint8_t sum = .SAT_ADD (a, 126u); uint8_t sum = .SAT_ADD (a, 128u); uint8_t sum = .SAT_ADD (a, 228); uint8_t sum = .SAT_ADD (a, 223u); Does not fit uint8_t: uint8_t a; uint8_t sum = .SAT_ADD (a, -1); uint8_t sum = .SAT_ADD (a, 256u); uint8_t sum = .SAT_ADD (a, 257); The following test suites pass with this patch: * The rv64gcv full regression test. * The x86 bootstrap test. * The x86 full regression test. gcc/ChangeLog: * match.pd: Add int_fits_type_p check for .SAT_ADD imm operand. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_u_add_imm-11.c: Adjust test case for imm. * gcc.target/riscv/sat_u_add_imm-12.c: Ditto. * gcc.target/riscv/sat_u_add_imm-15.c: Ditto. * gcc.target/riscv/sat_u_add_imm-16.c: Ditto. * gcc.target/riscv/sat_u_add_imm_type_check-1.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-10.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-11.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-12.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-13.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-14.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-15.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-16.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-17.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-18.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-19.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-2.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-20.c: New test. 
* gcc.target/riscv/sat_u_add_imm_type_check-21.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-22.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-23.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-24.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-25.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-26.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-27.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-28.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-29.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-3.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-30.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-31.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-32.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-33.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-34.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-35.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-36.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-37.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-38.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-39.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-4.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-40.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-41.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-42.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-43.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-44.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-45.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-46.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-47.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-48.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-49.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-5.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-50.c: New test. 
* gcc.target/riscv/sat_u_add_imm_type_check-51.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-52.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-6.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-7.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-8.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-9.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
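The fit check described in this commit can be mirrored in ordinary C++ for the uint8_t case. `fits_uint8` is a hypothetical helper written for illustration; it is not GCC's int_fits_type_p, which operates on tree nodes and arbitrary types.

```cpp
#include <cstdint>
#include <limits>

// Sketch of the idea for an unsigned 8-bit result type: the immediate,
// taken as a mathematical integer, must lie in [0, 255].
bool fits_uint8(long long imm) {
  return imm >= 0 && imm <= std::numeric_limits<std::uint8_t>::max();
}
```

Applied to the immediates listed in the commit message, 12, 126, 128, 228, and 223 are accepted, while -1, 256, and 257 are rejected.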
2024-08-26expand: Use the correct mode for store flags for popcount [PR116480]Andrew Pinski3-1/+18
When expanding popcount used for equal to 1 (or rather __builtin_stdc_has_single_bit), the wrong mode was being used for the mode of the store flags. We were using the mode of the argument to popcount but since popcount's return value is always int, the mode of the expansion here should have been the mode of the return type rather than the argument. Built and tested on aarch64-linux-gnu with no regressions. Also bootstrapped and tested on x86_64-linux-gnu. PR middle-end/116480 gcc/ChangeLog: * internal-fn.cc (expand_POPCOUNT): Use the correct mode for store flags. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr116480-1.c: New test. * gcc.dg/torture/pr116480-2.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
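The source-level pattern this commit concerns can be written as a short sketch. `has_single_bit_sketch` is a hypothetical helper; `__builtin_popcountll` is the GCC/Clang builtin, so the block assumes one of those compilers. The point it illustrates is that the argument is 64 bits wide while the count, and therefore the comparison, has type int.

```cpp
#include <cstdint>
#include <type_traits>

// The builtin returns int regardless of how wide its argument is.
static_assert(
    std::is_same<decltype(__builtin_popcountll(0ULL)), int>::value,
    "popcount result type is int");

// Returns 1 if exactly one bit of x is set, otherwise 0.
int has_single_bit_sketch(std::uint64_t x) {
  int count = __builtin_popcountll(x);  // int result from a 64-bit argument
  return count == 1;
}
```

A mismatch of the kind the commit fixes would arise if the comparison result were computed in the wider argument type instead of in int.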
2024-08-26i386: Add bf8 -> fp16 intrinHaochen Jiang4-5/+109
Since BF8 and FP16 have the same number of exponent bits, the type conversion between them is just a cast for the fraction part. We will use a sequence of instructions instead of new instructions to do that. For convenience, intrins are also provided. gcc/ChangeLog: * config/i386/avx10_2-512convertintrin.h (_mm512_cvtpbf8_ph): New. (_mm512_mask_cvtpbf8_ph): Ditto. (_mm512_maskz_cvtpbf8_ph): Ditto. * config/i386/avx10_2convertintrin.h (_mm_cvtpbf8_ph): Ditto. (_mm_mask_cvtpbf8_ph): Ditto. (_mm_maskz_cvtpbf8_ph): Ditto. (_mm256_cvtpbf8_ph): Ditto. (_mm256_mask_cvtpbf8_ph): Ditto. (_mm256_maskz_cvtpbf8_ph): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-convert-1.c: Add tests for new intrin. * gcc.target/i386/avx10_2-convert-1.c: Ditto.
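The "cast for the fraction part" can be sketched on raw bit patterns. This assumes BF8 means the E5M2 layout (1 sign bit, 5 exponent bits, 2 fraction bits) and FP16 is IEEE binary16 (1 sign, 5 exponent, 10 fraction); the commit message only says the exponent widths match. `bf8_bits_to_fp16_bits` is a hypothetical scalar helper and is not one of the new intrinsics.

```cpp
#include <cstdint>

// Assumption: BF8 is E5M2, so sign and exponent line up with binary16 and
// the 2 fraction bits become the top 2 of binary16's 10 fraction bits.
// Widening is then a left shift by 8, filling the low fraction bits with 0.
std::uint16_t bf8_bits_to_fp16_bits(std::uint8_t bf8) {
  return static_cast<std::uint16_t>(static_cast<std::uint16_t>(bf8) << 8);
}
```

Under that assumption the BF8 pattern 0x3C (1.0) maps to 0x3C00, which is 1.0 in binary16, and 0xC0 (-2.0) maps to 0xC000.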
2024-08-26AVX10.2: Support compare instructionsZhang, Jun4-27/+183
gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_ssecom_setcc): Mention behavior change on flags. (ix86_expand_sse_comi): Handle AVX10.2 behavior. (ix86_expand_sse_comi_round): Ditto. (ix86_expand_round_builtin): Ditto. (ix86_expand_builtin): Change function call. * config/i386/i386.md (UNSPEC_COMX): New unspec. * config/i386/sse.md (avx10_2_v<unord>comx<ssemodesuffix><round_saeonly_name>): New. (<sse>_<unord>comi<round_saeonly_name>): Add HFmode. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-compare-1.c: New test. Co-authored-by: Haochen Jiang <haochen.jiang@intel.com> Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
2024-08-26AVX10.2: Support vector copy instructionsZhang, Jun9-53/+356
gcc/ChangeLog: * config.gcc: Add avx10_2copyintrin.h. * config/i386/i386.md (avx10_2): New isa attribute. * config/i386/immintrin.h: Include avx10_2copyintrin.h. * config/i386/sse.md (sse_movss_<mode>): Add new constraints to handle AVX10.2. (vec_set<mode>_0): Ditto. (@vec_set<mode>_0): Ditto. (vec_set<mode>_0): Ditto. (avx512fp16_mov<mode>): Ditto. (*vec_set<mode>_0_1): New split. * config/i386/avx10_2copyintrin.h: New file. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-vmovd-1.c: New test. * gcc.target/i386/avx10_2-vmovd-2.c: Ditto. * gcc.target/i386/avx10_2-vmovw-1.c: Ditto. * gcc.target/i386/avx10_2-vmovw-2.c: Ditto.
2024-08-26AVX10.2: Support minmax instructionsMo, Zewei28-1/+2555
gcc/ChangeLog: * config.gcc: Add avx10_2-512minmaxintrin.h and avx10_2minmaxintrin.h. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (V8BF, V8BF, V8BF, INT, V8BF, UQI), (V16BF, V16BF, V16BF, INT, V16BF, UHI), (V32BF, V32BF, V32BF, INT, V32BF, USI), (V8HF, V8HF, V8HF, INT, V8HF, UQI), (V8DF, V8DF, V8DF, INT, V8DF, UQI, INT), (V32HF, V32HF, V32HF, INT, V32HF, USI, INT), (V16HF, V16HF, V16HF, INT, V16HF, UHI, INT), (V16SF, V16SF, V16SF, INT, V16SF, UHI, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle V8BF_FTYPE_V8BF_V8BF_INT_V8BF_UQI, V16BF_FTYPE_V16BF_V16BF_INT_V16BF_UHI, V32BF_FTYPE_V32BF_V32BF_INT_V32BF_USI, V8HF_FTYPE_V8HF_V8HF_INT_V8HF_UQI. (ix86_expand_round_builtin): Handle V8DF_FTYPE_V8DF_V8DF_INT_V8DF_UQI_INT, V32HF_FTYPE_V32HF_V32HF_INT_V32HF_USI_INT, V16HF_FTYPE_V16HF_V16HF_INT_V16HF_UHI_INT, V16SF_FTYPE_V16SF_V16SF_INT_V16SF_UHI_INT. * config/i386/immintrin.h: Include avx10_2-512minmaxintrin.h and avx10_2minmaxintrin.h. * config/i386/sse.md (VFH_AVX10_2): New. (avx10_2_vminmaxnepbf16_<mode><mask_name>): New define_insn. (avx10_2_minmaxp<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_minmaxs<mode><mask_scalar_name><round_saeonly_scalar_name>): Ditto. * config/i386/avx10_2-512minmaxintrin.h: New file. * config/i386/avx10_2minmaxintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx512f-helper.h: Add helper function. * gcc.target/i386/avx10-minmax-helper.h: New helper file. * gcc.target/i386/avx10_2-512-minmax-1.c: New test. * gcc.target/i386/avx10_2-512-vminmaxnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxps-2.c: Ditto. 
* gcc.target/i386/avx10_2-minmax-1.c: Ditto. * gcc.target/i386/avx10_2-vminmaxnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsh-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxss-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto. Co-authored-by: Hu, Lin1 <lin1.hu@intel.com> Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
2024-08-26[PATCH 2/2] AVX10.2: Support saturating convert instructionsHu, Lin131-1/+2830
gcc/ChangeLog: * config/i386/avx10_2-512satcvtintrin.h: Add new intrin. * config/i386/avx10_2satcvtintrin.h: Ditto. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md (VF1_VF2_AVX10_2): New iterator. (VF2_AVX10_2): Ditto. (VI8_AVX10_2): Ditto. (sat_cvt_sign_prefix): Add new UNSPEC. (UNSPEC_SAT_CVT_DS_SIGN_ITER): New iterator. (pd2dqssuff): Ditto. (avx10_2_vcvtt<castmode>2<sat_cvt_sign_prefix>dqs<mode><mask_name><round_saeonly_name>): New. (avx10_2_vcvttpd2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_vcvttps2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_vcvttsd2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>): Ditto. (avx10_2_vcvttss2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-satcvt-1.c: Add test. * gcc.target/i386/avx10_2-512-satcvt-1.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2dqs-2.c: New test. * gcc.target/i386/avx10_2-512-vcvttpd2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttsd2sis-2.c: Ditto. 
* gcc.target/i386/avx10_2-vcvttsd2usis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2sis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2usis-2.c: Ditto.
2024-08-26[PATCH 1/2] AVX10.2: Support saturating convert instructionsHu, Lin140-1/+3327
gcc/ChangeLog: * config.gcc: Add avx10_2satcvtintrin.h and avx10_2-512satcvtintrin.h. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (V8HI, V8BF, V8HI, UQI), (V16HI, V16BF, V16HI, UHI), (V32HI, V32BF, V32HI, USI), (V16SI, V16SF, V16SI, UHI, INT), (V16HI, V16BF, V16HI, UHI, INT), (V32HI, V32BF, V32HI, USI, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle V32HI_FTYPE_V32BF_V32HI_USI, V16HI_FTYPE_V16BF_V16HI_UHI, V8HI_FTYPE_V8BF_V8HI_UQI. (ix86_expand_round_builtin): Handle V32HI_FTYPE_V32BF_V32HI_USI_INT, V16SI_FTYPE_V16SF_V16SI_UHI_INT, V16HI_FTYPE_V16BF_V16HI_UHI_INT. * config/i386/immintrin.h: Include avx10_2satcvtintrin.h and avx10_2-512satcvtintrin.h. * config/i386/sse.md (UNSPEC_CVTNE_BF16_IBS_ITER): New iterator. (sat_cvt_sign_prefix): Ditto. (sat_cvt_trunc_prefix): Ditto. (UNSPEC_CVT_PH_IBS_ITER): Ditto. (UNSPEC_CVTT_PH_IBS_ITER): Ditto. (UNSPEC_CVT_PS_IBS_ITER): Ditto. (UNSPEC_CVTT_PS_IBS_ITER): Ditto. (avx10_2_cvt<sat_cvt_trunc_prefix>nebf162i<sat_cvt_sign_prefix>bs<mode><mask_name>): New define_insn. (avx10_2_cvtph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>): Ditto. (avx10_2_cvttph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_cvtps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>): Ditto. (avx10_2_cvttps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>): Ditto. * config/i386/avx10_2-512satcvtintrin.h: New file. * config/i386/avx10_2satcvtintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx512f-helper.h: Add new test macro. * gcc.target/i386/m512-check.h: Add new type. * gcc.target/i386/avx10_2-512-satcvt-1.c: New test. * gcc.target/i386/avx10_2-512-vcvtnebf162ibs-2.c: Ditto. 
* gcc.target/i386/avx10_2-512-vcvtnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtps2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-satcvt-1.c: Ditto. * gcc.target/i386/avx10_2-vcvtnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2iubs-2.c: Ditto.
2024-08-26[PATCH 2/2] AVX10.2: Support BF16 instructionskonglin135-2/+2096
gcc/ChangeLog: * config/i386/avx10_2-512bf16intrin.h: Add new intrinsics. * config/i386/avx10_2bf16intrin.h: Ditto. * config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE for new type. * config/i386/i386-builtin.def (BDESC): Add new builtin. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle new type. * config/i386/sse.md (vecmemsuffix): Add vector BF mode. (avx10_2_rsqrtpbf16_<mode><mask_name>): New define_insn. (avx10_2_sqrtnepbf16_<mode><mask_name>): Ditto. (avx10_2_rcppbf16_<mode><mask_name>): Ditto. (avx10_2_getexppbf16_<mode><mask_name>): Ditto. (BF16IMMOP): New iterator. (bf16immop): Ditto. (avx10_2_<bf16immop>pbf16_<mode><mask_name>): New define_insn. (avx10_2_fpclasspbf16_<mode><mask_scalar_merge_name>): Ditto. (avx10_2_cmppbf16_<mode><mask_scalar_merge_name>): Ditto. (avx10_2_comsbf16_v8bf): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10-check.h: Add AVX10_SCALAR. * gcc.target/i386/avx10-helper.h: Add helper functions. * gcc.target/i386/avx10_2-512-bf16-1.c: Add new tests. * gcc.target/i386/avx10_2-bf16-1.c: Ditto. * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-vcmppbf16-2.c: New test. * gcc.target/i386/avx10_2-512-vfpclasspbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vgetexppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vgetmantpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vrcppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vreducenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vrndscalenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vrsqrtpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vsqrtnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vcmppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vcomsbf16-1.c: Ditto. * gcc.target/i386/avx10_2-vcomsbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfpclasspbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vgetexppbf16-2.c: Ditto. 
* gcc.target/i386/avx10_2-vgetmantpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrcppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vreducenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrndscalenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrsqrtpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vsqrtnepbf16-2.c: Ditto. Co-authored-by: Levy Hsu <admin@levyhsu.com>
2024-08-26[PATCH 1/2] AVX10.2: Support BF16 instructionskonglin135-2/+2514
gcc/ChangeLog: * config.gcc: Add avx10_2-512bf16intrin.h and avx10_2bf16intrin.h. * config/i386/i386-builtin-types.def : Add new DEF_FUNCTION_TYPE for V32BF_FTYPE_V32BF_V32BF, V16BF_FTYPE_V16BF_V16BF, V8BF_FTYPE_V8BF_V8BF, V8BF_FTYPE_V8BF_V8BF_UQI, V16BF_FTYPE_V16BF_V16BF_UHI, V32BF_FTYPE_V32BF_V32BF_USI, V32BF_FTYPE_V32BF_V32BF_V32BF_USI, V8BF_FTYPE_V8BF_V8BF_V8BF_UQI and V16BF_FTYPE_V16BF_V16BF_V16BF_UHI. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle new DEF_FUNCTION_TYPE. * config/i386/immintrin.h: Include avx10_2-512bf16intrin.h and avx10_2bf16intrin.h. * config/i386/sse.md (VBF_AVX10_2): New iterator. (avx10_2_scalefpbf16_<mode><mask_name>): New define_insn. (avx10_2_<code>nepbf16_<mode><mask_name>): Ditto. (avx10_2_<insn>nepbf16_<mode><mask_name>): Ditto. (avx10_2_fmaddnepbf16_<mode>_maskz): New expander. (avx10_2_fnmaddnepbf16_<mode>_maskz): Ditto. (avx10_2_fmsubnepbf16_<mode>_maskz): Ditto. (avx10_2_fnmsubnepbf16_<mode>_maskz): Ditto. (avx10_2_fmaddnepbf16_<mode><sd_maskz_name>): New define_insn. (avx10_2_fmaddnepbf16_<mode>_mask): Ditto. (avx10_2_fmaddnepbf16_<mode>_mask3): Ditto. (avx10_2_fnmaddnepbf16_<mode><sd_maskz_name>): Ditto. (avx10_2_fnmaddnepbf16_<mode>_mask): Ditto. (avx10_2_fnmaddnepbf16_<mode>_mask3): Ditto. (avx10_2_fmsubnepbf16_<mode><sd_maskz_name>): Ditto. (avx10_2_fmsubnepbf16_<mode>_mask): Ditto. (avx10_2_fmsubnepbf16_<mode>_mask3): Ditto. (avx10_2_fnmsubnepbf16_<mode><sd_maskz_name>): Ditto. (avx10_2_fnmsubnepbf16_<mode>_mask): Ditto. (avx10_2_fnmsubnepbf16_<mode>_mask3): Ditto. * config/i386/avx10_2-512bf16intrin.h: New file. * config/i386/avx10_2bf16intrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512f-helper.h: Add MAKE_MASK_MERGE and MAKE_MASK_ZERO for bf16_uw. * gcc.target/i386/m512-check.h: Add union512bf16_uw, union256bf16_uw, union128bf16_uw and CHECK_EXP for them. * gcc.target/i386/avx10-helper.h: New file. 
* gcc.target/i386/avx10_2-512-bf16-1.c: New test. * gcc.target/i386/avx10_2-512-vaddnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vdivnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfnmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfnmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vmaxpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vscalefpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vsubnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-bf16-1.c: Ditto. * gcc.target/i386/avx10_2-vaddnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vdivnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfnmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfnmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vmaxpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vmulnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vscalefpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vsubnepbf16-2.c: Ditto. Co-authored-by: Levy Hsu <admin@levyhsu.com>
2024-08-26AVX10.2: Support convert instructionsLevy Hsu45-5/+3511
gcc/ChangeLog: * config.gcc: Add avx10_2-512convertintrin.h and avx10_2convertintrin.h. * config/i386/i386-builtin-types.def: Add new DEF_POINTER_TYPE and DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle AVX10.2. (ix86_expand_round_builtin): Ditto. * config/i386/immintrin.h: Include avx10_2-512convertintrin.h, avx10_2convertintrin.h. * config/i386/sse.md (VHF_AVX10_2): New iterator. (bf16_ph): Add 512 bit mode. (avx10_2_cvt2ps2phx_<mode><mask_name<round_name>): New define_insn. (ssebvecmode): New iterator. (UNSPEC_NECONVERTFP8_PACK): Ditto. (neconvertfp8_pack): Ditto. (vcvt<neconvertfp8_pack><mode><mask_name>): New define_insn. (ssebvecmode_2): New iterator. (UNSPEC_VCVTBIASPH2FP8_PACK): Ditto. (biasph2fp8_pack): Ditto. (vcvt<biasph2fp8_pack>v8hf): New expander. (vcvt<biasph2fp8_pack>v8hf_mask): Ditto. (*vcvt<biasph2bf8_pack>v8hf): New define_insn. (*vcvt<biasph2fp8_pack>v8hf_mask): Ditto. (VHF_AVX10_2_2): New iterator. (vcvt<biasph2fp8_pack><mode><mask_name>): New define_insn. (VHF_256_512): New iterator. (ph2fp8suff): Ditto. (UNSPEC_NECONVERTPH2FP8_PACK): Ditto. (neconvertph2fp8): Ditto. (vcvt<neconvertph2fp8>v8hf_mask): New expander. (*vcvt<neconvertph2fp8>v8hf): New define_insn. (*vcvt<neconvertph2fp8>v8hf_mask): Ditto. (vcvt<neconvertph2fp8><mode><mask_name>): Ditto. (vcvthf82ph<mode><mask_name>): Ditto. * config/i386/avx10_2-512convertintrin.h: New file. * config/i386/avx10_2convertintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros for const. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-convert-1.c: New test. * gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto. 
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvthf82ph-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-convert-1.c: Ditto. * gcc.target/i386/avx10_2-vcvt2ps2phx-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvthf82ph-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2hf8s-2.c: Ditto. * gcc.target/i386/fp8-helper.h: New helper file. Co-authored-by: Levy Hsu <admin@levyhsu.com> Co-authored-by: Kong Lingling <lingling.kong@intel.com>
2024-08-26[PATCH 2/2] AVX10.2: Support media instructionsHaochen Jiang32-35/+1953
gcc/ChangeLog: * config/i386/avx10_2-512mediaintrin.h: Add new intrins. * config/i386/avx10_2mediaintrin.h: Ditto. * config/i386/i386-builtin.def: Add new builtins. * config/i386/i386-builtins.cc (def_builtin): Handle shared builtins between AVXVNNIINT16 and AVX10.2. * config/i386/i386-expand.cc (ix86_check_builtin_isa_match): Ditto. * config/i386/sse.md (unspec): Add UNSPEC_VDPPHPS. (avx10_2_mpsadbw<mask_name>): New define_insn. (<mask_codefor><sse4_1_avx2>_mpsadbw<mask_name>): Ditto. (vpdp<vpdpwprodtype>_<mode>): Add AVX10_2_256. (vpdp<vpdpwprodtype>_v16si): New define_insn. (vpdp<vpdpwprodtype>_<mode>_mask): Ditto. (*vpdp<vpdpwprodtype>_<mode>_maskz): Ditto. (vpdp<vpdpwprodtype>_<mode>_maskz): New expander. (vdpphps_<mode>): New define_insn. (vdpphps_<mode>_mask): Ditto. (*vdpphps_<mode>_maskz): Ditto. (vdpphps_<mode>_maskz): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/avxvnniint16-1.c: Add new macro test. * gcc.target/i386/avx-1.c: Ditto. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-media-1.c: Add test. * gcc.target/i386/avx10_2-media-1.c: Ditto. * gcc.target/i386/avxvnniint16-builtin.c: New test. * gcc.target/i386/avx10_2-512-vdpphps-2.c: Ditto. * gcc.target/i386/avx10_2-512-vmpsadbw-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwsud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwsuds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwusd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwusds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwuud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwuuds-2.c: Ditto. * gcc.target/i386/avx10_2-builtin-2.c: Ditto. * gcc.target/i386/avx10_2-vdpphps-2.c: Ditto. * gcc.target/i386/avx10_2-vmpsadbw-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwsud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwsuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwusd-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwusds-2.c: Ditto. 
* gcc.target/i386/avx10_2-vpdpwuud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwuuds-2.c: Ditto. Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>
2024-08-26[PATCH 1/2] AVX10.2: Support media instructionsHongyu Wang30-24/+1577
gcc/ChangeLog: * config.gcc: Add avx10_2mediaintrin.h and avx10_2-512mediaintrin.h. * config/i386/i386-builtin.def: Add new builtins. * config/i386/i386-builtins.cc (def_builtin): Handle shared builtins between AVXVNNIINT8 and AVX10.2. * config/i386/i386-expand.cc (ix86_check_builtin_isa_match): Ditto. * config/i386/immintrin.h: Include avx10_2mediaintrin.h and avx10_2-512mediaintrin.h. * config/i386/sse.md (VI4_AVX10_2): New. (vpdp<vpdotprodtype>_<mode>): Add AVX10_2_256. (vpdp<vpdotprodtype>_v16si): New define_insn. (vpdp<vpdotprodtype>_<mode>_mask): Ditto. (*vpdp<vpdotprodtype>_<mode>_maskz): Ditto. (vpdp<vpdotprodtype>_<mode>_maskz): New expander. * config/i386/avx10_2-512mediaintrin.h: New file. * config/i386/avx10_2mediaintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512f-helper.h: Reuse AVX512F macros for AVX10. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * lib/target-supports.exp (check_effective_target_avx10_2): New. (check_effective_target_avx10_2_512): Ditto. * gcc.target/i386/avx10-check.h: New test file. * gcc.target/i386/avx10-helper.h: Ditto. * gcc.target/i386/avx10_2-builtin-1.c: Ditto. * gcc.target/i386/avx10_2-512-media-1.c: Ditto. * gcc.target/i386/avx10_2-media-1.c: Ditto. * gcc.target/i386/avxvnniint8-builtin.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbssd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbssds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbsud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbsuds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbuud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbuuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbssd-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbssds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbsud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbsuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbuud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbuuds-2.c: Ditto. Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
2024-08-26i386: Refactor m512-check.hHaochen Jiang1-31/+35
After the introduction of AVX10, we still want to use the AVX512 helper functions to avoid duplicating code. To reuse them, we need to refactor so that each function is defined under the correct ISA, which avoids ABI warnings. gcc/testsuite/ChangeLog: * gcc.target/i386/m512-check.h: Wrap the function definitions with the correct vector size.
2024-08-26RISC-V: Support IMM for operand 0 of ussub patternPan Li17-2/+477
This patch would like to allow IMM for the operand 0 of ussub pattern. Aka .SAT_SUB(1023, y) as the below example. Form 1: #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return (T)IMM >= y ? (T)IMM - y : 0; \ } DEF_SAT_U_SUB_IMM_FMT_1(uint64_t, 1023) Before this patch: 10 │ sat_u_sub_imm82_uint64_t_fmt_1: 11 │ li a5,82 12 │ bgtu a0,a5,.L3 13 │ sub a0,a5,a0 14 │ ret 15 │ .L3: 16 │ li a0,0 17 │ ret After this patch: 10 │ sat_u_sub_imm82_uint64_t_fmt_1: 11 │ li a5,82 12 │ sltu a4,a5,a0 13 │ addi a4,a4,-1 14 │ sub a0,a5,a0 15 │ and a0,a4,a0 16 │ ret The below test suites are passed for this patch: 1. The rv64gcv fully regression test. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_gen_unsigned_xmode_reg): Add new func impl to gen xmode rtx reg from operand rtx. (riscv_expand_ussub): Gen xmode reg for operand 1. * config/riscv/riscv.md: Allow const_int for operand 1. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macro. * gcc.target/riscv/sat_u_sub_imm-1.c: New test. * gcc.target/riscv/sat_u_sub_imm-1_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-1_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-2.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-3.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-4.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-1.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-2.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-3.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-4.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
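The two instruction sequences quoted above can be read as two ways of computing the same saturating subtraction. The following is a minimal C sketch of that reading, assuming IMM = 82 as in the quoted assembly; the function names and the self-check helper are hypothetical and are not part of the patch or its tests.

```c
#include <assert.h>
#include <stdint.h>

/* Form 1 from the commit message, written out for IMM = 82
   (the value used in the quoted assembly).  */
uint64_t
sat_u_sub_imm82_branchy (uint64_t y)
{
  return (uint64_t) 82 >= y ? (uint64_t) 82 - y : 0;
}

/* Branch-free equivalent; each statement is annotated with the
   instruction of the "after this patch" listing it corresponds to.  */
uint64_t
sat_u_sub_imm82_branchless (uint64_t y)
{
  const uint64_t imm = 82;        /* li   a5,82     */
  uint64_t underflow = imm < y;   /* sltu a4,a5,a0  */
  uint64_t mask = underflow - 1;  /* addi a4,a4,-1: all ones if no underflow, else 0 */
  uint64_t diff = imm - y;        /* sub  a0,a5,a0: wraps around on underflow */
  return diff & mask;             /* and  a0,a4,a0  */
}

/* Hypothetical self-check: both forms agree on a range of inputs.  */
void
check_sat_u_sub_imm82 (void)
{
  for (uint64_t y = 0; y < 200; y++)
    assert (sat_u_sub_imm82_branchy (y) == sat_u_sub_imm82_branchless (y));
  assert (sat_u_sub_imm82_branchless (UINT64_MAX) == 0);
}
```

The mask trick replaces the conditional branch with a compare, a decrement and an AND, which is what removes the `bgtu` from the generated code.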
2024-08-26RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 4Pan Li13-0/+236
This patch would like to add test cases for the unsigned vector .SAT_TRUNC form 4. Aka: Form 4: #define DEF_VEC_SAT_U_TRUNC_FMT_4(NT, WT) \ void __attribute__((noinline)) \ vec_sat_u_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ bool not_overflow = in[i] <= (WT)(NT)(-1); \ out[i] = ((NT)in[i]) | (NT)((NT)not_overflow - 1); \ } \ } DEF_VEC_SAT_U_TRUNC_FMT_4 (uint32_t, uint64_t) The below test is passed for this patch. * The rv64gcv regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-19.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-20.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-21.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-22.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-23.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-24.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-19.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-20.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-21.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-22.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-23.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-24.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
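For illustration, the form 4 loop above can be instantiated by hand for NT = uint32_t and WT = uint64_t and compared against a plain clamp. This is only a sketch: the function names, the `const` qualifier on the input and the self-check helper are assumptions made here, not taken from the testsuite.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form 4 expanded by hand for NT = uint32_t, WT = uint64_t.  */
void
vec_sat_u_trunc_u32_u64 (uint32_t *out, const uint64_t *in, unsigned limit)
{
  for (unsigned i = 0; i < limit; i++)
    {
      bool not_overflow = in[i] <= (uint64_t) (uint32_t) (-1);
      /* (uint32_t) not_overflow - 1 is 0 when the value fits and all
         ones when it does not, so the OR saturates to UINT32_MAX.  */
      out[i] = ((uint32_t) in[i]) | (uint32_t) ((uint32_t) not_overflow - 1);
    }
}

/* Reference semantics: clamp to the maximum of the narrow type.  */
uint32_t
sat_u_trunc_ref (uint64_t x)
{
  return x > UINT32_MAX ? UINT32_MAX : (uint32_t) x;
}

/* Hypothetical self-check over boundary values.  */
void
check_vec_sat_u_trunc (void)
{
  const uint64_t in[5] = { 0, 1, UINT32_MAX, (uint64_t) UINT32_MAX + 1,
                           UINT64_MAX };
  uint32_t out[5];
  vec_sat_u_trunc_u32_u64 (out, in, 5);
  for (unsigned i = 0; i < 5; i++)
    assert (out[i] == sat_u_trunc_ref (in[i]));
}
```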
2024-08-26RISC-V: Add testcases for unsigned scalar .SAT_TRUNC form 4Pan Li13-0/+218
This patch would like to add test cases for the unsigned scalar quad and oct .SAT_TRUNC form 4. Aka: Form 4: #define DEF_SAT_U_TRUNC_FMT_4(NT, WT) \ NT __attribute__((noinline)) \ sat_u_trunc_##WT##_to_##NT##_fmt_4 (WT x) \ { \ bool not_overflow = x <= (WT)(NT)(-1); \ return ((NT)x) | (NT)((NT)not_overflow - 1); \ } The below test is passed for this patch. * The rv64gcv regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_u_trunc-19.c: New test. * gcc.target/riscv/sat_u_trunc-20.c: New test. * gcc.target/riscv/sat_u_trunc-21.c: New test. * gcc.target/riscv/sat_u_trunc-22.c: New test. * gcc.target/riscv/sat_u_trunc-23.c: New test. * gcc.target/riscv/sat_u_trunc-24.c: New test. * gcc.target/riscv/sat_u_trunc-run-19.c: New test. * gcc.target/riscv/sat_u_trunc-run-20.c: New test. * gcc.target/riscv/sat_u_trunc-run-21.c: New test. * gcc.target/riscv/sat_u_trunc-run-22.c: New test. * gcc.target/riscv/sat_u_trunc-run-23.c: New test. * gcc.target/riscv/sat_u_trunc-run-24.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
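As a sketch of what the scalar macro expands to, the block below instantiates it for two type pairs. Reading "quad" and "oct" as 4x and 8x narrowing (uint32_t to uint8_t and uint64_t to uint8_t) is an assumption made here, and the self-check helper is hypothetical; `__attribute__((noinline))` is a GCC/Clang extension kept from the original macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The form 4 macro as quoted in the commit message.  */
#define DEF_SAT_U_TRUNC_FMT_4(NT, WT)             \
NT __attribute__((noinline))                      \
sat_u_trunc_##WT##_to_##NT##_fmt_4 (WT x)         \
{                                                 \
  bool not_overflow = x <= (WT)(NT)(-1);          \
  return ((NT)x) | (NT)((NT)not_overflow - 1);    \
}

/* Assumed instantiations: 4x and 8x narrowing to uint8_t.  */
DEF_SAT_U_TRUNC_FMT_4 (uint8_t, uint32_t)
DEF_SAT_U_TRUNC_FMT_4 (uint8_t, uint64_t)

/* Hypothetical self-check: values that fit pass through unchanged,
   larger values saturate to UINT8_MAX.  */
void
check_sat_u_trunc_fmt_4 (void)
{
  assert (sat_u_trunc_uint32_t_to_uint8_t_fmt_4 (7) == 7);
  assert (sat_u_trunc_uint32_t_to_uint8_t_fmt_4 (255) == 255);
  assert (sat_u_trunc_uint32_t_to_uint8_t_fmt_4 (256) == 255);
  assert (sat_u_trunc_uint64_t_to_uint8_t_fmt_4 (UINT64_MAX) == 255);
}
```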
2024-08-26Daily bump.GCC Administrator3-1/+143
2024-08-25RISC-V: Fix double mode under RV32 not utilize vfdemin.han33-68/+69
Currently, some binops of a vector with a double scalar under RV32 can't be translated to the vf form and instead become vfmv+vxx.vv. The cause is that vec_duplicate is also expanded to a broadcast for double mode under RV32, and late-combine can't process the expanded broadcast. gcc/ChangeLog: * config/riscv/vector.md: Add !FLOAT_MODE_P constraint. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Fix test. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto. 
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: Ditto.
2024-08-25[PATCH] Re-add calling emit_clobber in lower-subreg.cc's resolve_simple_move.Xianmiao Qu2-0/+19
The previous patch: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d8a6945c6ea22efa4d5e42fe1922d2b27953c8cd aimed to eliminate redundant MOV instructions by removing the call to emit_clobber in lower-subreg.cc's resolve_simple_move. First, I found that another patch addresses this issue: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=bdf2737cda53a83332db1a1a021653447b05a7e7 and even without removing the call to emit_clobber, the instruction generation is still as expected. Second, removing the CLOBBER expression has side effects. When there is no CLOBBER expression and only SUBREG assignments exist, according to the logic of the 'df_lr_bb_local_compute' function, the register will be added to the basic block LR IN set. This will cause the register's lifetime to span the entire function, resulting in increased register pressure. Taking the newly added test case 'gcc/testsuite/gcc.target/riscv/pr43644.c' as an example, removing the CLOBBER expression causes some registers to be spilled. gcc/: * lower-subreg.cc (resolve_simple_move): Re-add the call to emit_clobber immediately before moving a multi-word register by parts. gcc/testsuite/: * gcc.target/riscv/pr43644.c: New test case.
2024-08-25testsuite: Run array54.C only for sync_int_long targetsDimitar Dimitrov1-0/+1
The test case uses "atomic<int>", which fails to link on pru-unknown-elf target due to missing __atomic_load_4 symbol. Fix by filtering for sync_int_long effective target. Ensured that the test still passes for x86_64-pc-linux-gnu. gcc/testsuite/ChangeLog: * g++.dg/init/array54.C: Require sync_int_long effective target. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
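As a rough C analogue of the situation described above (the actual test is C++ and uses `atomic<int>`), the sketch below performs a 4-byte atomic load. On targets without native support, GCC lowers such an access to a call to `__atomic_load_4`, which is the symbol the commit message reports as missing. The directive in the leading comment is the usual DejaGnu form for requiring an effective target; that it matches the line added to array54.C is an assumption, and the function name is hypothetical.

```c
/* { dg-require-effective-target sync_int_long } */
#include <assert.h>
#include <stdatomic.h>

/* Zero-initialized 4-byte atomic object.  */
static atomic_int counter;

/* Increment the counter atomically and read it back.  The load is
   the kind of access that needs either native support or a libatomic
   helper such as __atomic_load_4.  */
int
bump_and_read (void)
{
  atomic_fetch_add (&counter, 1);
  return atomic_load (&counter);
}
```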
2024-08-25Support if conversion for switchesAndi Kleen5-6/+270
The gimple-if-to-switch pass converts if statements with multiple equality checks on the same value to a switch. This breaks vectorization, which cannot handle switches. Teach the tree-if-conv pass used by the vectorizer to handle simple switch statements, like those created by if-to-switch earlier. These are switches that have only a single non-default block. They are handled similarly to COND in if conversion. This makes the vect-bitfield-read-1-not test fail. The test checks for a bitfield analysis failing, but it actually relied on ifcvt erroring out early because the test is using a switch. The if conversion still does not work because the switch is not in a form that this patch can handle, but it fails much later and the bitfield analysis succeeds, which makes the test fail. I marked it xfail because it doesn't seem to be testing what it wants to test. PR tree-optimization/115866 gcc/ChangeLog: * tree-if-conv.cc (if_convertible_switch_p): New function. (if_convertible_stmt_p): Check for switch. (get_loop_body_in_if_conv_order): Handle switch. (predicate_bbs): Likewise. (predicate_statements): Likewise. (remove_conditions_and_labels): Likewise. (ifcvt_split_critical_edges): Likewise. (ifcvt_local_dce): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-switch-ifcvt-1.c: New test. * gcc.dg/vect/vect-switch-ifcvt-2.c: New test. * gcc.dg/vect/vect-switch-search-line-fast.c: New test. * gcc.dg/vect/vect-bitfield-read-1-not.c: Change to xfail.
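A minimal sketch of the loop shape this commit message describes: a chain of equality checks on one value, and the equivalent switch with several case labels that all lead to a single non-default block. The function names are hypothetical and the code is not taken from the new tests; whether a given GCC build actually vectorizes either loop depends on the target and options.

```c
#include <assert.h>
#include <stddef.h>

/* The kind of source an if-to-switch pass may rewrite: several
   equality checks against the same value.  */
size_t
count_line_breaks_if (const char *buf, size_t len)
{
  size_t count = 0;
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\n' || buf[i] == '\r')
      count++;
  return count;
}

/* The switch form: multiple case labels, one non-default block.  */
size_t
count_line_breaks_switch (const char *buf, size_t len)
{
  size_t count = 0;
  for (size_t i = 0; i < len; i++)
    switch (buf[i])
      {
      case '\n':
      case '\r':
        count++;
        break;
      default:
        break;
      }
  return count;
}
```

Both functions compute the same result; the point of the patch, as described, is that the second form no longer blocks if conversion.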
2024-08-25Write CodeView information about static locals in optimized codeMark Harmstone1-0/+57
Write CodeView S_LDATA32 symbols for static locals in optimized code. We have to handle these separately, as they come after the S_FRAMEPROC, plus you can't have S_BLOCK32 symbols like you can in unoptimized code. gcc/ * dwarf2codeview.cc (write_optimized_static_local_vars): New function. (write_function): Call write_optimized_static_local_vars.
2024-08-25Write CodeView S_FRAMEPROC symbolsMark Harmstone1-2/+78
Write S_FRAMEPROC symbols, which aren't very useful but seem to be necessary for Microsoft debuggers to function properly. These symbols come after S_LOCAL symbols for optimized variables, but before S_REGISTER and S_REGREL32 for unoptimized variables. gcc/ * dwarf2codeview.cc (enum cv_sym_type): Add S_FRAMEPROC. (write_s_frameproc): New function. (write_function): Call write_s_frameproc.
2024-08-25Write CodeView information about optimized stack variablesMark Harmstone1-9/+119
Outputs S_DEFRANGE_REGISTER_REL symbols for optimized local variables that are on the stack, consisting of the stack register, the offset, and the code range for which this applies. gcc/ * dwarf2codeview.cc (enum cv_sym_type): Add S_DEFRANGE_REGISTER_REL. (write_defrange_register_rel): New function. (write_optimized_local_variable_loc): Add fbloc param, and call write_defrange_register_rel. (write_optimized_local_variable): Add fbloc param. (write_optimized_function_vars): Add fbloc param.
2024-08-25Write CodeView information about enregistered optimized variablesMark Harmstone4-39/+353
Enable variable tracking when outputting CodeView debug information, and make it so that we issue debug symbols for optimized variables in registers. This consists of S_LOCAL symbols, which give the name and the type of local variables, followed by S_DEFRANGE_REGISTER symbols for the register and the code range for which this applies. gcc/ * dwarf2codeview.cc (enum cv_sym_type): Add S_LOCAL and S_DEFRANGE_REGISTER. (write_s_local): New function. (write_defrange_register): New function. (write_optimized_local_variable_loc): New function. (write_optimized_local_variable): New function. (write_optimized_function_vars): New function. (write_function): Call write_optimized_function_vars if variable tracking enabled. * dwarf2out.cc (typedef var_loc_view): Move to dwarf2out.h. (struct dw_loc_list_struct): Likewise. * dwarf2out.h (typedef var_loc_view): Move from dwarf2out.cc. (struct dw_loc_list_struct): Likewise. * opts.cc (finish_options): Enable variable tracking for CodeView.
2024-08-25i386: Update STV's gains for TImode arithmetic right shifts on AVX2.Roger Sayle1-8/+13
This patch tweaks timode_scalar_chain::compute_convert_gain to better reflect the expansion of V1TImode arithmetic right shifts by the i386 backend. The comment "see ix86_expand_v1ti_ashiftrt" appears after "case ASHIFTRT" in compute_convert_gain, and the changes below attempt to better match the logic used there. The original motivating example is: __int128 m1; void foo() { m1 = (m1 << 8) >> 8; } which with -O2 -mavx2 we fail to convert to vector form due to the inappropriate cost of the arithmetic right shift. Instruction gain -16 for 7: {r103:TI=r101:TI>>0x8;clobber flags:CC;} Total gain: -3 Chain #1 conversion is not profitable This is reporting that the ASHIFTRT is four instructions worse using vectors than in scalar form, which is incorrect as the AVX2 expansion of this shift only requires three instructions (and the scalar form requires two). With more accurate costs in timode_scalar_chain::compute_convert_gain we now see (with -O2 -mavx2): Instruction gain -4 for 7: {r103:TI=r101:TI>>0x8;clobber flags:CC;} Total gain: 9 Converting chain #1... which results in: foo: vmovdqa m1(%rip), %xmm0 vpslldq $1, %xmm0, %xmm0 vpsrad $8, %xmm0, %xmm1 vpsrldq $1, %xmm0, %xmm0 vpblendd $7, %xmm0, %xmm1, %xmm0 vmovdqa %xmm0, m1(%rip) ret 2024-08-25 Roger Sayle <roger@nextmovesoftware.com> Uros Bizjak <ubizjak@gmail.com> gcc/ChangeLog * config/i386/i386-features.cc (compute_convert_gain) <case ASHIFTRT>: Update to match ix86_expand_v1ti_ashiftrt.
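To make the motivating example concrete: shifting left by 8 and then arithmetically right by 8 discards the top byte of the 128-bit value and sign-extends from bit 119. The sketch below performs the left shift on the unsigned type to avoid signed-overflow concerns, which is a deliberate difference from the snippet in the commit message; it relies on the GCC `__int128` extension and on GCC's documented behaviour for converting to signed types and for right-shifting negative values. The self-check helper is hypothetical.

```c
#include <assert.h>

__int128 m1;

/* Same operation as the motivating example, with the left shift
   done on unsigned __int128.  */
void
foo (void)
{
  m1 = (__int128) ((unsigned __int128) m1 << 8) >> 8;
}

/* Hypothetical self-check of the sign-extension behaviour.  */
void
check_foo (void)
{
  /* Small values are unchanged.  */
  m1 = 5;
  foo ();
  assert (m1 == 5);

  /* The top byte is discarded.  */
  m1 = ((__int128) 0x7F << 120) | 3;
  foo ();
  assert (m1 == 3);

  /* Bit 119 becomes the sign bit.  */
  m1 = (__int128) 1 << 119;
  foo ();
  assert (m1 == -((__int128) 1 << 119));
}
```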
2024-08-25Disable late-combine in another RISC-V testJeff Law1-1/+1
Another test whose output was slightly twiddled by late-combine; simply disabling late-combine seems to be the best option. > Running /home/jlaw/test/gcc/gcc/testsuite/gcc.target/riscv/riscv.exp ... > FAIL: gcc.target/riscv/cm_mv_rv32.c -Os check-function-bodies sum Pushing to the trunk. gcc/testsuite * gcc.target/riscv/cm_mv_rv32.c: Disable late-combine.
2024-08-25[committed] Fix assembly scan for RISC-V VLS testsJeff Law7-7/+7
Surya's IRA patch from June slightly improves the code we generate for the vls/calling-conventions tests on RISC-V. Specifically it removes an unnecessary move from the instruction stream. This (of course) broke those tests: > Running /home/jlaw/test/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp ... > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 This patch does the natural adjustment of those tests by dropping the moves from the scan. gcc/testsuite * gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: Update expected output. * gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c: Likewise. 
* gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c: Likewise.
2024-08-25Turn off late-combine for a few risc-v specific testsJeff Law4-4/+4
Just minor testsuite adjustments -- several of the shorten-memref tests are slightly twiddled by the late-combine pass: > Running /home/jlaw/test/gcc/gcc/testsuite/gcc.target/riscv/riscv.exp ... > FAIL: gcc.target/riscv/shorten-memrefs-2.c -Os scan-assembler store1a:\n(\t?\\.[^\n]*\n)*\taddi > XPASS: gcc.target/riscv/shorten-memrefs-3.c -Os scan-assembler-not load2a:\n.*addi[ \t]*[at][0-9],[at][0-9],[0-9]* > FAIL: gcc.target/riscv/shorten-memrefs-5.c -Os scan-assembler store1a:\n(\t?\\.[^\n]*\n)*\taddi > FAIL: gcc.target/riscv/shorten-memrefs-8.c -Os scan-assembler store:\n(\t?\\.[^\n]*\n)*\taddi\ta[0-7],a[0-7],1 This patch just turns off the late-combine pass for those tests. Locally I'd adjusted all the shorten-memref tests, but a quick re-test shows that only 4 tests seem affected right now. Anyway, pushing to the trunk to slightly clean up our test results. gcc/testsuite * gcc.target/riscv/shorten-memrefs-2.c: Turn off late-combine. * gcc.target/riscv/shorten-memrefs-3.c: Likewise. * gcc.target/riscv/shorten-memrefs-5.c: Likewise. * gcc.target/riscv/shorten-memrefs-8.c: Likewise.
2024-08-25modula2 testsuite: new libc unit testGaius Mulley2-0/+75
This patch provides a simple unit test for snprintf and atof against the libc definition module. gcc/testsuite/ChangeLog: * gm2/calling-c/libc/run/pass/calling-c-libc-run-pass.exp: New test. * gm2/calling-c/libc/run/pass/testlibcstr.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-08-25Daily bump.GCC Administrator4-1/+178
2024-08-24modula2: Export all string to integral and fp number conversion functionsGaius Mulley1-0/+84
Export all string to integral and floating point number conversion functions (atof, atoi, atol, atoll, strtod, strtof, strtold, strtol, strtoll, strtoul and strtoull). gcc/m2/ChangeLog: * gm2-libs/libc.def (atof): Export unqualified. (atoi): Ditto. (atol): Ditto. (atoll): Ditto. (strtod): Ditto. (strtof): Ditto. (strtold): Ditto. (strtol): Ditto. (strtoll): Ditto. (strtoul): Ditto. (strtoull): Ditto. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>
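For readers less familiar with the libc side, here is a short C sketch of how two of the listed functions (strtol and strtod) are typically used with full-string validation. The wrapper names are hypothetical; the Modula-2 declarations in libc.def are not shown in the commit message and are not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse TEXT as a base-10 long.  Returns 1 and stores the result in
   *VALUE on success; returns 0 on empty input, trailing characters,
   or an out-of-range value.  */
int
parse_long (const char *text, long *value)
{
  char *end;
  errno = 0;
  long v = strtol (text, &end, 10);
  if (end == text || *end != '\0' || errno == ERANGE)
    return 0;
  *value = v;
  return 1;
}

/* Same idea for floating point, using strtod.  */
int
parse_double (const char *text, double *value)
{
  char *end;
  errno = 0;
  double v = strtod (text, &end);
  if (end == text || *end != '\0' || errno == ERANGE)
    return 0;
  *value = v;
  return 1;
}
```

The strto* family reports where parsing stopped through the end pointer, which the simpler ato* functions cannot do; that is why the sketch uses strtol and strtod for validation.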