riscv-gnu-toolchain/gcc.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2024-08-27	MIPS: Include missing mips16.S in libgcc/lib1funcs.S	YunQiang Su	1	-1/+1
	mips16.S was missing since commit 29b74545531f6afbee9fc38c267524326dbfbedf Date: Thu Jun 1 10:14:24 2023 +0800 MIPS: Add speculation_barrier support Without mips16.S included, some symbols will miss for mips16, and so some software will fail to build. libgcc/ChangeLog: * config/mips/lib1funcs.S: Includes mips16.S.
2024-08-27	combine.cc (make_more_copies): Copy attributes from the original pseudo, ↵	Hans-Peter Nilsson	1	-0/+6
	PR115883 The first of the late-combine passes, propagates some of the copies made during the (in-time-)combine pass in make_more_copies into the users of the "original" pseudo registers and removes the "old" pseudos. That effectively removes attributes such as REG_POINTER, which matter to LRA. The quoted PR is for an ICE-manifesting bug that was exposed by the late-combine pass and went back to hiding with this patch until commit r15-2937-g3673b7054ec2, the fix for PR116236, when it was actually fixed. To wit, this patch is only incidentally related to that bug. In other words, the REG_POINTER attribute should not be required for LRA to work correctly. This patch merely corrects state for those propagated register-uses to ante late-combine. For reasons not investigated, this fixes a failing test "FAIL: gcc.dg/guality/pr54200.c -Og -DPREVENT_OPTIMIZATION line 20 z == 3" for x86_64-linux-gnu. PR middle-end/115883 * combine.cc (make_more_copies): Copy attributes from the original pseudo to the new copy.
2024-08-27	c++/coros: do not assume coros don't nest [PR113457]	Arsen Arsenović	3	-6/+216
	In the testcase presented in the PR, during template expansion, an tsubst of an operand causes a lambda coroutine to be processed, causing it to get an initial suspend and final suspend. The code for assigning awaitable var names (get_awaitable_var) assumed that the sequence Is -> Is -> Fs -> Fs is impossible (i.e. that one could only 'open' one coroutine before closing it at a time), and reset the counter used for unique numbering each time a final suspend occured. This assumption is false in a few cases, usually when lambdas are involved. Instead of storing this counter in a static-storage variable, we can store it in coroutine_info. This struct is local to each function, so we don't need to worry about "cross-contamination" nor resetting. PR c++/113457 gcc/cp/ChangeLog: * coroutines.cc (struct coroutine_info): Add integer field awaitable_number. This is a counter used for assigning unique names to awaitable temporaries. (get_awaitable_var): Use awaitable_number from coroutine_info instead of the static int awn. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr113457-1.C: New test. * g++.dg/coroutines/pr113457.C: New test.
2024-08-26	coroutines: diagnose usage of alloca in coroutines	Arsen Arsenović	2	-0/+33
	We do not support it currently, and the resulting memory can only be used inside a single resumption, so best not confuse the user with it. PR c++/115858 - Incompatibility of coroutines and alloca() gcc/ChangeLog: * coroutine-passes.cc (execute_early_expand_coro_ifns): Emit a sorry if a statement is an alloca call. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr115858.C: New test.
2024-08-26	diagnostics: move output formats from diagnostic.{c,h} to their own files	David Malcolm	12	-257/+359
	In particular, move the classic text output code to a diagnostic-text.cc (analogous to -json.cc and -sarif.cc). No functional change intended. gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add diagnostic-format-text.o. * diagnostic-format-json.cc: Include "diagnostic-format.h". * diagnostic-format-sarif.cc: Likewise. * diagnostic-format-text.cc: New file, using material from diagnostics.cc. * diagnostic-global-context.cc: Include "diagnostic-format.h". * diagnostic-format-text.h: New file, using material from diagnostics.h. * diagnostic-format.h: New file, using material from diagnostics.h. * diagnostic.cc: Include "diagnostic-format.h" and "diagnostic-format-text.h". (diagnostic_text_output_format::~diagnostic_text_output_format): Move to diagnostic-format-text.cc. (diagnostic_text_output_format::on_report_diagnostic): Likewise. (diagnostic_text_output_format::on_diagram): Likewise. (diagnostic_text_output_format::print_any_cwe): Likewise. (diagnostic_text_output_format::print_any_rules): Likewise. (diagnostic_text_output_format::print_option_information): Likewise. * diagnostic.h (class diagnostic_output_format): Move to diagnostic-format.h. (class diagnostic_text_output_format): Move to diagnostic-format-text.h. (diagnostic_output_format_init): Move to diagnostic-format.h. (diagnostic_output_format_init_json_stderr): Likewise. (diagnostic_output_format_init_json_file): Likewise. (diagnostic_output_format_init_sarif_stderr): Likewise. (diagnostic_output_format_init_sarif_file): Likewise. (diagnostic_output_format_init_sarif_stream): Likewise. * gcc.cc: Include "diagnostic-format.h". * opts.cc: Include "diagnostic-format.h". gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic_group_plugin.c: Include "diagnostic-format-text.h". Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26	diagnostics: consolidate on_{begin,end}_diagnostic into on_report_diagnostic	David Malcolm	4	-176/+171
	Previously diagnostic_context::report_diagnostic had, after the call to pp_format (phases 1 and 2 of formatting the message): m_output_format->on_begin_diagnostic (diagnostic); pp_output_formatted_text (this->printer, m_urlifier); if (m_show_cwe) print_any_cwe (diagnostic); if (m_show_rules) print_any_rules (diagnostic); if (m_show_option_requested) print_option_information (diagnostic, orig_diag_kind); m_output_format->on_end_diagnostic (diagnostic, orig_diag_kind); This patch replaces all of the above with a single call to m_output_format->on_report_diagnostic (diagnostic, orig_diag_kind); moving responsibility for phase 3 of formatting and printing the result from diagnostic_context to the output format. This simplifies diagnostic_context::report_diagnostic and allows us to move the code that prints CWEs, rules, and option information in textual form from diagnostic_context to diagnostic_text_output_format, where it belongs. No functional change intended. gcc/ChangeLog: * diagnostic-format-json.cc (json_output_format::on_begin_diagnostic): Delete. (json_output_format::on_end_diagnostic): Rename to... (json_output_format::on_report_diagnostic): ...this and add call to pp_output_formatted_text. (diagnostic_output_format_init_json): Drop unnecessary calls to disable textual printing of CWEs, rules, and options. * diagnostic-format-sarif.cc (sarif_builder::end_diagnostic): Rename to... (sarif_builder::on_report_diagnostic): ...this and add call to pp_output_formatted_text. (sarif_output_format::on_begin_diagnostic): Delete. (sarif_output_format::on_end_diagnostic): Rename to... (sarif_output_format::on_report_diagnostic): ...this and update call to m_builder accordingly. (diagnostic_output_format_init_sarif): Drop unnecessary calls to disable textual printing of CWEs, rules, and options. * diagnostic.cc (diagnostic_context::print_any_cwe): Convert to... (diagnostic_text_output_format::print_any_cwe): ...this. (diagnostic_context::print_any_rules): Convert to... (diagnostic_text_output_format::print_any_rules): ...this. (diagnostic_context::print_option_information): Convert to... (diagnostic_text_output_format::print_option_information): ...this. (diagnostic_context::report_diagnostic): Replace calls to the output format's on_begin_diagnostic, to pp_output_formatted_text, printing CWE, rules, option info, and the call to the format's on_end_diagnostic with a call to the format's on_report_diagnostic. (diagnostic_text_output_format::on_begin_diagnostic): Delete. (diagnostic_text_output_format::on_end_diagnostic): Delete. (diagnostic_text_output_format::on_report_diagnostic): New vfunc, which effectively does the on_begin_diagnostic, the call to pp_output_formatted_text, the calls for printing CWE, rules, option info, and the call to the diagnostic_finalizer. * diagnostic.h (diagnostic_output_format::on_begin_diagnostic): Delete. (diagnostic_output_format::on_end_diagnostic): Delete. (diagnostic_output_format::on_report_diagnostic): New. (diagnostic_text_output_format::on_begin_diagnostic): Delete. (diagnostic_text_output_format::on_end_diagnostic): Delete. (diagnostic_text_output_format::on_report_diagnostic): New. (class diagnostic_context): Add friend class diagnostic_text_output_format. (diagnostic_context::get_urlifier): New accessor. (diagnostic_context::print_any_cwe): Move decl... (diagnostic_text_output_format::print_any_cwe): ...to here. (diagnostic_context::print_any_rules): Move decl... (diagnostic_text_output_format::print_any_rules): ...to here. (diagnostic_context::print_option_information): Move decl... (diagnostic_text_output_format::print_option_information): ...to here. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26	testsuite: add event IDs to multithreaded event plugin test	David Malcolm	4	-20/+38
	Add test coverage of "%@" in event messages in a multithreaded execution path. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-paths-multithreaded-inline-events.c: Update expected output. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py: Likewise. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-separate-events.c: Likewise. * gcc.dg/plugin/diagnostic_plugin_test_paths.c (test_diagnostic_path::add_event_2): Return the id of the added event. (test_diagnostic_path::add_event_2_with_event_id): New. (example_4): Add event IDs to the deadlock messages indicating where the locks where acquired. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26	testsuite: generalize support for Python tests for SARIF output	David Malcolm	8	-16/+208
	In r15-2354-g4d1f71d49e396c I added the ability to use Python to write tests of SARIF output via a new "run-sarif-pytest" based on "run-gcov-pytest", with a sarif.py support script in testsuite/gcc.dg/sarif-output. This followup patch: (a) removes the limitation of such tests needing to be in testsuite/gcc.dg/sarif-output by moving sarif.py to testsuite/lib and adding logic to add that directory to PYTHONPATH when invoking pytest. (b) uses this to replace fragile regexp-based tests in gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c with Python logic that verifies the structure within the generated JSON, and to add test coverage for SARIF output relating to GCC plugins. gcc/ChangeLog: * diagnostic-format-sarif.cc: Add comments noting that we don't yet capture any diagnostic_metadata::rules associated with a diagnostic. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-metadata-sarif.c: New test, based on diagnostic-test-metadata.c. * gcc.dg/plugin/diagnostic-test-metadata-sarif.py: New script. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c: Replace scan-sarif-file directives with run-sarif-pytest, to run... * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py: ...this new test. * gcc.dg/plugin/plugin.exp (plugin_test_list): Add diagnostic-test-metadata-sarif.c. * gcc.dg/sarif-output/sarif.py: Move to... * lib/sarif.py: ...here. * lib/scansarif.exp (run-sarif-pytest): Prepend "lib" to PYTHONPATH before running python scripts. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26	pretty-print: fixes to selftests	David Malcolm	1	-4/+35
	Add selftest coverage for %{ and %} in pretty-print.cc No functional change intended. gcc/ChangeLog: * pretty-print.cc (selftest::test_urls): Make static. (selftest::test_urls_from_braces): New. (selftest::test_null_urls): Make static. (selftest::test_urlification): Likewise. (selftest::pretty_print_cc_tests): Call test_urls_from_braces. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26	json.h: fix typo in comment	David Malcolm	1	-1/+1
	gcc/ChangeLog: * json.h: Fix typo in comment about missing INCLUDE_MEMORY. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26	c++: Check template parameters in member class template specialization ↵	Simon Martin	3	-0/+40
	[PR115716] We currently ICE upon the following invalid code, because we don't check that the template parameters in a member class template specialization are correct. === cut here === template <typename T> struct x { template <typename U> struct y { typedef T result2; }; }; template<> template<typename U, typename> struct x<int>::y { typedef double result2; }; int main() { x<int>::y<int>::result2 xxx2; } === cut here === This patch fixes the PR by calling redeclare_class_template. PR c++/115716 gcc/cp/ChangeLog: * pt.cc (maybe_process_partial_specialization): Call redeclare_class_template. gcc/testsuite/ChangeLog: * g++.dg/template/spec42.C: New test. * g++.dg/template/spec43.C: New test.
2024-08-26	Remove an unneeded include that was added by mistake.	Andi Kleen	1	-1/+0
	gcc/ChangeLog: * tree-if-conv.cc: Remove unneeded include from last change.
2024-08-26	Fix bootstap-errors due to enabling -gvariable-location-views	Bernd Edlinger	3	-3/+3
	This recent change triggered various bootstap-errors, mostly on x86 targets because line info advance address entries were output in the wrong section table. The switch to the wrong line table happened in dwarfout_set_ignored_loc. It must use the same section as the earlier called dwarf2out_switch_text_section. But also ft32-elf was affected, because the assembler choked on something simple as ".2byte .LM2-.LM1", but fortunately it is able to use native location views, the configure test was just not executed because the ft32 "nop" instruction was missing. gcc/ChangeLog: PR debug/116470 * configure.ac: Add the "nop" instruction for cpu type ft32. * configure: Regenerate. * dwarf2out.cc (dwarf2out_set_ignored_loc): Use the correct line info section.
2024-08-26	libcpp: deduplicate definition of padding size	Alexander Monakov	4	-13/+11
	Tie together the two functions that ensure tail padding with search_line_ssse3 via CPP_BUFFER_PADDING macro. libcpp/ChangeLog: * internal.h (CPP_BUFFER_PADDING): New macro; use it ... * charset.cc (_cpp_convert_input): ...here, and ... * files.cc (read_file_guts): ...here, and ... * lex.cc (search_line_ssse3): here.
2024-08-26	tree-optimization/116460 - improve forwprop compile-time	Richard Biener	1	-6/+7
	The following improves forwprop block reachability which I noticed when debugging PR116460 and what is also noted in the comment. It avoids processing blocks in natural loops determined unreachable, thereby making the issue in PR116460 latent. PR tree-optimization/116460 * tree-ssa-forwprop.cc (pass_forwprop::execute): Do not process blocks in unreachable natural loops.
2024-08-26	Delay edge removal in forwprop	Richard Biener	1	-9/+25
	SSA forwprop has switch simplification code that calls remove edge and as side-effect releases dominator info. For a followup we want to retain that so the following delays removing edges until the end of the pass. As usual we have to deal with parts of the edge vanishing due to EH/abnormal pruning so record edges as basic-block index pairs and remove them only when they are still there. * tree-ssa-forwprop.cc (simplify_gimple_switch_label_vec): Delay removing edges and releasing dominator info, instead record into edges_to_remove vector. (simplify_gimple_switch): Pass through vector of to remove edges. (pass_forwprop::execute): Likewise. Remove queued edges.
2024-08-26	vect: Fix STMT_VINFO_DEF_TYPE check for odd/even widen mult [PR116348]	Xi Ruoyao	2	-2/+15
	After fixing PR116142 some code started to trigger an ICE with -O3 -march=znver4. Per Richard Biener who actually made this fix: "supportable_widening_operation fails at transform time - that's likely because vectorizable_reduction "puns" defs to internal_def" so the check should use STMT_VINFO_REDUC_DEF instead of checking if STMT_VINFO_DEF_TYPE is vect_reduction_def. gcc/ChangeLog: PR tree-optimization/116348 * tree-vect-stmts.cc (supportable_widening_operation): Use STMT_VINFO_REDUC_DEF (x) instead of STMT_VINFO_DEF_TYPE (x) == vect_reduction_def. gcc/testsuite/ChangeLog: PR tree-optimization/116348 * gcc.c-torture/compile/pr116438.c: New test. Co-authored-by: Richard Biener <rguenther@suse.de>
2024-08-26	Match: Add int type fits check for .SAT_ADD imm operand	Pan Li	58	-9/+443
	This patch would like to add strict check for imm operand of .SAT_ADD matching. We have no type checking for imm operand in previous, which may result in unexpected IL to be catched by .SAT_ADD pattern. We leverage the int_fits_type_p here to make sure the imm operand is a int type fits the result type of the .SAT_ADD. For example: Fits uint8_t: uint8_t a; uint8_t sum = .SAT_ADD (a, 12); uint8_t sum = .SAT_ADD (a, 12u); uint8_t sum = .SAT_ADD (a, 126u); uint8_t sum = .SAT_ADD (a, 128u); uint8_t sum = .SAT_ADD (a, 228); uint8_t sum = .SAT_ADD (a, 223u); Not fits uint8_t: uint8_t a; uint8_t sum = .SAT_ADD (a, -1); uint8_t sum = .SAT_ADD (a, 256u); uint8_t sum = .SAT_ADD (a, 257); The below test suite are passed for this patch: * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog: * match.pd: Add int_fits_type_p check for .SAT_ADD imm operand. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_u_add_imm-11.c: Adjust test case for imm. * gcc.target/riscv/sat_u_add_imm-12.c: Ditto. * gcc.target/riscv/sat_u_add_imm-15.c: Ditto. * gcc.target/riscv/sat_u_add_imm-16.c: Ditto. * gcc.target/riscv/sat_u_add_imm_type_check-1.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-10.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-11.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-12.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-13.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-14.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-15.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-16.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-17.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-18.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-19.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-2.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-20.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-21.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-22.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-23.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-24.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-25.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-26.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-27.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-28.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-29.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-3.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-30.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-31.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-32.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-33.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-34.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-35.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-36.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-37.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-38.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-39.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-4.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-40.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-41.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-42.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-43.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-44.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-45.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-46.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-47.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-48.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-49.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-5.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-50.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-51.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-52.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-6.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-7.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-8.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-9.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-08-26	expand: Use the correct mode for store flags for popcount [PR116480]	Andrew Pinski	3	-1/+18
	When expanding popcount used for equal to 1 (or rather __builtin_stdc_has_single_bit), the wrong mode was bsing used for the mode of the store flags. We were using the mode of the argument to popcount but since popcount's return value is always int, the mode of the expansion here should have been the mode of the return type rater than the argument. Built and tested on aarch64-linux-gnu with no regressions. Also bootstrapped and tested on x86_64-linux-gnu. PR middle-end/116480 gcc/ChangeLog: * internal-fn.cc (expand_POPCOUNT): Use the correct mode for store flags. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr116480-1.c: New test. * gcc.dg/torture/pr116480-2.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-08-26	i386: Add bf8 -> fp16 intrin	Haochen Jiang	4	-5/+109
	Since BF8 and FP16 have same bits for exponent, the type conversion between them is just a cast for fraction part. We will use a sequence of instrctions instead of new instructions to do that. For convenience, intrins are also provided. gcc/ChangeLog: * config/i386/avx10_2-512convertintrin.h (_mm512_cvtpbf8_ph): New. (_mm512_mask_cvtpbf8_ph): Ditto. (_mm512_maskz_cvtpbf8_ph): Ditto. * config/i386/avx10_2convertintrin.h (_mm_cvtpbf8_ph): Ditto. (_mm_mask_cvtpbf8_ph): Ditto. (_mm_maskz_cvtpbf8_ph): Ditto. (_mm256_cvtpbf8_ph): Ditto. (_mm256_mask_cvtpbf8_ph): Ditto. (_mm256_maskz_cvtpbf8_ph): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-convert-1.c: Add tests for new intrin. * gcc.target/i386/avx10_2-convert-1.c: Ditto.
2024-08-26	AVX10.2: Support compare instructions	Zhang, Jun	4	-27/+183
	gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_ssecom_setcc): Mention behavior change on flags. (ix86_expand_sse_comi): Handle AVX10.2 behavior. (ix86_expand_sse_comi_round): Ditto. (ix86_expand_round_builtin): Ditto. (ix86_expand_builtin): Change function call. * config/i386/i386.md (UNSPEC_COMX): New unspec. * config/i386/sse.md (avx10_2_v<unord>comx<ssemodesuffix><round_saeonly_name>): New. (<sse>_<unord>comi<round_saeonly_name>): Add HFmode. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-compare-1.c: New test. Co-authored-by: Haochen Jiang <haochen.jiang@intel.com> Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
2024-08-26	AVX10.2: Support vector copy instructions	Zhang, Jun	9	-53/+356
	gcc/ChangeLog: * config.gcc: Add avx10_2copyintrin.h. * config/i386/i386.md (avx10_2): New isa attribute. * config/i386/immintrin.h: Include avx10_2copyintrin.h. * config/i386/sse.md (sse_movss_<mode>): Add new constraints to handle AVX10.2. (vec_set<mode>_0): Ditto. (@vec_set<mode>_0): Ditto. (vec_set<mode>_0): Ditto. (avx512fp16_mov<mode>): Ditto. (vec_set<mode>_0_1): New split. config/i386/avx10_2copyintrin.h: New file. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-vmovd-1.c: New test. * gcc.target/i386/avx10_2-vmovd-2.c: Ditto. * gcc.target/i386/avx10_2-vmovw-1.c: Ditto. * gcc.target/i386/avx10_2-vmovw-2.c: Ditto.
2024-08-26	AVX10.2: Support minmax instructions	Mo, Zewei	28	-1/+2555
	gcc/ChangeLog: * config.gcc: Add avx10_2-512minmaxintrin.h and avx10_2minmaxintrin.h. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (V8BF, V8BF, V8BF, INT, V8BF, UQI), (V16BF, V16BF, V16BF, INT, V16BF, UHI), (V32BF, V32BF, V32BF, INT, V32BF, USI), (V8HF, V8HF, V8HF, INT, V8HF, UQI), (V8DF, V8DF, V8DF, INT, V8DF, UQI, INT), (V32HF, V32HF, V32HF, INT, V32HF, USI, INT), (V16HF, V16HF, V16HF, INT, V16HF, UHI, INT), (V16SF, V16SF, V16SF, INT, V16SF, UHI, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle V8BF_FTYPE_V8BF_V8BF_INT_V8BF_UQI, V16BF_FTYPE_V16BF_V16BF_INT_V16BF_UHI, V32BF_FTYPE_V32BF_V32BF_INT_V32BF_USI, V8HF_FTYPE_V8HF_V8HF_INT_V8HF_UQI, (ix86_expand_round_builtin): Handle V8DF_FTYPE_V8DF_V8DF_INT_V8DF_UQI_INT, V32HF_FTYPE_V32HF_V32HF_INT_V32HF_USI_INT, V16HF_FTYPE_V16HF_V16HF_INT_V16HF_UHI_INT. V16SF_FTYPE_V16SF_V16SF_INT_V16SF_UHI_INT. * config/i386/immintrin.h: Include avx10_2-512mixmaxintrin.h and avx10_2minmaxintrin.h. * config/i386/sse.md (VFH_AVX10_2): New. (avx10_2_vminmaxnepbf16_<mode><mask_name>): New define_insn. (avx10_2_minmaxp<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_minmaxs<mode><mask_scalar_name><round_saeonly_scalar_name>): Ditto. * config/i386/avx10_2-512minmaxintrin.h: New file. * config/i386/avx10_2minmaxintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx512f-helper.h: Add helper function. * gcc.target/i386/avx10-minmax-helper.h: New helper file. * gcc.target/i386/avx10_2-512-minmax-1.c: New test. * gcc.target/i386/avx10_2-512-vminmaxnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxps-2.c: Ditto. * gcc.target/i386/avx10_2-minmax-1.c: Ditto. * gcc.target/i386/avx10_2-vminmaxnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsh-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxss-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto. Co-authored-by: Hu, Lin1 <lin1.hu@intel.com> Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
2024-08-26	[PATCH 2/2] AVX10.2: Support saturating convert instructions	Hu, Lin1	31	-1/+2830
	gcc/ChangeLog: * config/i386/avx10_2-512satcvtintrin.h: Add new intrin. * config/i386/avx10_2satcvtintrin.h: Ditto. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md (VF1_VF2_AVX10_2): New iterator. (VF2_AVX10_2): Ditto. (VI8_AVX10_2): Ditto. (sat_cvt_sign_prefix): Add new UNSPEC. (UNSPEC_SAT_CVT_DS_SIGN_ITER): New iterator. (pd2dqssuff): Ditto. (avx10_2_vcvtt<castmode>2<sat_cvt_sign_prefix>dqs<mode><mask_name><round_saeonly_name>): New. (avx10_2_vcvttpd2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_vcvttps2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_vcvttsd2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>): Ditto. (avx10_2_vcvttss2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-satcvt-1.c: Add test. * gcc.target/i386/avx10_2-512-satcvt-1.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2dqs-2.c: New test. * gcc.target/i386/avx10_2-512-vcvttpd2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttsd2sis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttsd2usis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2sis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2usis-2.c: Ditto.
2024-08-26	[PATCH 1/2] AVX10.2: Support saturating convert instructions	Hu, Lin1	40	-1/+3327
	gcc/ChangeLog: * config.gcc: Add avx10_2satcvtintrin.h and avx10_2-512satcvtintrin.h. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (V8HI, V8BF, V8HI, UQI), (V16HI, V16BF, V16HI, UHI), (V32HI, V32BF, V32HI, USI), (V16SI, V16SF, V16SI, UHI, INT), (V16HI, V16BF, V16HI, UHI, INT), (V32HI, V32BF, V32HI, USI, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle V32HI_FTYPE_V32BF_V32HI_USI, V16HI_FTYPE_V16BF_V16HI_UHI, V8HI_FTYPE_V8BF_V8HI_UQI. (ix86_expand_round_builtin): Handle V32HI_FTYPE_V32BF_V32HI_USI_INT, V16SI_FTYPE_V16SF_V16SI_UHI_INT, V16HI_FTYPE_V16BF_V16HI_UHI_INT. * config/i386/immintrin.h: Include avx10_2satcvtintrin.h and avx10_2-512savcvtintrin.h. * config/i386/sse.md: (UNSPEC_CVTNE_BF16_IBS_ITER): New iterator. (sat_cvt_sign_prefix): Ditto. (sat_cvt_trunc_prefix): Ditto. (UNSPEC_CVT_PH_IBS_ITER): Ditto. (UNSPEC_CVTT_PH_IBS_ITER): Ditto. (UNSPEC_CVT_PS_IBS_ITER): Ditto. (UNSPEC_CVTT_PS_IBS_ITER): Ditto. (avx10_2_cvt<sat_cvt_trunc_prefix>nebf162i<sat_cvt_sign_prefix>bs<mode><mask_name>): New define_insn. (avx10_2_cvtph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>): Ditto. (avx10_2_cvttph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_cvtps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>): Ditto. (avx10_2_cvttps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>): Ditto. * config/i386/avx10_2-512satcvtintrin.h: New file. * config/i386/avx10_2satcvtintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx512f-helper.h: Add new test macro. * gcc.target/i386/m512-check.h: Add new type. * gcc.target/i386/avx10_2-512-satcvt-1.c: New test. * gcc.target/i386/avx10_2-512-vcvtnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtps2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-satcvt-1.c: Ditto. * gcc.target/i386/avx10_2-vcvtnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttnebf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttnebf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2iubs-2.c: Ditto.
2024-08-26	[PATCH 2/2] AVX10.2: Support BF16 instructions	konglin1	35	-2/+2096
	gcc/ChangeLog: * config/i386/avx10_2-512bf16intrin.h: Add new intrinsics. * config/i386/avx10_2bf16intrin.h: Diito. * config/i386/i386-builtin-types.def : Add new DEF_FUNCTION_TYPE for new type. * config/i386/i386-builtin.def (BDESC): Add new buildin. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle new type. * config/i386/sse.md (vecmemsuffix): Add vector BF mode. (avx10_2_rsqrtpbf16_<mode><mask_name>): New define_insn. (avx10_2_sqrtnepbf16_<mode><mask_name>): Ditto. (avx10_2_rcppbf16_<mode><mask_name>): Ditto. (avx10_2_getexppbf16_<mode><mask_name>): Ditto. (BF16IMMOP): New iterator. (bf16immop): Ditto. (avx10_2_<bf16immop>pbf16_<mode><mask_name>): New define_insn. (avx10_2_fpclasspbf16_<mode><mask_scalar_merge_name>): Ditto. (avx10_2_cmppbf16_<mode><mask_scalar_merge_name>): Ditto. (avx10_2_comsbf16_v8bf): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10-check.h: Add AVX10_SCALAR. * gcc.target/i386/avx10-helper.h: Add helper functions. * gcc.target/i386/avx10_2-512-bf16-1.c: Add new tests. * gcc.target/i386/avx10_2-bf16-1.c: Ditto. * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-vcmppbf16-2.c: New test. * gcc.target/i386/avx10_2-512-vfpclasspbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vgetexppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vgetmantpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vrcppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vreducenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vrndscalenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vrsqrtpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vsqrtnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vcmppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vcomsbf16-1.c: Ditto. * gcc.target/i386/avx10_2-vcomsbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfpclasspbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vgetexppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vgetmantpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrcppbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vreducenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrndscalenepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrsqrtpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vsqrtnepbf16-2.c: Ditto. Co-authored-by: Levy Hsu <admin@levyhsu.com>
2024-08-26	[PATCH 1/2] AVX10.2: Support BF16 instructions	konglin1	35	-2/+2514
	gcc/ChangeLog: * config.gcc: Add avx10_2-512bf16intrin.h and avx10_2bf16intrin.h. * config/i386/i386-builtin-types.def : Add new DEF_FUNCTION_TYPE for V32BF_FTYPE_V32BF_V32BF, V16BF_FTYPE_V16BF_V16BF, V8BF_FTYPE_V8BF_V8BF, V8BF_FTYPE_V8BF_V8BF_UQI, V16BF_FTYPE_V16BF_V16BF_UHI, V32BF_FTYPE_V32BF_V32BF_USI, V32BF_FTYPE_V32BF_V32BF_V32BF_USI, V8BF_FTYPE_V8BF_V8BF_V8BF_UQI and V16BF_FTYPE_V16BF_V16BF_V16BF_UHI. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle new DEF_FUNCTION_TYPE. * config/i386/immintrin.h: Include avx10_2-512bf16intrin.h and avx10_2bf16intrin.h. * config/i386/sse.md (VBF_AVX10_2): New iterator. (avx10_2_scalefpbf16_<mode><mask_name>): New define_insn. (avx10_2_<code>nepbf16_<mode><mask_name>): Ditto. (avx10_2_<insn>nepbf16_<mode><mask_name>): Ditto. (avx10_2_fmaddnepbf16_<mode>_maskz): New expander. (avx10_2_fnmaddnepbf16_<mode>_maskz): Ditto. (avx10_2_fmsubnepbf16_<mode>_maskz): Ditto. (avx10_2_fnmsubnepbf16_<mode>_maskz): Ditto. (avx10_2_fmaddnepbf16_<mode><sd_maskz_name>): New define_insn. (avx10_2_fmaddnepbf16_<mode>_mask): Ditto. (avx10_2_fmaddnepbf16_<mode>_mask3): Ditto. (avx10_2_fnmaddnepbf16_<mode><sd_maskz_name>): Ditto. (avx10_2_fnmaddnepbf16_<mode>_mask): Ditto. (avx10_2_fnmaddnepbf16_<mode>_mask3): Ditto. (avx10_2_fmsubnepbf16_<mode><sd_maskz_name>): Ditto. (avx10_2_fmsubnepbf16_<mode>_mask): Ditto. (avx10_2_fmsubnepbf16_<mode>_mask3): Ditto. (avx10_2_fnmsubnepbf16_<mode><sd_maskz_name>): Ditto. (avx10_2_fnmsubnepbf16_<mode>_mask): Ditto. (avx10_2_fnmsubnepbf16_<mode>_mask3): Ditto. * config/i386/avx10_2-512bf16intrin.h: New file. * config/i386/avx10_2bf16intrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512f-helper.h: Add MAKE_MASK_MERGE and MAKE_MASK_ZERO for bf16_uw. * gcc.target/i386/m512-check.h: Add union512bf16_uw, union256bf16_uw, union128bf16_uw and CHECK_EXP for them. * gcc.target/i386/avx10-helper.h: New file. * gcc.target/i386/avx10_2-512-bf16-1.c: New test. * gcc.target/i386/avx10_2-512-vaddnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vdivnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfnmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vfnmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vmaxpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vscalefpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vsubnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-bf16-1.c: Ditto. * gcc.target/i386/avx10_2-vaddnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vdivnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfnmaddXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfnmsubXXXnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vmaxpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vmulnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vscalefpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vsubnepbf16-2.c: Ditto. Co-authored-by: Levy Hsu <admin@levyhsu.com>
2024-08-26	AVX10.2: Support convert instructions	Levy Hsu	45	-5/+3511
	gcc/ChangeLog: * config.gcc: Add avx10_2-512convertintrin.h and avx10_2convertintrin.h. * config/i386/i386-builtin-types.def: Add new DEF_POINTER_TYPE and DEF_FUNCTION_TYPE. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle AVX10.2. (ix86_expand_round_builtin): Ditto. * config/i386/immintrin.h: Include avx10_2-512convertintrin.h, avx10_2convertintrin.h. * config/i386/sse.md (VHF_AVX10_2): New iterator. (bf16_ph): Add 512 bit mode. (avx10_2_cvt2ps2phx_<mode><mask_name<round_name>): New define_insn. (ssebvecmode): New iterator. (UNSPEC_NECONVERTFP8_PACK): Ditto. (neconvertfp8_pack): Ditto. (vcvt<neconvertfp8_pack><mode><mask_name>): New define_insn. (ssebvecmode_2): New iterator. (UNSPEC_VCVTBIASPH2FP8_PACK): Ditto. (biasph2fp8_pack): Ditto. (vcvt<biasph2fp8_pack>v8hf): New expander. (vcvt<biasph2fp8_pack>v8hf_mask): Ditto. (vcvt<biasph2bf8_pack>v8hf): New define_insn. (vcvt<biasph2fp8_pack>v8hf_mask): Ditto. (VHF_AVX10_2_2): New iterator. (vcvt<biasph2fp8_pack><mode><mask_name>): New define_insn. (VHF_256_512): New iterator. (ph2fp8suff): Ditto. (UNSPEC_NECONVERTPH2FP8_PACK): Ditto. (neconvertph2fp8): Ditto. (vcvt<neconvertph2fp8>v8hf_mask): New expander. (vcvt<neconvertph2fp8>v8hf): New define_insn. (vcvt<neconvertph2fp8>v8hf_mask): Ditto. (vcvt<neconvertph2fp8><mode><mask_name>): Ditto. (vcvthf82ph<mode><mask_name>): Ditto. * config/i386/avx10_2-512convertintrin.h: New file. * config/i386/avx10_2convertintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros for const. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-convert-1.c: New test. * gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvthf82ph-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtne2ph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvtneph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-convert-1.c: Ditto. * gcc.target/i386/avx10_2-vcvt2ps2phx-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvthf82ph-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtne2ph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtneph2hf8s-2.c: Ditto. * gcc.target/i386/fp8-helper.h: New helper file. Co-authored-by: Levy Hsu <admin@levyhsu.com> Co-authored-by: Kong Lingling <lingling.kong@intel.com>
2024-08-26	[PATCH 2/2] AVX10.2: Support media instructions	Haochen Jiang	32	-35/+1953
	gcc/ChangeLog: * config/i386/avx10_2-512mediaintrin.h: Add new intrins. * config/i386/avx10_2mediaintrin.h: Ditto. * config/i386/i386-builtin.def: Add new builtins. * config/i386/i386-builtins.cc (def_builtin): Handle shared builtins between AVXVNNIINT16 and AVX10.2. * config/i386/i386-expand.cc (ix86_check_builtin_isa_match): Ditto. * config/i386/sse.md (unspec): Add UNSPEC_VDPPHPS. (avx10_2_mpsadbw<mask_name>): New define_insn. (<mask_codefor><sse4_1_avx2>_mpsadbw<mask_name>): Ditto. (vpdp<vpdpwprodtype>_<mode>): Add AVX10_2_256. (vpdp<vpdpwprodtype>_v16si): New defin_insn. (vpdp<vpdpwprodtype>_<mode>_mask): Ditto. (vpdp<vpdpwprodtype>_<mode>_maskz): Ditto. (vpdp<vpdpwprodtype>_<mode>_maskz): New expander. (vdpphps_<mode>): New define_insn. (vdpphps_<mode>_mask): Ditto. (vdpphps_<mode>_maskz): Ditto. (vdpphps_<mode>_maskz): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/avxvnniint16-1.c: Add new macro test. * gcc.target/i386/avx-1.c: Ditto. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-512-media-1.c: Add test. * gcc.target/i386/avx10_2-media-1.c: Ditto. * gcc.target/i386/avxvnniint16-builtin.c: New test. * gcc.target/i386/avx10_2-512-vdpphps-2.c: Ditto. * gcc.target/i386/avx10_2-512-vmpsadbw-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwsud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwsuds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwusd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwusds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwuud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpwuuds-2.c: Ditto. * gcc.target/i386/avx10_2-builtin-2.c: Ditto. * gcc.target/i386/avx10_2-vdpphps-2.c: Ditto. * gcc.target/i386/avx10_2-vmpsadbw-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwsud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwsuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwusd-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwusds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwuud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwuuds-2.c: Ditto. Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>
2024-08-26	[PATCH 1/2] AVX10.2: Support media instructions	Hongyu Wang	30	-24/+1577
	gcc/ChangeLog * config.gcc: Add avx10_2mediaintrin.h and avx10_2-512mediaintrin.h. * config/i386/i386-builtin.def: Add new builtins. * config/i386/i386-builtins.cc (def_builtin): Handle shared builtins between AVXVNNIINT8 and AVX10.2. * config/i386/i386-expand.cc (ix86_check_builtin_isa_match): Ditto. * config/i386/immintrin.h: Include avx10_2mediaintrin.h and avx10_2-512mediaintrin.h * config/i386/sse.md: (VI4_AVX10_2): New. (vpdp<vpdotprodtype>_<mode>): Add AVX10_2_256. (vpdp<vpdotprodtype>_v16si): New define_insn. (vpdp<vpdotprodtype>_<mode>_mask): Ditto. (vpdp<vpdotprodtype>_<mode>_maskz): Ditto. (vpdp<vpdotprodtype>_<mode>_maskz): New expander. config/i386/avx10_2-512mediaintrin.h: New file. * config/i386/avx10_2mediaintrin.h: Ditto. gcc/testsuite/ChangeLog * gcc.target/i386/avx512f-helper.h: Reuse AVX512F macros for AVX10. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * lib/target-supports.exp (check_effective_target_avx10_2): New. (check_effective_target_avx10_2_512): Ditto. * gcc.target/i386/avx10-check.h: New test file. * gcc.target/i386/avx10-helper.h: Ditto. * gcc.target/i386/avx10_2-builtin-1.c: Ditto. * gcc.target/i386/avx10_2-512-media-1.c: Ditto. * gcc.target/i386/avx10_2-media-1.c: Ditto.. * gcc.target/i386/avxvnniint8-builtin.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbssd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbssds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbsud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbsuds-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbuud-2.c: Ditto. * gcc.target/i386/avx10_2-512-vpdpbuuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbssd-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbssds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbsud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbsuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbuud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbuuds-2.c: Ditto. Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
2024-08-26	i386: Refactor m512-check.h	Haochen Jiang	1	-31/+35
	After AVX10 introduction, we still want to use AVX512 helper functions to avoid duplicate code. In order to reuse them, we need to do some refactor to make sure each function define happen under correct ISA to avoid ABI warnings. gcc/testsuite/ChangeLog: * gcc.target/i386/m512-check.h: Wrap the function define with correct vector size.
2024-08-26	RISC-V: Support IMM for operand 0 of ussub pattern	Pan Li	17	-2/+477
	This patch would like to allow IMM for the operand 0 of ussub pattern. Aka .SAT_SUB(1023, y) as the below example. Form 1: #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return (T)IMM >= y ? (T)IMM - y : 0; \ } DEF_SAT_U_SUB_IMM_FMT_1(uint64_t, 1023) Before this patch: 10 │ sat_u_sub_imm82_uint64_t_fmt_1: 11 │ li a5,82 12 │ bgtu a0,a5,.L3 13 │ sub a0,a5,a0 14 │ ret 15 │ .L3: 16 │ li a0,0 17 │ ret After this patch: 10 │ sat_u_sub_imm82_uint64_t_fmt_1: 11 │ li a5,82 12 │ sltu a4,a5,a0 13 │ addi a4,a4,-1 14 │ sub a0,a5,a0 15 │ and a0,a4,a0 16 │ ret The below test suites are passed for this patch: 1. The rv64gcv fully regression test. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_gen_unsigned_xmode_reg): Add new func impl to gen xmode rtx reg from operand rtx. (riscv_expand_ussub): Gen xmode reg for operand 1. * config/riscv/riscv.md: Allow const_int for operand 1. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macro. * gcc.target/riscv/sat_u_sub_imm-1.c: New test. * gcc.target/riscv/sat_u_sub_imm-1_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-1_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-2.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-2_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-3.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-3_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-4.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-1.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-2.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-3.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-4.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-08-26	RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 4	Pan Li	13	-0/+236
	This patch would like to add test cases for the unsigned vector .SAT_TRUNC form 4. Aka: Form 4: #define DEF_VEC_SAT_U_TRUNC_FMT_4(NT, WT) \ void __attribute__((noinline)) \ vec_sat_u_trunc_##NT##_##WT##_fmt_4 (NT out, WT in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ bool not_overflow = in[i] <= (WT)(NT)(-1); \ out[i] = ((NT)in[i]) \| (NT)((NT)not_overflow - 1); \ } \ } DEF_VEC_SAT_U_TRUNC_FMT_4 (uint32_t, uint64_t) The below test is passed for this patch. * The rv64gcv regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-19.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-20.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-21.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-22.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-23.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-24.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-19.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-20.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-21.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-22.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-23.c: New test. * gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-24.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-08-26	RISC-V: Add testcases for unsigned scalar .SAT_TRUNC form 4	Pan Li	13	-0/+218
	This patch would like to add test cases for the unsigned scalar quad and oct .SAT_TRUNC form 4. Aka: Form 4: #define DEF_SAT_U_TRUNC_FMT_4(NT, WT) \ NT __attribute__((noinline)) \ sat_u_trunc_##WT##_to_##NT##_fmt_4 (WT x) \ { \ bool not_overflow = x <= (WT)(NT)(-1); \ return ((NT)x) \| (NT)((NT)not_overflow - 1); \ } The below test is passed for this patch. * The rv64gcv regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_u_trunc-19.c: New test. * gcc.target/riscv/sat_u_trunc-20.c: New test. * gcc.target/riscv/sat_u_trunc-21.c: New test. * gcc.target/riscv/sat_u_trunc-22.c: New test. * gcc.target/riscv/sat_u_trunc-23.c: New test. * gcc.target/riscv/sat_u_trunc-24.c: New test. * gcc.target/riscv/sat_u_trunc-run-19.c: New test. * gcc.target/riscv/sat_u_trunc-run-20.c: New test. * gcc.target/riscv/sat_u_trunc-run-21.c: New test. * gcc.target/riscv/sat_u_trunc-run-22.c: New test. * gcc.target/riscv/sat_u_trunc-run-23.c: New test. * gcc.target/riscv/sat_u_trunc-run-24.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-08-26	Daily bump.	GCC Administrator	3	-1/+143

2024-08-25	RISC-V: Fix double mode under RV32 not utilize vf	demin.han	33	-68/+69
	Currently, some binops of vector vs double scalar under RV32 can't translated to vf but vfmv+vxx.vv. The cause is that vec_duplicate is also expanded to broadcast for double mode under RV32. last-combine can't process expanded broadcast. gcc/ChangeLog: * config/riscv/vector.md: Add !FLOAT_MODE_P constraint. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Fix test. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto. * gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: Ditto.
2024-08-25	[PATCH] Re-add calling emit_clobber in lower-subreg.cc's resolve_simple_move.	Xianmiao Qu	2	-0/+19
	The previous patch: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d8a6945c6ea22efa4d5e42fe1922d2b27953c8cd aimed to eliminate redundant MOV instructions by removing calling emit_clobber in lower-subreg.cc's resolve_simple_move. First, I found that another patch address this issue: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=bdf2737cda53a83332db1a1a021653447b05a7e7 and even without removing calling emit_clobber, the instruction generation is still as expected. Second, removing the CLOBBER expression will have side effects. When there is no CLOBBER expression and only SUBREG assignments exist, according to the logic of the 'df_lr_bb_local_compute' function, the register will be added to the basic block LR IN set. This will cause the register's lifetime to span the entire function, resulting in increased register pressure. Taking the newly added test case 'gcc/testsuite/gcc.target/riscv/pr43644.c' as an example, removing the CLOBBER expression will lead to spill in some registers. gcc/: * lower-subreg.cc (resolve_simple_move): Re-add calling emit_clobber immediately before moving a multi-word register by parts. gcc/testsuite/: * gcc.target/riscv/pr43644.c: New test case.
2024-08-25	testsuite: Run array54.C only for sync_int_long targets	Dimitar Dimitrov	1	-0/+1
	The test case uses "atomic<int>", which fails to link on pru-unknown-elf target due to missing __atomic_load_4 symbol. Fix by filtering for sync_int_long effective target. Ensured that the test still passes for x86_64-pc-linux-gnu. gcc/testsuite/ChangeLog: * g++.dg/init/array54.C: Require sync_int_long effective target. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2024-08-25	Support if conversion for switches	Andi Kleen	5	-6/+270
	The gimple-if-to-switch pass converts if statements with multiple equal checks on the same value to a switch. This breaks vectorization which cannot handle switches. Teach the tree-if-conv pass used by the vectorizer to handle simple switch statements, like those created by if-to-switch earlier. These are switches that only have a single non default block, They are handled similar to COND in if conversion. This makes the vect-bitfield-read-1-not test fail. The test checks for a bitfield analysis failing, but it actually relied on the ifcvt erroring out early because the test is using a switch. The if conversion still does not work because the switch is not in a form that this patch can handle, but it fails much later and the bitfield analysis succeeds, which makes the test fail. I marked it xfail because it doesn't seem to be testing what it wants to test. PR tree-optimization/115866 gcc/ChangeLog: * tree-if-conv.cc (if_convertible_switch_p): New function. (if_convertible_stmt_p): Check for switch. (get_loop_body_in_if_conv_order): Handle switch. (predicate_bbs): Likewise. (predicate_statements): Likewise. (remove_conditions_and_labels): Likewise. (ifcvt_split_critical_edges): Likewise. (ifcvt_local_dce): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-switch-ifcvt-1.c: New test. * gcc.dg/vect/vect-switch-ifcvt-2.c: New test. * gcc.dg/vect/vect-switch-search-line-fast.c: New test. * gcc.dg/vect/vect-bitfield-read-1-not.c: Change to xfail.
2024-08-25	Write CodeView information about static locals in optimized code	Mark Harmstone	1	-0/+57
	Write CodeView S_LDATA32 symbols for static locals in optimized code. We have to handle these separately, as they come after the S_FRAMEPROC, plus you can't have S_BLOCK32 symbols like you can in unoptimized code. gcc/ * dwarf2codeview.cc (write_optimized_static_local_vars): New function. (write_function): Call write_optimized_static_local_vars.
2024-08-25	Write CodeView S_FRAMEPROC symbols	Mark Harmstone	1	-2/+78
	Write S_FRAMEPROC symbols, which aren't very useful but seem to be necessary for Microsoft debuggers to function properly. These symbols come after S_LOCAL symbols for optimized variables, but before S_REGISTER and S_REGREL32 for unoptimized variables. gcc/ * dwarf2codeview.cc (enum cv_sym_type): Add S_FRAMEPROC. (write_s_frameproc): New function. (write_function): Call write_s_frameproc.
2024-08-25	Write CodeView information about optimized stack variables	Mark Harmstone	1	-9/+119
	Outputs S_DEFRANGE_REGISTER_REL symbols for optimized local variables that are on the stack, consisting of the stack register, the offset, and the code range for which this applies. gcc/ * dwarf2codeview.cc (enum cv_sym_type): Add S_DEFRANGE_REGISTER_REL. (write_defrange_register_rel): New function. (write_optimized_local_variable_loc): Add fbloc param, and call write_defrange_register_rel. (write_optimized_local_variable): Add fbloc param. (write_optimized_function_vars): Add fbloc param.
2024-08-25	Write CodeView information about enregistered optimized variables	Mark Harmstone	4	-39/+353
	Enable variable tracking when outputting CodeView debug information, and make it so that we issue debug symbols for optimized variables in registers. This consists of S_LOCAL symbols, which give the name and the type of local variables, followed by S_DEFRANGE_REGISTER symbols for the register and the code for which this applies. gcc/ * dwarf2codeview.cc (enum cv_sym_type): Add S_LOCAL and S_DEFRANGE_REGISTER. (write_s_local): New function. (write_defrange_register): New function. (write_optimized_local_variable_loc): New function. (write_optimized_local_variable): New function. (write_optimized_function_vars): New function. (write_function): Call write_optimized_function_vars if variable tracking enabled. * dwarf2out.cc (typedef var_loc_view): Move to dwarf2out.h. (struct dw_loc_list_struct): Likewise. * dwarf2out.h (typedef var_loc_view): Move from dwarf2out.h. (struct dw_loc_list_struct): Likewise. * opts.cc (finish_options): Enable variable tracking for CodeView.
2024-08-25	i386: Update STV's gains for TImode arithmetic right shifts on AVX2.	Roger Sayle	1	-8/+13
	This patch tweaks timode_scalar_chain::compute_convert_gain to better reflect the expansion of V1TImode arithmetic right shifts by the i386 backend. The comment "see ix86_expand_v1ti_ashiftrt" appears after "case ASHIFTRT" in compute_convert_gain, and the changes below attempt to better match the logic used there. The original motivating example is: __int128 m1; void foo() { m1 = (m1 << 8) >> 8; } which with -O2 -mavx2 we fail to convert to vector form due to the inappropriate cost of the arithmetic right shift. Instruction gain -16 for 7: {r103:TI=r101:TI>>0x8;clobber flags:CC;} Total gain: -3 Chain #1 conversion is not profitable This is reporting that the ASHIFTRT is four instructions worse using vectors than in scalar form, which is incorrect as the AVX2 expansion of this shift only requires three instructions (and the scalar form requires two). With more accurate costs in timode_scalar_chain::compute_convert_gain we now see (with -O2 -mavx2): Instruction gain -4 for 7: {r103:TI=r101:TI>>0x8;clobber flags:CC;} Total gain: 9 Converting chain #1... which results in: foo: vmovdqa m1(%rip), %xmm0 vpslldq $1, %xmm0, %xmm0 vpsrad $8, %xmm0, %xmm1 vpsrldq $1, %xmm0, %xmm0 vpblendd $7, %xmm0, %xmm1, %xmm0 vmovdqa %xmm0, m1(%rip) ret 2024-08-25 Roger Sayle <roger@nextmovesoftware.com> Uros Bizjak <ubizjak@gmail.com> gcc/ChangeLog * config/i386/i386-features.cc (compute_convert_gain) <case ASHIFTRT>: Update to match ix86_expand_v1ti_ashiftrt.
2024-08-25	Disable late-combine in another RISC-V test	Jeff Law	1	-1/+1
	Another test where the output was slightly twiddled by late-combine in which simply disabling late-combine seems to be the best option. > Running /home/jlaw/test/gcc/gcc/testsuite/gcc.target/riscv/riscv.exp ... > FAIL: gcc.target/riscv/cm_mv_rv32.c -Os check-function-bodies sum Pushing to the trunk. gcc/testsuite * gcc.target/riscv/cm_mv_rv32.c: Disable late-combine.
2024-08-25	[committed] Fix assembly scan for RISC-V VLS tests	Jeff Law	7	-7/+7
	Surya's IRA patch from June slightly improves the code we generate for the vls/calling-conventions tests on RISC-V. Specifically it removes an unnecessary move from the instruction stream. This (of course) broke those tests: > Running /home/jlaw/test/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp ... > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 > FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c -O3 -ftree-vectorize -mrvv-vector-bits=scalable scan-assembler-times mv\\s+s0,a0\\s+call\\s+memset\\s+mv\\s+a0,s0 3 This patch does the natural adjustment of those tests by dropping the moves from the scan. gcc/testsuite * gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: Update expected output. * gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c: Likewise.
2024-08-25	Turn off late-combine for a few risc-v specific tests	Jeff Law	4	-4/+4
	Just minor testsuite adjustments -- several of the shorten-memref tests are slightly twiddled by the late-combine pass: > Running /home/jlaw/test/gcc/gcc/testsuite/gcc.target/riscv/riscv.exp ... > FAIL: gcc.target/riscv/shorten-memrefs-2.c -Os scan-assembler store1a:\n(\t?\\.[^\n]\n)\taddi > XPASS: gcc.target/riscv/shorten-memrefs-3.c -Os scan-assembler-not load2a:\n.addi[ \t][at][0-9],[at][0-9],[0-9]* > FAIL: gcc.target/riscv/shorten-memrefs-5.c -Os scan-assembler store1a:\n(\t?\\.[^\n]\n)\taddi > FAIL: gcc.target/riscv/shorten-memrefs-8.c -Os scan-assembler store:\n(\t?\\.[^\n]\n)\taddi\ta[0-7],a[0-7],1 This patch just turns off the late-combine pass for those tests. Locally I'd adjusted all the shorten-memref patches, but a quick re-rest shows that only 4 tests seem affected right now. Anyway, pushing to the trunk to slightly clean up our test results. gcc/testsuite * gcc.target/riscv/shorten-memrefs-2.c: Turn off late-combine. * gcc.target/riscv/shorten-memrefs-3.c: Likewise. * gcc.target/riscv/shorten-memrefs-5.c: Likewise. * gcc.target/riscv/shorten-memrefs-8.c: Likewise.
2024-08-25	modula2 testsuite: new libc unit test	Gaius Mulley	2	-0/+75
	This patch provides a simple unit test for snprintf and atof against the libc definition module. gcc/testsuite/ChangeLog: * gm2/calling-c/libc/run/pass/calling-c-libc-run-pass.exp: New test. * gm2/calling-c/libc/run/pass/testlibcstr.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-08-25	Daily bump.	GCC Administrator	4	-1/+178

2024-08-24	modula2: Export all string to integral and fp number conversion functions	Gaius Mulley	1	-0/+84
	Export all string to integral and floating point number conversion functions (atof, atoi, atol, atoll, strtod, strtof, strtold, strtol, strtoll, strtoul and strtoull). gcc/m2/ChangeLog: * gm2-libs/libc.def (atof): Export unqualified. (atoi): Ditto. (atol): Ditto. (atoll): Ditto. (strtod): Ditto. (strtof): Ditto. (strtold): Ditto. (strtol): Ditto. (strtoll): Ditto. (strtoul): Ditto. (strtoull): Ditto. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net>