path: root/gcc
Age  Commit message  Author  Files  Lines
2024-08-27c++/coroutines: fix actor cases not being added to the current switch [PR109867]Arsen Arsenović2-41/+34
Previously, we were building and inserting case_labels manually, which led to them not being added into the currently running switch via c_add_case_label. This led to false diagnostics that the user could not act on. PR c++/109867 gcc/cp/ChangeLog: * coroutines.cc (expand_one_await_expression): Replace uses of build_case_label with finish_case_label. (build_actor_fn): Ditto. (create_anon_label_with_ctx): Remove now-unused function. gcc/testsuite/ChangeLog: * g++.dg/coroutines/torture/pr109867.C: New test. Reviewed-by: Iain Sandoe <iain@sandoe.co.uk>
2024-08-27m68k: Accept ASHIFT like MULT in address operandAndreas Schwab1-18/+40
When LRA pulls an address operand out of a MEM it canonicalizes a containing MULT into ASHIFT. Adjust the address decomposer to recognize this form. PR target/116413 * config/m68k/m68k.cc (m68k_decompose_index): Accept ASHIFT like MULT. (m68k_rtx_costs) [PLUS]: Likewise. (m68k_legitimize_address): Likewise.
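The equivalence this commit relies on is that (ashift x n) denotes the same value as (mult x (1 << n)). A minimal sketch of an index decomposer that accepts both forms follows; the `Code`, `IndexExpr` and `index_scale` names are hypothetical stand-ins, not GCC's RTL types, and the accepted scales 1, 2, 4 and 8 are an assumption about the target's addressing modes.

```cpp
#include <optional>

// Hypothetical, simplified stand-ins for RTL codes; not GCC's real types.
enum class Code { Mult, Ashift, Other };

struct IndexExpr {
  Code code;
  long operand;  // multiplier for Mult, shift count for Ashift
};

// Return the index scale if EXPR is a valid scaled index, treating
// (ashift x n) as equivalent to (mult x (1 << n)).
// Assumption: only scales 1, 2, 4 and 8 are addressable.
std::optional<int> index_scale(const IndexExpr &expr) {
  long scale = 0;
  switch (expr.code) {
    case Code::Mult:
      scale = expr.operand;
      break;
    case Code::Ashift:
      if (expr.operand < 0 || expr.operand > 3)
        return std::nullopt;
      scale = 1L << expr.operand;
      break;
    default:
      return std::nullopt;
  }
  if (scale == 1 || scale == 2 || scale == 4 || scale == 8)
    return static_cast<int>(scale);
  return std::nullopt;
}
```

With this shape, a MULT by 4 and an ASHIFT by 2 decompose to the same scale, so later code does not need to know which form LRA produced.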
2024-08-27c++: Don't show constructor internal name in error message [PR105483]Simon Martin5-9/+20
We mention 'X::__ct' instead of 'X::X' in the "names the constructor, not the type" error for this invalid code: === cut here === struct X {}; void g () { X::X x; } === cut here === The problem is that we use %<%T::%D%> to build the error message, while %qE does exactly what we need since we have DECL_CONSTRUCTOR_P. This is what this patch does. It also skips until the end of the statement and returns error_mark_node for this and the preceding if block, to avoid emitting extra (useless) errors. PR c++/105483 gcc/cp/ChangeLog: * parser.cc (cp_parser_expression_statement): Use %qE instead of incorrect %<%T::%D%>. Skip to end of statement and return error_mark_node in case of error. gcc/testsuite/ChangeLog: * g++.dg/parse/error36.C: Adjust test expectation. * g++.dg/tc1/dr147.C: Likewise. * g++.old-deja/g++.other/typename1.C: Likewise. * g++.dg/diagnostic/pr105483.C: New test.
2024-08-27RISC-V: Move helper functions above expand_const_vectorPatrick O'Neill1-66/+66
These subroutines will be used in expand_const_vector in a future patch. Relocate so expand_const_vector can use them. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_vector_init_insert_elems): Relocate. (expand_vector_init_trailing_same_elem): Ditto. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27RISC-V: Allow non-duplicate bool patterns in expand_const_vectorPatrick O'Neill1-15/+8
Currently we assert when encountering a non-duplicate boolean vector. This patch allows non-duplicate vectors to fall through to the gcc_unreachable and assert there. This will be useful when adding a catch-all pattern to emit costs and handle arbitrary vectors. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Allow non-duplicate to fall through other patterns before asserting. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27RISC-V: Handle 0.0 floating point pattern costing to match const_vector expanderPatrick O'Neill3-6/+15
The comment previously here stated that the Wc0/Wc1 cases are handled by the vi constraint but that is not true for the 0.0 Wc0 case. gcc/ChangeLog: * config/riscv/riscv-v.h (valid_vec_immediate_p): Add new helper. * config/riscv/riscv-v.cc (valid_vec_immediate_p): Ditto. (expand_const_vector): Use new helper. * config/riscv/riscv.cc (riscv_const_insns): Handle 0.0 floating-point case. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27RISC-V: Emit costs for bool and stepped const vectorsPatrick O'Neill3-52/+131
These cases are handled in the expander (riscv-v.cc:expand_const_vector). We need the vector builder to detect these cases so extract that out into a new riscv-v.h header file. gcc/ChangeLog: * config/riscv/riscv-v.cc (class rvv_builder): Move to riscv-v.h. * config/riscv/riscv.cc (riscv_const_insns): Emit placeholder costs for bool/stepped const vectors. * config/riscv/riscv-v.h: New file. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27RISC-V: Handle case when constant vector construction target rtx is not a registerPatrick O'Neill1-32/+41
This manifests in RTL that is optimized away which causes runtime failures in the testsuite. Update all patterns to use a temp result register if required. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Use tmp register if needed. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27RISC-V: Reorder insn cost match order to match corresponding expander match orderPatrick O'Neill1-9/+9
The corresponding expander (riscv-v.cc:expand_const_vector) matches const_vec_duplicate_p before const_vec_series_p. Reorder to match this behavior when calculating costs. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_const_insns): Relocate. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2024-08-27RISC-V: Fix vid const vector expander for non-npatterns size stepsPatrick O'Neill1-6/+42
Prior to this patch the expander would emit vectors like: { 0, 0, 5, 5, 10, 10, ...} as: { 0, 0, 2, 2, 4, 4, ...} This patch sets the step size to the requested value. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Fix STEP size in expander. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
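The two vectors quoted above can be reproduced with a small model. This is only a sketch: `stepped_vector` is a hypothetical helper, not the expander's code, and reading the old output as "step taken to be equal to the pattern count" is an inference from the commit title rather than something the message states.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a stepped constant vector: each group of NPATTERNS consecutive
// elements shares one value, and successive groups differ by STEP.
std::vector<std::int64_t> stepped_vector(std::size_t len,
                                         std::size_t npatterns,
                                         std::int64_t step) {
  std::vector<std::int64_t> v(len);
  for (std::size_t i = 0; i < len; ++i)
    v[i] = static_cast<std::int64_t>(i / npatterns) * step;
  return v;
}
```

Calling `stepped_vector(6, 2, 5)` yields the requested { 0, 0, 5, 5, 10, 10 }, while `stepped_vector(6, 2, 2)` yields the { 0, 0, 2, 2, 4, 4 } shape the expander used to emit.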
2024-08-27arm: Always use vmov.f64 instead of vmov.f32 with MVEChristophe Lyon2-11/+5
With MVE, vmov.f64 is always supported (no need for +fp.dp extension). This patch updates two patterns: - in movdi_vfp, we incorrectly checked TARGET_VFP_SINGLE || TARGET_HAVE_MVE instead of TARGET_VFP_SINGLE && !TARGET_HAVE_MVE, and didn't take into account these two possibilities when computing the length attribute. - in thumb2_movdf_vfp, we checked only TARGET_VFP_SINGLE. No need to update movdf_vfp, since it is enabled only for TARGET_ARM (which is not the case when MVE is enabled). The patch also updates gcc.target/arm/armv8_1m-fp64-move-1.c, to accept only vmov.f64 instead of vmov.f32. Tested on arm-none-eabi with: qemu/-mthumb/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto qemu/-mthumb/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto/-march=armv8.1-m.main+mve qemu/-mthumb/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto/-march=armv8.1-m.main+mve.fp qemu/-mthumb/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto/-march=armv8.1-m.main+mve.fp+fp.dp 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/vfp.md (movdi_vfp, thumb2_movdf_vfp): Handle MVE case. gcc/testsuite/ * gcc.target/arm/armv8_1m-fp64-move-1.c: Update expected code.
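The difference between the old and new movdi_vfp conditions is easiest to see as a truth table. The sketch below assumes the condition selects between a pair of vmov.f32 moves and a single vmov.f64; the `Target` struct and both function names are hypothetical, with the two flags standing in for TARGET_VFP_SINGLE and TARGET_HAVE_MVE.

```cpp
#include <string>

// Hypothetical flags standing in for TARGET_VFP_SINGLE and TARGET_HAVE_MVE.
struct Target {
  bool vfp_single;
  bool have_mve;
};

// Old condition: any MVE target was forced onto the two-instruction path.
std::string dreg_move_old(const Target &t) {
  return (t.vfp_single || t.have_mve) ? "vmov.f32 pair" : "vmov.f64";
}

// New condition: only single-precision-only targets without MVE use it.
std::string dreg_move_new(const Target &t) {
  return (t.vfp_single && !t.have_mve) ? "vmov.f32 pair" : "vmov.f64";
}
```

The two functions agree when MVE is off and differ exactly when it is on, which is the case the commit message describes.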
2024-08-27pr116174.c: Add the missing */H.J. Lu1-1/+1
* gcc.target/i386/pr116174.c: Add the missing */. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-08-27Extend check-function-bodies to allow label and directivesH.J. Lu3-8/+34
As PR target/116174 showed, we may need to verify labels and the directive order. Extend check-function-bodies to support matched output lines to allow labels and directives. gcc/ * doc/sourcebuild.texi (check-function-bodies): Add an optional argument for matched output lines. gcc/testsuite/ * gcc.target/i386/pr116174.c: Use check-function-bodies. * lib/scanasm.exp (parse_function_bodies): Append the line if $up_config(matched) matches the line. (check-function-bodies): Add an argument for matched. Set up_config(matched) to $matched. Append the expected line without $config(line_prefix) to function_regexp if it starts with ".L". Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
2024-08-27LRA: Fix setup_sp_offsetMichael Matz1-5/+8
This is part of making m68k work with LRA. See PR116429. In short: setup_sp_offset is internally inconsistent. It wants to setup the sp_offset for newly generated instructions. sp_offset for an instruction is always the state of the sp-offset right before that instruction. For that it starts at the (assumed correct) sp_offset of the instruction right after the given (new) sequence, and then iterates that sequence forward simulating its effects on sp_offset. That can't ever be right: either it needs to start at the front and simulate forward, or start at the end and simulate backward. The former seems to be the more natural way. Funnily the local variable holding that instruction is also called 'before'. This changes it to the first variant: start before the sequence, do one simulation step to get the sp-offset state in front of the sequence and then continue simulating. More details: in the problematic testcase we start with this situation (sp_off before 550 is 0): 550: [--sp] = 0 sp_off = 0 {pushexthisi_const} 551: [--sp] = 37 sp_off = -4 {pushexthisi_const} 552: [--sp] = r37 sp_off = -8 {movsi_m68k2} 554: [--sp] = r116 - r37 sp_off = -12 {subsi3} 556: call sp_off = -16 insn 554 doesn't match its constraints and needs some reloads: Creating newreg=262, assigning class DATA_REGS to r262 554: r262:SI=r262:SI-r37:SI REG_ARGS_SIZE 0x10 Inserting insn reload before: 996: r262:SI=r116:SI Inserting insn reload after: 997: [--%sp:SI]=r262:SI Considering alt=0 of insn 997: (0) =g (1) damSKT 1 Non pseudo reload: reject++ overall=1,losers=0,rld_nregs=0 Choosing alt 0 in insn 997: (0) =g (1) damSKT {*movsi_m68k2} (sp_off=-16) Note how insn 997 (the after-reload) now has sp_off=-16 already. It all goes downhill from there. We end up with these insns: 552: [--sp] = r37 sp_off = -8 {movsi_m68k2} 996: r262 = r116 sp_off = -12 554: r262 = r262 - r37 sp_off = -12 997: [--sp] = r262 sp_off = -16 (!!! 
should be -12) 556: call sp_off = -16 The call insn sp_off remains at the correct -16, but internally it's already inconsistent here. If the sp_off before an insn is -16, and that insn pre_decs sp, then the after-insn sp_off should be -20. PR target/116429 * lra.cc (setup_sp_offset): Start with sp_offset from before the new sequence, not from after.
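The "start before the sequence and simulate forward" variant can be sketched on the insns from the example. This is a simplified model, not lra.cc: `Insn` and `assign_sp_offsets` are hypothetical, and each insn is reduced to the amount by which it changes the stack pointer.

```cpp
#include <vector>

// Hypothetical model: each insn only records how it changes sp
// (-4 for a 4-byte pre-decrement push, 0 otherwise).
struct Insn {
  int uid;
  int sp_delta;
  int sp_off;  // offset in effect right BEFORE this insn
};

// Assign sp_off to every insn of a newly generated sequence, starting
// from the insn that precedes the sequence and simulating forward.
void assign_sp_offsets(const Insn &before, std::vector<Insn> &seq) {
  int off = before.sp_off + before.sp_delta;  // state after BEFORE
  for (Insn &insn : seq) {
    insn.sp_off = off;
    off += insn.sp_delta;
  }
}
```

Starting from insn 552 (sp_off = -8, one push), the sequence 996, 554, 997 gets -12, -12, -12, and the state after 997 is -16, which agrees with the call insn 556.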
2024-08-27LRA: Don't use 0 as initialization for sp_offsetMichael Matz1-3/+4
This is part of making m68k work with LRA. See PR116374. m68k has the property that sometimes the elimination offset between %sp and %argptr is zero. During setting up elimination infrastructure it's changes between sp_offset and previous_offset that feed into insns_with_changed_offsets that ultimately will setup looking at the instructions so marked. But the initial values for sp_offset and previous_offset are also zero. So if the target's INITIAL_ELIMINATION_OFFSET (called in update_reg_eliminate) is zero then nothing changes, the instructions in question don't get into the list to consider and the sp_offset tracking goes wrong. Solve this by initializing those members with -1 instead of zero. An initial offset of that value seems very unlikely, as it's in word-sized increments. This then also reveals a problem in eliminate_regs_in_insn where it always uses sp_offset-previous_offset as offset adjustment, even in the first_p pass. That was harmless when previous_offset was uninitialized as zero. But all the other code uses a different idiom of checking for first_p (or rather update_p which is !replace_p&&!first_p), and using sp_offset directly. So use that as well in eliminate_regs_in_insn. PR target/116374 * lra-eliminations.cc (init_elim_table): Use -1 as initializer. (update_reg_eliminate): Accept -1 as not-yet-used marker. (eliminate_regs_in_insn): Use previous_sp_offset only when not first_p.
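Why a zero initializer hides the first update can be shown with a reduced model of change detection. The `ElimEntry` struct and `update_offset` function below are hypothetical and much simpler than the elimination table in lra-eliminations.cc; they only illustrate the sentinel idea.

```cpp
// Hypothetical model of one elimination-table entry.
struct ElimEntry {
  long offset;
  long previous_offset;
};

// Record a freshly computed initial offset and report whether it differs
// from what was stored, i.e. whether dependent insns must be revisited.
bool update_offset(ElimEntry &e, long initial_offset) {
  bool changed = (e.offset != initial_offset);
  e.previous_offset = e.offset;
  e.offset = initial_offset;
  return changed;
}
```

An entry initialized to 0 reports no change when the real initial offset is 0, so nothing gets queued; an entry initialized to the sentinel -1 reports a change for the same input.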
2024-08-27final: go down ASHIFT in walk_alter_subregMichael Matz1-0/+1
When experimenting with m68k plus LRA one of the changes in the backend is to accept ASHIFTs (not only MULT) as scale code for address indices. When then not turning on LRA but using reload those addresses are presented to it which chokes on them. While reload is going away the change to make them work doesn't really hurt (and generally seems useful, as MULT and ASHIFT really are no different). So just add it. PR target/116413 * final.cc (walk_alter_subreg): Recurse on ASHIFT.
2024-08-27c++: Add most missing C++20 and C++23 names to cxxapi-data.csvJonathan Wakely3-905/+1288
This includes uncommenting the atomic_flag non-member functions, which were added by PR libstdc++/103934. Also generate a hint for std::ignore, which was recently tweaked to be more generally useful by P2968R2, which r15-2324 implemented. gcc/cp/ChangeLog: * cxxapi-data.csv: Add C++20 and C++23 names from <chrono>, <format>, <generator>, <iterator>, <print>, and <stdfloat>. Set cxx11 dialect for std::ignore in <tuple>. Uncomment atomic_flag functions from <atomic>. * std-name-hint.gperf: Regenerate. * std-name-hint.h: Regenerate.
2024-08-27c++: Add correct copyright dates to output of gen-cxxapi-file.pyJonathan Wakely1-1/+1
This ensures the generated output says something like 2022-2024 rather than just 2024. gcc/cp/ChangeLog: * gen-cxxapi-file.py: Fix copyright dates in generated output.
2024-08-27testsuite: Fix ending of comment in test casesTorbjörn SVENSSON21-21/+21
gcc/testsuite/ChangeLog: * gcc.dg/pr108757-1.c: Fixed dg-comment. * gcc.dg/pr71071.c: Likewise. * gcc.dg/tree-ssa/noreturn-1.c: Likewise. * gcc.dg/tree-ssa/pr56727.c: Likewise. * gcc.target/arc/loop-2.cpp: Likewise. * gcc.target/arc/loop-3.c: Likewise. * gcc.target/arc/pr9001107555.c: Likewise. * gcc.target/arm/armv8_1m-fp16-move-1.c: Likewise. * gcc.target/arm/armv8_1m-fp32-move-1.c: Likewise. * gcc.target/arm/armv8_1m-fp64-move-1.c: Likewise. * gcc.target/i386/amxint8-asmatt-1.c: Likewise. * gcc.target/i386/amxint8-asmintel-1.c: Likewise. * gcc.target/i386/avx512bw-vpermt2w-1.c: Likewise. * gcc.target/i386/avx512vbmi-vpermt2b-1.c: Likewise. * gcc.target/i386/endbr_immediate.c: Likewise. * gcc.target/i386/pr96539.c: Likewise. * gcc.target/i386/sse2-pr98461-2.c: Likewise. * gcc.target/m68k/pr39726.c: Likewise. * gcc.target/m68k/pr52076-1.c: Likewise. * gcc.target/m68k/pr52076-2.c: Likewise. * gcc.target/nvptx/v2si-vec-set-extract.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
2024-08-27Un-XFAIL 'gcc.dg/signbit-5.c' for GCNThomas Schwinge1-1/+0
It XPASSes after recent commit 5a3387938d4d95717cac29eecd0ba53e0ef9094d "testsuite: Add -fwrapv to signbit-5.c". gcc/testsuite/ * gcc.dg/signbit-5.c: Un-XFAIL for GCN.
2024-08-27Handle arithmetic on eliminated address indices [PR116413]Richard Sandiford2-10/+33
This patch fixes gcc.c-torture/compile/opout.c for m68k with LRA enabled. The test has: ... z (a, b) { return (int) &a + (int) &b + (int) x + (int) z; } so it adds the address of two incoming arguments. This ends up being treated as an LEA in which the "index" is the incoming argument pointer, which the LEA multiplies by 2. The incoming argument pointer is then eliminated, leading to: (plus:SI (plus:SI (ashift:SI (plus:SI (reg/f:SI 24 %argptr) (const_int -4 [0xfffffffffffffffc])) (const_int 1 [0x1])) (reg/f:SI 41 [ _6 ])) (const_int 20 [0x14])) In the address_info scheme, the innermost plus has to be treated as the index "term", since that's the thing that's subject to index_reg_class. gcc/ PR middle-end/116413 * rtl.h (address_info): Update commentary. * rtlanal.cc (valid_base_or_index_term_p): New function, split out from... (get_base_term, get_index_term): ...here. Handle elimination PLUSes.
2024-08-27lra: Don't apply eliminations to allocated registers [PR116321]Richard Sandiford1-9/+9
The sequence of events in this PR is that: - the function has many addresses in which only a single hard base register is acceptable. Let's call the hard register H. - IRA allocates that register to one of the pseudo base registers. Let's call the pseudo register P. - Some of the other addresses that require H occur when P is still live. - LRA therefore has to spill P. - When it reallocates P, LRA chooses to use FRAME_POINTER_REGNUM, which has been eliminated to the stack pointer. (This is ok, since the frame register is free.) - Spilling P causes LRA to reprocess the instruction that uses P. - When reprocessing the address that has P as its base, LRA first applies the new allocation, to get FRAME_POINTER_REGNUM, and then applies the elimination, to get the stack pointer. The last step seems wrong: the elimination should only apply to pre-existing uses of FRAME_POINTER_REGNUM, not to uses that result from allocating pseudos. Applying both means that we get the wrong register number, and therefore the wrong class. The PR is about an existing testcase that fails with LRA on m68k. gcc/ PR middle-end/116321 * lra-constraints.cc (get_hard_regno): Only apply eliminations to existing hard registers. (get_reg_class): Likewise.
2024-08-27c++, coroutines: The frame pointer is used in the helpers [PR116482].Iain Sandoe2-0/+31
We have a bogus warning about the coroutine state frame pointers being apparently unused in the resume and destroy functions. Fixed by making the parameters DECL_ARTIFICIAL. PR c++/116482 gcc/cp/ChangeLog: * coroutines.cc (coro_build_actor_or_destroy_function): Make the parameter decls DECL_ARTIFICIAL. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr116482.C: New test. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2024-08-27tree-optimization/116460 - ICE with DCE in forwpropRichard Biener2-10/+637
The following avoids removing stmts with defs that might still have uses in the IL before calling simple_dce_from_worklist which might remove those as that will wreck debug stmt generation. Instead first perform use-based DCE and then remove stmts which may have uses in code that CFG cleanup will remove. This requires tracking stmts in to_remove by their SSA def so we can check whether it was removed before without running into the issue that PHIs can be ggc_free()d upon removal. So this adds to_remove_defs in addition to to_remove which has to stay to track GIMPLE_NOPs we want to elide. PR tree-optimization/116460 * tree-ssa-forwprop.cc (pass_forwprop::execute): First do simple_dce_from_worklist and then remove stmts in to_remove. Track defs to be removed in to_remove_defs. * g++.dg/torture/pr116460.C: New testcase.
2024-08-27Fix another inline7.c test failure on sparc targetsBernd Edlinger1-1/+1
This new test was reported to be still failing on sparc targets. Here the number of DW_AT_ranges dropped to zero. The test should pass on this architecture with -Os, -O2 and -O3. I tried to improve also different known problematic targets, where only one subroutine had DW_AT_ranges: Those are armhf (arm with hard float), powerpc and powerpc64. The best option is to use -Os: So far the only one, where all two inline instances in this test had two DW_AT_ranges. gcc/testsuite/ChangeLog: PR other/116462 * gcc.dg/debug/dwarf2/inline7.c: Switch to -Os optimization.
2024-08-27RISC-V: Support IMM for operand 1 of ussub patternPan Li17-2/+423
This patch allows an immediate (IMM) for operand 1 of the ussub pattern, i.e. .SAT_SUB(x, 22), as in the example below. Form 2: #define DEF_SAT_U_SUB_IMM_FMT_2(T, IMM) \ T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \ { \ return x >= (T)IMM ? x - (T)IMM : 0; \ } DEF_SAT_U_SUB_IMM_FMT_2(uint64_t, 1022) It is almost the same as supporting an immediate for operand 0 of the ussub pattern, but allows the second operand to be the immediate instead of the first operand. The following test suite passes with this patch: 1. The rv64gcv full regression test. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_expand_ussub): Gen xmode for the second operand, aka y in parameter. * config/riscv/riscv.md (ussub<mode>3): Allow const_int for operand 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_u_sub_imm-5.c: New test. * gcc.target/riscv/sat_u_sub_imm-5_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-5_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-6.c: New test. * gcc.target/riscv/sat_u_sub_imm-6_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-6_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-7.c: New test. * gcc.target/riscv/sat_u_sub_imm-7_1.c: New test. * gcc.target/riscv/sat_u_sub_imm-7_2.c: New test. * gcc.target/riscv/sat_u_sub_imm-8.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-5.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-6.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-7.c: New test. * gcc.target/riscv/sat_u_sub_imm-run-8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
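The semantics of Form 2 can be checked on any host with a small stand-alone equivalent. The template below is a sketch written for illustration; `sat_u_sub_imm` is a hypothetical name and this is not the macro from sat_arith.h, only the same expression with the immediate as a template argument.

```cpp
#include <cstdint>

// Unsigned saturating subtraction of a compile-time immediate:
// x - Imm when that does not underflow, otherwise 0.
template <typename T, T Imm>
T sat_u_sub_imm(T x) {
  return x >= Imm ? static_cast<T>(x - Imm) : T{0};
}
```

For example, `sat_u_sub_imm<std::uint64_t, 1022>(2000)` gives 978, and any input below 1022 saturates to 0 instead of wrapping around.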
2024-08-27c++/modules: Fix include translation for already-seen headers [PR99243]Nathaniel Shead5-6/+31
After importing a header unit we learn about and setup any header modules that we transitively depend on. However, this causes 'set_filename' to fail an assertion if we then come across this header as an #include and attempt to translate it into a module. We still need to do this translation so that libcpp learns that this is a header unit, but we shouldn't error just because we've already seen it as an import. Instead this patch merely checks and errors to handle the case of a broken mapper implementation which supplies a different CMI path from the one we already got. As a drive-by fix, also make failing to find the CMI for a module be a fatal error: any further errors in the TU are unlikely to be helpful. PR c++/99243 gcc/cp/ChangeLog: * module.cc (module_state::set_filename): Handle repeated calls to 'set_filename' as long as the CMI path matches. (maybe_translate_include): Adjust comment. gcc/testsuite/ChangeLog: * g++.dg/modules/map-2.C: Prune additional fatal error message. * g++.dg/modules/inc-xlate-4_a.H: New test. * g++.dg/modules/inc-xlate-4_b.H: New test. * g++.dg/modules/inc-xlate-4_c.H: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
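The relaxed check described above can be sketched as follows. This is a simplified, hypothetical model: `ModuleState` here is a one-field struct and `set_filename` returns a bool, whereas the real module_state::set_filename in module.cc reports errors through GCC's diagnostics.

```cpp
#include <string>

// Hypothetical stand-in for the per-module state; only the CMI path.
struct ModuleState {
  std::string filename;
};

// Accept the first CMI path, and accept repeated calls as long as the
// path matches the one already recorded. A mismatch (for example from a
// broken mapper) is reported by returning false.
bool set_filename(ModuleState &m, const std::string &cmi_path) {
  if (m.filename.empty()) {
    m.filename = cmi_path;
    return true;
  }
  return m.filename == cmi_path;
}
```

Seeing the same header first as an import and then as a translated #include calls this twice with the same path, which is now accepted.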
2024-08-27c++/modules: Clean up include translation [PR110980]Nathaniel Shead5-9/+29
Currently the handling of include translation is confusing to read, using a tri-state integer without much clarity on what different states mean. This patch cleans this up to use explicit enumerators indicating the different possible states instead, and fixes a bug where the option '-flang-info-include-translate' ended up being accidentally unusable. PR c++/110980 gcc/cp/ChangeLog: * module.cc (maybe_translate_include): Clean up. gcc/testsuite/ChangeLog: * g++.dg/modules/inc-xlate-2_a.H: New test. * g++.dg/modules/inc-xlate-2_b.H: New test. * g++.dg/modules/inc-xlate-3.h: New test. * g++.dg/modules/inc-xlate-3_a.H: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-08-27combine.cc (make_more_copies): Copy attributes from the original pseudo, PR115883Hans-Peter Nilsson1-0/+6
The first of the late-combine passes propagates some of the copies made during the (in-time-)combine pass in make_more_copies into the users of the "original" pseudo registers and removes the "old" pseudos. That effectively removes attributes such as REG_POINTER, which matter to LRA. The quoted PR is for an ICE-manifesting bug that was exposed by the late-combine pass and went back to hiding with this patch until commit r15-2937-g3673b7054ec2, the fix for PR116236, when it was actually fixed. To wit, this patch is only incidentally related to that bug. In other words, the REG_POINTER attribute should not be required for LRA to work correctly. This patch merely restores the state of those propagated register uses to what it was before late-combine. For reasons not investigated, this fixes a failing test "FAIL: gcc.dg/guality/pr54200.c -Og -DPREVENT_OPTIMIZATION line 20 z == 3" for x86_64-linux-gnu. PR middle-end/115883 * combine.cc (make_more_copies): Copy attributes from the original pseudo to the new copy.
2024-08-27c++/coros: do not assume coros don't nest [PR113457]Arsen Arsenović3-6/+216
In the testcase presented in the PR, during template expansion, a tsubst of an operand causes a lambda coroutine to be processed, causing it to get an initial suspend and final suspend. The code for assigning awaitable var names (get_awaitable_var) assumed that the sequence Is -> Is -> Fs -> Fs is impossible (i.e. that one could only 'open' one coroutine before closing it at a time), and reset the counter used for unique numbering each time a final suspend occurred. This assumption is false in a few cases, usually when lambdas are involved. Instead of storing this counter in a static-storage variable, we can store it in coroutine_info. This struct is local to each function, so we don't need to worry about "cross-contamination" nor resetting. PR c++/113457 gcc/cp/ChangeLog: * coroutines.cc (struct coroutine_info): Add integer field awaitable_number. This is a counter used for assigning unique names to awaitable temporaries. (get_awaitable_var): Use awaitable_number from coroutine_info instead of the static int awn. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr113457-1.C: New test. * g++.dg/coroutines/pr113457.C: New test.
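The effect of moving the counter into per-function state can be sketched with the Is -> Is -> Fs -> Fs sequence. Everything here is a simplified illustration: `CoroutineInfo` only mirrors the idea of the awaitable_number field, and `next_awaitable_name` and the name format are hypothetical, not what get_awaitable_var produces.

```cpp
#include <string>

// Hypothetical per-coroutine state; mirrors the idea of keeping the
// numbering counter next to the function it belongs to.
struct CoroutineInfo {
  int awaitable_number = 0;
};

// Produce a name that is unique within one coroutine. No global state is
// touched, so nested coroutines cannot disturb each other's numbering.
std::string next_awaitable_name(CoroutineInfo &info, const char *prefix) {
  return std::string(prefix) + "_" + std::to_string(info.awaitable_number++);
}
```

With one `CoroutineInfo` for the outer function and one for the nested lambda, the outer coroutine still gets numbers 0 and 1 for its initial and final suspend even though the inner coroutine was processed in between.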
2024-08-26coroutines: diagnose usage of alloca in coroutinesArsen Arsenović2-0/+33
We do not support it currently, and the resulting memory can only be used inside a single resumption, so best not confuse the user with it. PR c++/115858 - Incompatibility of coroutines and alloca() gcc/ChangeLog: * coroutine-passes.cc (execute_early_expand_coro_ifns): Emit a sorry if a statement is an alloca call. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr115858.C: New test.
2024-08-26diagnostics: move output formats from diagnostic.{c,h} to their own filesDavid Malcolm12-257/+359
In particular, move the classic text output code to a diagnostic-text.cc (analogous to -json.cc and -sarif.cc). No functional change intended. gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add diagnostic-format-text.o. * diagnostic-format-json.cc: Include "diagnostic-format.h". * diagnostic-format-sarif.cc: Likewise. * diagnostic-format-text.cc: New file, using material from diagnostics.cc. * diagnostic-global-context.cc: Include "diagnostic-format.h". * diagnostic-format-text.h: New file, using material from diagnostics.h. * diagnostic-format.h: New file, using material from diagnostics.h. * diagnostic.cc: Include "diagnostic-format.h" and "diagnostic-format-text.h". (diagnostic_text_output_format::~diagnostic_text_output_format): Move to diagnostic-format-text.cc. (diagnostic_text_output_format::on_report_diagnostic): Likewise. (diagnostic_text_output_format::on_diagram): Likewise. (diagnostic_text_output_format::print_any_cwe): Likewise. (diagnostic_text_output_format::print_any_rules): Likewise. (diagnostic_text_output_format::print_option_information): Likewise. * diagnostic.h (class diagnostic_output_format): Move to diagnostic-format.h. (class diagnostic_text_output_format): Move to diagnostic-format-text.h. (diagnostic_output_format_init): Move to diagnostic-format.h. (diagnostic_output_format_init_json_stderr): Likewise. (diagnostic_output_format_init_json_file): Likewise. (diagnostic_output_format_init_sarif_stderr): Likewise. (diagnostic_output_format_init_sarif_file): Likewise. (diagnostic_output_format_init_sarif_stream): Likewise. * gcc.cc: Include "diagnostic-format.h". * opts.cc: Include "diagnostic-format.h". gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic_group_plugin.c: Include "diagnostic-format-text.h". Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26diagnostics: consolidate on_{begin,end}_diagnostic into on_report_diagnosticDavid Malcolm4-176/+171
Previously diagnostic_context::report_diagnostic had, after the call to pp_format (phases 1 and 2 of formatting the message): m_output_format->on_begin_diagnostic (*diagnostic); pp_output_formatted_text (this->printer, m_urlifier); if (m_show_cwe) print_any_cwe (*diagnostic); if (m_show_rules) print_any_rules (*diagnostic); if (m_show_option_requested) print_option_information (*diagnostic, orig_diag_kind); m_output_format->on_end_diagnostic (*diagnostic, orig_diag_kind); This patch replaces all of the above with a single call to m_output_format->on_report_diagnostic (*diagnostic, orig_diag_kind); moving responsibility for phase 3 of formatting and printing the result from diagnostic_context to the output format. This simplifies diagnostic_context::report_diagnostic and allows us to move the code that prints CWEs, rules, and option information in textual form from diagnostic_context to diagnostic_text_output_format, where it belongs. No functional change intended. gcc/ChangeLog: * diagnostic-format-json.cc (json_output_format::on_begin_diagnostic): Delete. (json_output_format::on_end_diagnostic): Rename to... (json_output_format::on_report_diagnostic): ...this and add call to pp_output_formatted_text. (diagnostic_output_format_init_json): Drop unnecessary calls to disable textual printing of CWEs, rules, and options. * diagnostic-format-sarif.cc (sarif_builder::end_diagnostic): Rename to... (sarif_builder::on_report_diagnostic): ...this and add call to pp_output_formatted_text. (sarif_output_format::on_begin_diagnostic): Delete. (sarif_output_format::on_end_diagnostic): Rename to... (sarif_output_format::on_report_diagnostic): ...this and update call to m_builder accordingly. (diagnostic_output_format_init_sarif): Drop unnecessary calls to disable textual printing of CWEs, rules, and options. * diagnostic.cc (diagnostic_context::print_any_cwe): Convert to... (diagnostic_text_output_format::print_any_cwe): ...this. 
(diagnostic_context::print_any_rules): Convert to... (diagnostic_text_output_format::print_any_rules): ...this. (diagnostic_context::print_option_information): Convert to... (diagnostic_text_output_format::print_option_information): ...this. (diagnostic_context::report_diagnostic): Replace calls to the output format's on_begin_diagnostic, to pp_output_formatted_text, printing CWE, rules, option info, and the call to the format's on_end_diagnostic with a call to the format's on_report_diagnostic. (diagnostic_text_output_format::on_begin_diagnostic): Delete. (diagnostic_text_output_format::on_end_diagnostic): Delete. (diagnostic_text_output_format::on_report_diagnostic): New vfunc, which effectively does the on_begin_diagnostic, the call to pp_output_formatted_text, the calls for printing CWE, rules, option info, and the call to the diagnostic_finalizer. * diagnostic.h (diagnostic_output_format::on_begin_diagnostic): Delete. (diagnostic_output_format::on_end_diagnostic): Delete. (diagnostic_output_format::on_report_diagnostic): New. (diagnostic_text_output_format::on_begin_diagnostic): Delete. (diagnostic_text_output_format::on_end_diagnostic): Delete. (diagnostic_text_output_format::on_report_diagnostic): New. (class diagnostic_context): Add friend class diagnostic_text_output_format. (diagnostic_context::get_urlifier): New accessor. (diagnostic_context::print_any_cwe): Move decl... (diagnostic_text_output_format::print_any_cwe): ...to here. (diagnostic_context::print_any_rules): Move decl... (diagnostic_text_output_format::print_any_rules): ...to here. (diagnostic_context::print_option_information): Move decl... (diagnostic_text_output_format::print_option_information): ...to here. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
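The shape of the consolidation, one virtual entry point per output format instead of a begin/end pair around context-owned printing, can be sketched as below. This is a heavily simplified illustration: the class and member names echo the commit but the types, the signatures and the string return value are hypothetical and do not match diagnostic.h.

```cpp
#include <string>

// Hypothetical, reduced diagnostic: a message plus the controlling option.
struct Diagnostic {
  std::string message;
  std::string option;
};

// Each output format owns the whole "phase 3" step behind one hook.
class OutputFormat {
public:
  virtual ~OutputFormat() = default;
  virtual std::string on_report_diagnostic(const Diagnostic &d) = 0;
};

// The text format now prints the option information itself, instead of
// the context doing so between a begin hook and an end hook.
class TextOutputFormat : public OutputFormat {
public:
  std::string on_report_diagnostic(const Diagnostic &d) override {
    std::string out = d.message;
    if (!d.option.empty())
      out += " [" + d.option + "]";
    return out + "\n";
  }
};
```

A structured format could override the same hook and never emit the bracketed text at all, which is why it no longer has to switch that printing off.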
2024-08-26testsuite: add event IDs to multithreaded event plugin testDavid Malcolm4-20/+38
Add test coverage of "%@" in event messages in a multithreaded execution path. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-paths-multithreaded-inline-events.c: Update expected output. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py: Likewise. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-separate-events.c: Likewise. * gcc.dg/plugin/diagnostic_plugin_test_paths.c (test_diagnostic_path::add_event_2): Return the id of the added event. (test_diagnostic_path::add_event_2_with_event_id): New. (example_4): Add event IDs to the deadlock messages indicating where the locks were acquired. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26testsuite: generalize support for Python tests for SARIF outputDavid Malcolm8-16/+208
In r15-2354-g4d1f71d49e396c I added the ability to use Python to write tests of SARIF output via a new "run-sarif-pytest" based on "run-gcov-pytest", with a sarif.py support script in testsuite/gcc.dg/sarif-output. This followup patch: (a) removes the limitation of such tests needing to be in testsuite/gcc.dg/sarif-output by moving sarif.py to testsuite/lib and adding logic to add that directory to PYTHONPATH when invoking pytest. (b) uses this to replace fragile regexp-based tests in gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c with Python logic that verifies the structure within the generated JSON, and to add test coverage for SARIF output relating to GCC plugins. gcc/ChangeLog: * diagnostic-format-sarif.cc: Add comments noting that we don't yet capture any diagnostic_metadata::rules associated with a diagnostic. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-metadata-sarif.c: New test, based on diagnostic-test-metadata.c. * gcc.dg/plugin/diagnostic-test-metadata-sarif.py: New script. * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.c: Replace scan-sarif-file directives with run-sarif-pytest, to run... * gcc.dg/plugin/diagnostic-test-paths-multithreaded-sarif.py: ...this new test. * gcc.dg/plugin/plugin.exp (plugin_test_list): Add diagnostic-test-metadata-sarif.c. * gcc.dg/sarif-output/sarif.py: Move to... * lib/sarif.py: ...here. * lib/scansarif.exp (run-sarif-pytest): Prepend "lib" to PYTHONPATH before running python scripts. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26pretty-print: fixes to selftestsDavid Malcolm1-4/+35
Add selftest coverage for %{ and %} in pretty-print.cc. No functional change intended. gcc/ChangeLog: * pretty-print.cc (selftest::test_urls): Make static. (selftest::test_urls_from_braces): New. (selftest::test_null_urls): Make static. (selftest::test_urlification): Likewise. (selftest::pretty_print_cc_tests): Call test_urls_from_braces. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26json.h: fix typo in commentDavid Malcolm1-1/+1
gcc/ChangeLog: * json.h: Fix typo in comment about missing INCLUDE_MEMORY. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-08-26c++: Check template parameters in member class template specialization [PR115716]Simon Martin3-0/+40
We currently ICE upon the following invalid code, because we don't check that the template parameters in a member class template specialization are correct. === cut here === template <typename T> struct x { template <typename U> struct y { typedef T result2; }; }; template<> template<typename U, typename> struct x<int>::y { typedef double result2; }; int main() { x<int>::y<int>::result2 xxx2; } === cut here === This patch fixes the PR by calling redeclare_class_template. PR c++/115716 gcc/cp/ChangeLog: * pt.cc (maybe_process_partial_specialization): Call redeclare_class_template. gcc/testsuite/ChangeLog: * g++.dg/template/spec42.C: New test. * g++.dg/template/spec43.C: New test.
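For contrast with the rejected code above, here is a minimal sketch of a well-formed member class template specialization, in which the inner template parameter list matches the primary declaration. It is an illustration only and is not taken from the new spec42.C or spec43.C tests; the static_assert messages are assumptions added for readability.

```cpp
#include <type_traits>

// Primary template: y<U>::result2 names the enclosing T.
template <typename T>
struct x {
  template <typename U>
  struct y {
    typedef T result2;
  };
};

// Well-formed specialization of the member class template for x<int>.
// The inner parameter list has exactly one type parameter, as in the
// primary declaration of y.
template <>
template <typename U>
struct x<int>::y {
  typedef double result2;
};

// x<char> still uses the primary member template.
static_assert(std::is_same<x<char>::y<int>::result2, char>::value,
              "primary member template expected for x<char>");
// x<int> picks up the specialization.
static_assert(std::is_same<x<int>::y<int>::result2, double>::value,
              "specialized member template expected for x<int>");
```

The invalid test case differs only in the extra unnamed `typename` in the inner parameter list, which is the mismatch the patch now diagnoses instead of crashing.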
2024-08-26Remove an unneeded include that was added by mistake.Andi Kleen1-1/+0
gcc/ChangeLog: * tree-if-conv.cc: Remove unneeded include from last change.
2024-08-26Fix bootstap-errors due to enabling -gvariable-location-viewsBernd Edlinger3-3/+3
This recent change triggered various bootstrap errors, mostly on x86 targets, because line info advance address entries were output in the wrong section table. The switch to the wrong line table happened in dwarf2out_set_ignored_loc. It must use the same section as the earlier called dwarf2out_switch_text_section. ft32-elf was also affected, because the assembler choked on something as simple as ".2byte .LM2-.LM1". Fortunately it is able to use native location views; the configure test was just not executed because the ft32 "nop" instruction was missing. gcc/ChangeLog: PR debug/116470 * configure.ac: Add the "nop" instruction for cpu type ft32. * configure: Regenerate. * dwarf2out.cc (dwarf2out_set_ignored_loc): Use the correct line info section.
2024-08-26tree-optimization/116460 - improve forwprop compile-timeRichard Biener1-6/+7
The following improves forwprop block reachability, an issue I noticed when debugging PR116460 and which is also noted in the comment. It avoids processing blocks in natural loops determined unreachable, thereby making the issue in PR116460 latent. PR tree-optimization/116460 * tree-ssa-forwprop.cc (pass_forwprop::execute): Do not process blocks in unreachable natural loops.
2024-08-26Delay edge removal in forwpropRichard Biener1-9/+25
SSA forwprop has switch simplification code that calls remove edge and, as a side effect, releases dominator info. For a followup we want to retain that info, so the following delays removing edges until the end of the pass. As usual we have to deal with parts of the edge vanishing due to EH/abnormal pruning, so record edges as basic-block index pairs and remove them only when they are still there. * tree-ssa-forwprop.cc (simplify_gimple_switch_label_vec): Delay removing edges and releasing dominator info, instead record into edges_to_remove vector. (simplify_gimple_switch): Pass through vector of edges to remove. (pass_forwprop::execute): Likewise. Remove queued edges.
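The deferred-removal idea can be sketched on a toy graph. Everything here (ToyCfg, Edge, flush_queued_removals) is a hypothetical stand-in, not GCC's edge or basic_block machinery: edges are queued as index pairs during the pass, then removed at the end only if they still exist.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy CFG: an edge is a (source index, destination index) pair.
using Edge = std::pair<int, int>;

struct ToyCfg {
  std::set<Edge> edges;

  bool has_edge(int src, int dst) const {
    return edges.count(Edge(src, dst)) != 0;
  }
  void remove_edge(int src, int dst) { edges.erase(Edge(src, dst)); }
};

// Flush the queue at the end of the "pass". An edge that vanished in the
// meantime is skipped. Returns the number of edges actually removed.
inline int flush_queued_removals(ToyCfg &cfg,
                                 const std::vector<Edge> &edges_to_remove) {
  int removed = 0;
  for (const Edge &e : edges_to_remove) {
    if (cfg.has_edge(e.first, e.second)) {
      cfg.remove_edge(e.first, e.second);
      ++removed;
    }
  }
  return removed;
}

// Usage: one queued edge disappears before the flush, one is still there.
inline void demo_delayed_removal() {
  ToyCfg cfg;
  cfg.edges = {Edge(0, 1), Edge(1, 2), Edge(1, 3)};
  std::vector<Edge> queue = {Edge(1, 2), Edge(1, 3)};
  cfg.remove_edge(1, 3);  // simulates pruning by some other cleanup
  assert(flush_queued_removals(cfg, queue) == 1);
  assert(cfg.has_edge(0, 1));
}
```

Storing index pairs rather than pointers means a stale queue entry is simply a failed lookup, which matches the "remove them only when they are still there" wording of the commit message.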
2024-08-26vect: Fix STMT_VINFO_DEF_TYPE check for odd/even widen mult [PR116348]Xi Ruoyao2-2/+15
After fixing PR116142 some code started to trigger an ICE with -O3 -march=znver4. Per Richard Biener who actually made this fix: "supportable_widening_operation fails at transform time - that's likely because vectorizable_reduction "puns" defs to internal_def" so the check should use STMT_VINFO_REDUC_DEF instead of checking if STMT_VINFO_DEF_TYPE is vect_reduction_def. gcc/ChangeLog: PR tree-optimization/116348 * tree-vect-stmts.cc (supportable_widening_operation): Use STMT_VINFO_REDUC_DEF (x) instead of STMT_VINFO_DEF_TYPE (x) == vect_reduction_def. gcc/testsuite/ChangeLog: PR tree-optimization/116348 * gcc.c-torture/compile/pr116438.c: New test. Co-authored-by: Richard Biener <rguenther@suse.de>
2024-08-26Match: Add int type fits check for .SAT_ADD imm operandPan Li58-9/+443
This patch adds a strict check for the imm operand of .SAT_ADD matching. Previously there was no type checking for the imm operand, which could result in unexpected IL being caught by the .SAT_ADD pattern. We leverage int_fits_type_p here to make sure the imm operand is an int that fits the result type of the .SAT_ADD. For example: Fits uint8_t: uint8_t a; uint8_t sum = .SAT_ADD (a, 12); uint8_t sum = .SAT_ADD (a, 12u); uint8_t sum = .SAT_ADD (a, 126u); uint8_t sum = .SAT_ADD (a, 128u); uint8_t sum = .SAT_ADD (a, 228); uint8_t sum = .SAT_ADD (a, 223u); Does not fit uint8_t: uint8_t a; uint8_t sum = .SAT_ADD (a, -1); uint8_t sum = .SAT_ADD (a, 256u); uint8_t sum = .SAT_ADD (a, 257); The test suites below passed for this patch: * The rv64gcv full regression test. * The x86 bootstrap test. * The x86 full regression test. gcc/ChangeLog: * match.pd: Add int_fits_type_p check for .SAT_ADD imm operand. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add test helper macros. * gcc.target/riscv/sat_u_add_imm-11.c: Adjust test case for imm. * gcc.target/riscv/sat_u_add_imm-12.c: Ditto. * gcc.target/riscv/sat_u_add_imm-15.c: Ditto. * gcc.target/riscv/sat_u_add_imm-16.c: Ditto. * gcc.target/riscv/sat_u_add_imm_type_check-1.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-10.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-11.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-12.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-13.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-14.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-15.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-16.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-17.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-18.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-19.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-2.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-20.c: New test.
* gcc.target/riscv/sat_u_add_imm_type_check-21.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-22.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-23.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-24.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-25.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-26.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-27.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-28.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-29.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-3.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-30.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-31.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-32.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-33.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-34.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-35.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-36.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-37.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-38.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-39.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-4.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-40.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-41.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-42.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-43.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-44.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-45.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-46.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-47.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-48.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-49.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-5.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-50.c: New test. 
* gcc.target/riscv/sat_u_add_imm_type_check-51.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-52.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-6.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-7.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-8.c: New test. * gcc.target/riscv/sat_u_add_imm_type_check-9.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
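As a rough illustration of the two ideas in the .SAT_ADD commit above, the sketch below shows an unsigned saturating add for uint8_t and a range check on an immediate. Both helpers (sat_add_u8, imm_fits_u8) are hypothetical names invented for this example; imm_fits_u8 only mimics the notion of "the constant is representable in the result type" and is not the match.pd pattern or int_fits_type_p itself.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating add: if the 8-bit sum wraps around, clamp to 255.
inline std::uint8_t sat_add_u8(std::uint8_t a, std::uint8_t b) {
  std::uint8_t sum = static_cast<std::uint8_t>(a + b);
  return sum < a ? static_cast<std::uint8_t>(UINT8_MAX) : sum;
}

// Hypothetical stand-in for the fits check: an immediate is acceptable
// only if its value is representable in uint8_t.
inline bool imm_fits_u8(long long imm) {
  return imm >= 0 && imm <= UINT8_MAX;
}

// Usage, mirroring the immediates listed in the commit message.
inline void demo_sat_add() {
  assert(sat_add_u8(200, 12) == 212);  // no overflow, plain sum
  assert(sat_add_u8(250, 12) == 255);  // overflow, clamped
  assert(imm_fits_u8(228));            // listed as fitting uint8_t
  assert(!imm_fits_u8(-1));            // listed as not fitting
  assert(!imm_fits_u8(256));           // listed as not fitting
}
```

The immediates that fail the check are exactly those for which truncation to uint8_t would change the value, so matching them as a uint8_t saturating add would change the meaning of the code.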
2024-08-26expand: Use the correct mode for store flags for popcount [PR116480]Andrew Pinski3-1/+18
When expanding popcount used for equal to 1 (or rather __builtin_stdc_has_single_bit), the wrong mode was being used for the mode of the store flags. We were using the mode of the argument to popcount, but since popcount's return value is always int, the mode of the expansion here should have been the mode of the return type rather than the argument. Built and tested on aarch64-linux-gnu with no regressions. Also bootstrapped and tested on x86_64-linux-gnu. PR middle-end/116480 gcc/ChangeLog: * internal-fn.cc (expand_POPCOUNT): Use the correct mode for store flags. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr116480-1.c: New test. * gcc.dg/torture/pr116480-2.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
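The source-level shape involved is a popcount compared against 1, where the argument is wider than the int result. The sketch below is an illustration of that shape only, not the reproducer from pr116480-1.c or pr116480-2.c; the helper names are hypothetical, and __builtin_popcountll is the GCC/Clang builtin.

```cpp
#include <cassert>

// True when exactly one bit of x is set. The argument is 64 bits wide, but
// __builtin_popcountll returns int, so the comparison with 1 is done in int.
inline bool has_single_bit_u64(unsigned long long x) {
  int count = __builtin_popcountll(x);
  return count == 1;
}

// Equivalent bit trick, for comparison: a power of two has no bits in
// common with its predecessor.
inline bool has_single_bit_u64_alt(unsigned long long x) {
  return x != 0 && (x & (x - 1)) == 0;
}

inline void demo_single_bit() {
  assert(has_single_bit_u64(1ULL << 40));
  assert(!has_single_bit_u64(0));
  assert(!has_single_bit_u64(3));
}
```

Because the argument type and the result type differ in width, the mode used for the comparison result has to follow the int return type, which is the point the commit message makes.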
2024-08-26i386: Add bf8 -> fp16 intrinHaochen Jiang4-5/+109
Since BF8 and FP16 have the same number of exponent bits, the type conversion between them is just a cast of the fraction part. We will use a sequence of instructions instead of new instructions to do that. For convenience, intrinsics are also provided. gcc/ChangeLog: * config/i386/avx10_2-512convertintrin.h (_mm512_cvtpbf8_ph): New. (_mm512_mask_cvtpbf8_ph): Ditto. (_mm512_maskz_cvtpbf8_ph): Ditto. * config/i386/avx10_2convertintrin.h (_mm_cvtpbf8_ph): Ditto. (_mm_mask_cvtpbf8_ph): Ditto. (_mm_maskz_cvtpbf8_ph): Ditto. (_mm256_cvtpbf8_ph): Ditto. (_mm256_mask_cvtpbf8_ph): Ditto. (_mm256_maskz_cvtpbf8_ph): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-convert-1.c: Add tests for new intrin. * gcc.target/i386/avx10_2-convert-1.c: Ditto.
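A scalar sketch of why the conversion reduces to bit placement. It assumes BF8 uses 1 sign bit, 5 exponent bits and 2 fraction bits, while FP16 uses 1, 5 and 10; the commit message only states that the exponent widths match. The helper name is hypothetical, and this shows the bit-level relationship, not the instruction sequence the new intrinsics expand to.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layouts (not stated in the commit message):
//   BF8 : s eeeee ff          (8 bits)
//   FP16: s eeeee ffffffffff  (16 bits)
// With identical sign and exponent fields, widening only appends eight
// zero fraction bits, i.e. the BF8 byte becomes the high byte of the FP16.
inline std::uint16_t bf8_bits_to_fp16_bits(std::uint8_t bf8) {
  return static_cast<std::uint16_t>(static_cast<std::uint16_t>(bf8) << 8);
}

inline void demo_bf8_to_fp16() {
  // 0 01111 00 encodes 1.0 under the assumed BF8 layout;
  // 0x3C00 is 1.0 in FP16.
  assert(bf8_bits_to_fp16_bits(0x3C) == 0x3C00);
  // 1 10000 00 encodes -2.0; 0xC000 is -2.0 in FP16.
  assert(bf8_bits_to_fp16_bits(0xC0) == 0xC000);
}
```

No rounding or exponent rebiasing is needed under these assumptions, which is consistent with the commit's choice to reuse existing instructions.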
2024-08-26AVX10.2: Support compare instructionsZhang, Jun4-27/+183
gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_ssecom_setcc): Mention behavior change on flags. (ix86_expand_sse_comi): Handle AVX10.2 behavior. (ix86_expand_sse_comi_round): Ditto. (ix86_expand_round_builtin): Ditto. (ix86_expand_builtin): Change function call. * config/i386/i386.md (UNSPEC_COMX): New unspec. * config/i386/sse.md (avx10_2_v<unord>comx<ssemodesuffix><round_saeonly_name>): New. (<sse>_<unord>comi<round_saeonly_name>): Add HFmode. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-compare-1.c: New test. Co-authored-by: Haochen Jiang <haochen.jiang@intel.com> Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
2024-08-26AVX10.2: Support vector copy instructionsZhang, Jun9-53/+356
gcc/ChangeLog: * config.gcc: Add avx10_2copyintrin.h. * config/i386/i386.md (avx10_2): New isa attribute. * config/i386/immintrin.h: Include avx10_2copyintrin.h. * config/i386/sse.md (sse_movss_<mode>): Add new constraints to handle AVX10.2. (vec_set<mode>_0): Ditto. (@vec_set<mode>_0): Ditto. (vec_set<mode>_0): Ditto. (avx512fp16_mov<mode>): Ditto. (*vec_set<mode>_0_1): New split. * config/i386/avx10_2copyintrin.h: New file. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-vmovd-1.c: New test. * gcc.target/i386/avx10_2-vmovd-2.c: Ditto. * gcc.target/i386/avx10_2-vmovw-1.c: Ditto. * gcc.target/i386/avx10_2-vmovw-2.c: Ditto.
2024-08-26AVX10.2: Support minmax instructionsMo, Zewei28-1/+2555
gcc/ChangeLog: * config.gcc: Add avx10_2-512minmaxintrin.h and avx10_2minmaxintrin.h. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (V8BF, V8BF, V8BF, INT, V8BF, UQI), (V16BF, V16BF, V16BF, INT, V16BF, UHI), (V32BF, V32BF, V32BF, INT, V32BF, USI), (V8HF, V8HF, V8HF, INT, V8HF, UQI), (V8DF, V8DF, V8DF, INT, V8DF, UQI, INT), (V32HF, V32HF, V32HF, INT, V32HF, USI, INT), (V16HF, V16HF, V16HF, INT, V16HF, UHI, INT), (V16SF, V16SF, V16SF, INT, V16SF, UHI, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle V8BF_FTYPE_V8BF_V8BF_INT_V8BF_UQI, V16BF_FTYPE_V16BF_V16BF_INT_V16BF_UHI, V32BF_FTYPE_V32BF_V32BF_INT_V32BF_USI, V8HF_FTYPE_V8HF_V8HF_INT_V8HF_UQI. (ix86_expand_round_builtin): Handle V8DF_FTYPE_V8DF_V8DF_INT_V8DF_UQI_INT, V32HF_FTYPE_V32HF_V32HF_INT_V32HF_USI_INT, V16HF_FTYPE_V16HF_V16HF_INT_V16HF_UHI_INT, V16SF_FTYPE_V16SF_V16SF_INT_V16SF_UHI_INT. * config/i386/immintrin.h: Include avx10_2-512minmaxintrin.h and avx10_2minmaxintrin.h. * config/i386/sse.md (VFH_AVX10_2): New. (avx10_2_vminmaxnepbf16_<mode><mask_name>): New define_insn. (avx10_2_minmaxp<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_minmaxs<mode><mask_scalar_name><round_saeonly_scalar_name>): Ditto. * config/i386/avx10_2-512minmaxintrin.h: New file. * config/i386/avx10_2minmaxintrin.h: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx512f-helper.h: Add helper function. * gcc.target/i386/avx10-minmax-helper.h: New helper file. * gcc.target/i386/avx10_2-512-minmax-1.c: New test. * gcc.target/i386/avx10_2-512-vminmaxnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxps-2.c: Ditto.
* gcc.target/i386/avx10_2-minmax-1.c: Ditto. * gcc.target/i386/avx10_2-vminmaxnepbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsh-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxss-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto. Co-authored-by: Hu, Lin1 <lin1.hu@intel.com> Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
2024-08-26[PATCH 2/2] AVX10.2: Support saturating convert instructionsHu, Lin131-1/+2830
gcc/ChangeLog: * config/i386/avx10_2-512satcvtintrin.h: Add new intrin. * config/i386/avx10_2satcvtintrin.h: Ditto. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/sse.md (VF1_VF2_AVX10_2): New iterator. (VF2_AVX10_2): Ditto. (VI8_AVX10_2): Ditto. (sat_cvt_sign_prefix): Add new UNSPEC. (UNSPEC_SAT_CVT_DS_SIGN_ITER): New iterator. (pd2dqssuff): Ditto. (avx10_2_vcvtt<castmode>2<sat_cvt_sign_prefix>dqs<mode><mask_name><round_saeonly_name>): New. (avx10_2_vcvttpd2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_vcvttps2<sat_cvt_sign_prefix>qqs<mode><mask_name><round_saeonly_name>): Ditto. (avx10_2_vcvttsd2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>): Ditto. (avx10_2_vcvttss2<sat_cvt_sign_prefix>sis<mode><round_saeonly_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add macros. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/avx10_2-satcvt-1.c: Add test. * gcc.target/i386/avx10_2-512-satcvt-1.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2dqs-2.c: New test. * gcc.target/i386/avx10_2-512-vcvttpd2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttpd2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-512-vcvttps2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttsd2sis-2.c: Ditto. 
* gcc.target/i386/avx10_2-vcvttsd2usis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2sis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2usis-2.c: Ditto.