path: root/gcc
Age  Commit message  Author  Files  Lines
2024-06-03testsuite/115304 - properly guard gcc.dg/vect/slp-gap-1.cRichard Biener1-1/+1
Testing on sparc shows we need vect_unpack and vect_perm. This isn't enough to resolve the GCN fail which ends up using interleaving. PR testsuite/115304 * gcc.dg/vect/slp-gap-1.c: Require vect_unpack and vect_perm.
2024-06-03install.texi (gcn): Fix date of recommended newlib versionTobias Burnus1-1/+1
gcc/ChangeLog: * doc/install.texi (gcn): Fix date of recommended newlib version.
2024-06-03aarch64: adjust enum writeback after renameMarc Poulhiès1-2/+2
gcc/ChangeLog: * config/aarch64/aarch64-ldp-fusion.cc (struct aarch64_pair_fusion): Use new type name.
2024-06-03pair-fusion: fix for older GCCMarc Poulhiès2-5/+5
Older GCCs fail with: .../gcc/pair-fusion.cc: In member function ‘bool pair_fusion_bb_info::fuse_pair(bool, unsigned int, int, rtl_ssa::insn_info*, rtl_ssa::insn_info*, base_cand&, const rtl_ssa::insn_range_info&)’: .../gcc/pair-fusion.cc:1790:40: error: ‘writeback’ is not a class, namespace, or enumeration if (m_pass->should_handle_writeback (writeback::ALL) Renaming the enum type works around the name conflict with the local variable and also prevents future similar conflicts. gcc/ChangeLog: * pair-fusion.h (enum class writeback): Rename to... (enum class writeback_type): ...this. (struct pair_fusion): Adjust type name after renaming. * pair-fusion.cc (pair_fusion_bb_info::track_access): Likewise. (pair_fusion_bb_info::fuse_pair): Likewise. (pair_fusion::process_block): Likewise.
2024-06-03testsuite: Require vect_shift in gcc.dg/vect/pr112325.c [PR115303]Rainer Orth1-0/+1
The new gcc.dg/vect/pr112325.c test FAILs on Solaris/SPARC: FAIL: gcc.dg/vect/pr112325.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1 FAIL: gcc.dg/vect/pr112325.c scan-tree-dump-times vect "vectorized 1 loops" 1 As analyzed in the PR, the test requires vect_shift, so this patch adds that requirement. Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11. 2024-06-03 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: PR tree-optimization/115303 * gcc.dg/vect/pr112325.c: Require vect_shift.
2024-06-03Adjust vector dump scansRichard Biener19-22/+22
The following adjusts dump scanning for something followed by successful vector analysis to more specifically look for 'Analysis succeeded' and not 'Analysis failed' because the previous look for just 'succeeded' or 'failed' is easily confused by SLP discovery dumping those words. * tree-vect-loop.cc (vect_analyze_loop_1): Avoid extra space before 'failed'. * gcc.dg/vect/no-scevccp-outer-7.c: Adjust scanning for succeeded analysis not interrupted by failure. * gcc.dg/vect/no-scevccp-vect-iv-3.c: Likewise. * gcc.dg/vect/vect-cond-reduc-4.c: Likewise. * gcc.dg/vect/vect-live-2.c: Likewise. * gcc.dg/vect/vect-outer-4c-big-array.c: Likewise. * gcc.dg/vect/vect-reduc-dot-s16a.c: Likewise. * gcc.dg/vect/vect-reduc-dot-s8a.c: Likewise. * gcc.dg/vect/vect-reduc-dot-s8b.c: Likewise. * gcc.dg/vect/vect-reduc-dot-u16a.c: Likewise. * gcc.dg/vect/vect-reduc-dot-u16b.c: Likewise. * gcc.dg/vect/vect-reduc-dot-u8a.c: Likewise. * gcc.dg/vect/vect-reduc-dot-u8b.c: Likewise. * gcc.dg/vect/vect-reduc-pattern-1a.c: Likewise. * gcc.dg/vect/vect-reduc-pattern-1b-big-array.c: Likewise. * gcc.dg/vect/vect-reduc-pattern-1c-big-array.c: Likewise. * gcc.dg/vect/vect-reduc-pattern-2a.c: Likewise. * gcc.dg/vect/vect-reduc-pattern-2b-big-array.c: Likewise. * gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c: Likewise.
2024-06-03Avoid ICE with pointer reductionRichard Biener1-8/+7
There's another case where we can refer to neutral_op before eventually converting it from pointer to integer so simply do that unconditionally. * tree-vect-loop.cc (get_initial_defs_for_reduction): Always convert neutral_op.
2024-06-03Add some preference for floating point rtl ifcvt when sse4.1 is not availableliuhongt3-1/+28
Without TARGET_SSE4_1, movdfcc/movsfcc take 3 instructions (pand, pandn and por) and could fail the cost comparison. Increasing the branch cost could hurt performance for other modes, so add a preference specifically for floating point ifcvt. gcc/ChangeLog: PR target/115299 * config/i386/i386.cc (ix86_noce_conversion_profitable_p): Add some preference for floating point ifcvt when SSE4.1 is not available. gcc/testsuite/ChangeLog: * gcc.target/i386/pr115299.c: New test. * gcc.target/i386/pr86722.c: Adjust testcase.
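The pand/pandn/por sequence named in this entry is a mask-based select. A minimal scalar sketch of that bit logic in C follows; the helper names `select_bits` and `select_double` are hypothetical, and the code models only the selection idea, not the instructions GCC actually emits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: take bits of a where mask is 1, bits of b where it is 0. */
static uint64_t select_bits(uint64_t mask, uint64_t a, uint64_t b)
{
    uint64_t keep_a = mask & a;   /* corresponds to pand  */
    uint64_t keep_b = ~mask & b;  /* corresponds to pandn */
    return keep_a | keep_b;       /* corresponds to por   */
}

/* Branch-free "cond ? x : y" on doubles, done on the raw bit patterns. */
double select_double(int cond, double x, double y)
{
    /* All ones when cond is nonzero, all zeros otherwise. */
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    uint64_t xb, yb, rb;
    double r;

    memcpy(&xb, &x, sizeof xb);
    memcpy(&yb, &y, sizeof yb);
    rb = select_bits(mask, xb, yb);
    memcpy(&r, &rb, sizeof r);
    return r;
}
```

Three dependent bitwise operations per select is the cost the entry refers to when it says the conversion may lose the comparison against a branch.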
2024-06-03Add AVX10.1 target_clones supportHaochen Jiang5-5/+26
Since AVX10 is the first major ISA introduced after AVX-512, we propose to add target_clones support for it. Although AVX10.1-256 does not cover the 512-bit part of AVX512F, this is not an issue, because the ordering is used only for priority and not for implication. gcc/ChangeLog: * common/config/i386/i386-common.cc: Change Granite Rapids series CPU type to P_PROC_AVX10_1_512. * common/config/i386/i386-cpuinfo.h (enum feature_priority): Revise comment part. Add P_AVX10_1_256, P_AVX10_1_512, P_PROC_AVX10_1_512. * common/config/i386/i386-isas.h: Link to avx10.1-256, avx10.1-512. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_1-25.c: New test. * gcc.target/i386/avx10_1-26.c: Ditto.
2024-06-03[APX NF] Support APX NF for lzcnt/tzcnt/popcntLingling Kong1-11/+113
gcc/ChangeLog: * config/i386/i386.md (clz<mode>2_lzcnt_nf): New define_insn. (*clz<mode>2_lzcnt_falsedep_nf): Ditto. (<lt_zcnt>_<mode>_nf): Ditto. (*<lt_zcnt>_<mode>_falsedep_nf): Ditto. (<lt_zcnt>_hi<nf_name>): Ditto. (popcount<mode>2_nf): Ditto. (*popcount<mode>2_falsedep_nf): Ditto. (popcounthi2<nf_name>): Ditto.
2024-06-03[APX NF] Support APX NF for mul/divLingling Kong1-17/+30
gcc/ChangeLog: * config/i386/i386.md (*mul<mode>3_1<nf_name>): New define_insn. (*mulqi3_1<nf_name>): Ditto. (*<u>divmod<mode>4_noext_nf): Ditto. (<u>divmodhiqi3<nf_name>): Ditto.
2024-06-03[APX NF] Support APX NF for shld/shrdLingling Kong1-81/+308
gcc/ChangeLog: * config/i386/i386.md (x86_64_shld): New define_insn. (x86_64_shld<nf_name>): Ditto. (x86_64_shld_ndd<nf_name>): Ditto. (x86_64_shld_1<nf_name>): Ditto. (x86_64_shld_ndd_1<nf_name>): Ditto. (*x86_64_shld_shrd_1_nozext_nf): Ditto. (x86_shld<nf_name>): Ditto. (x86_shld_ndd<nf_name>): Ditto. (x86_shld_1<nf_name>): Ditto. (x86_shld_ndd_1<nf_name>): Ditto. (*x86_shld_shrd_1_nozext_nf): Ditto. (<insn><dwi>3_doubleword_lowpart_nf): Ditto. (x86_64_shrd<nf_name>): Ditto. (x86_64_shrd_ndd<nf_name>): Ditto. (x86_64_shrd_1<nf_name>): Ditto. (x86_64_shrd_ndd_1<nf_name>): Ditto. (*x86_64_shrd_shld_1_nozext_nf): Ditto. (x86_shrd<nf_name>): Ditto. (x86_shrd_ndd<nf_name>): Ditto. (x86_shrd_1<nf_name>): Ditto. (x86_shrd_ndd_1<nf_name>): Ditto. (*x86_shrd_shld_1_nozext_nf): Ditto.
2024-06-03[APX NF] Support APX NF for rotate insnsLingling Kong2-21/+43
gcc/ChangeLog: * config/i386/i386.md (ashr<mode>3_cvt<nf_name>): New define_insn. (*<insn><mode>3_1<nf_name>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-nf.c: Add test.
2024-06-03[APX NF] Support APX NF for right shift insnsLingling Kong1-36/+46
gcc/ChangeLog: * config/i386/i386.md (*ashr<mode>3_1<nf_name>): New define_insn. (*lshr<mode>3_1<nf_name>): Ditto. (*lshrqi3_1<nf_name>): Ditto. (*lshrhi3_1<nf_name>): Ditto.
2024-06-03[APX NF] Support APX NF for left shift insnsLingling Kong2-26/+83
gcc/ChangeLog: * config/i386/i386.md (*ashl<mode>3_1<nf_name>): New define_insn. (*ashlhi3_1<nf_name>): Ditto. (*ashlqi3_1<nf_name>): Ditto. * config/i386/sse.md: New define_split.
2024-06-03[APX NF] Support APX NF for {sub/and/or/xor/neg}Lingling Kong3-82/+114
gcc/ChangeLog: * config/i386/i386.md (nf_nonf_attr): New subst_attr. (nf_nonf_x64_attr): Ditto. (*sub<mode>_1<nf_name>): New define_insn. (*anddi_1<nf_name>): Ditto. (*and<mode>_1<nf_name>): Ditto. (*andqi_1<nf_name>): Ditto. (*<code><mode>_1<nf_name>): Ditto. (*<code>qi_1<nf_name>): Ditto. (*neg<mode>_1<nf_name>): Ditto. * config/i386/sse.md: New define_split. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-nf.c: New test.
2024-06-03[APX NF] Support APX NF addLingling Kong5-46/+100
The APX NF (no flags) feature suppresses the update of status flags for arithmetic operations. For NF add, it is not clear whether nf add can be faster than lea. If so, the pattern needs to be adjusted to prefer lea generation. gcc/ChangeLog: * config/i386/i386-opts.h (enum apx_features): Add nf enumeration. * config/i386/i386.h (TARGET_APX_NF): New. * config/i386/i386.md (nf_name): New subst_attr. (nf_prefix): Ditto. (nf_condition): Ditto. (nf_mem_constraint): Ditto. (nf_applied): Ditto. (nf_subst): Add new define_subst. (*add<mode>_1<nf_name>): New define_insn. (*addhi_1<nf_name>): Ditto. (*addqi_1<nf_name>): Ditto. * config/i386/i386.opt: Add apx_nf enumeration. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-ndd.c: Fixed test. Co-authored-by: Hongyu Wong <hongyu.wang@intel.com>
2024-06-03i386: Optimize EQ/NE comparison between avx512 kmask and -1.Hu, Lin13-0/+422
Achieve EQ/NE comparison between avx512 kmask and -1 by using kortest and checking CF. gcc/ChangeLog: PR target/113609 * config/i386/sse.md (*kortest_cmp<mode>_setcc): New define_insn_and_split. (*kortest_cmp<mode>_jcc): Ditto. gcc/testsuite/ChangeLog: PR target/113609 * gcc.target/i386/pr113609-1.c: New test. * gcc.target/i386/pr113609-2.c: Ditto.
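Why a carry-flag check answers "is this mask equal to -1" can be illustrated with a scalar model. The sketch below assumes the documented behaviour of the 16-bit kortest instruction, which sets CF when the OR of its two mask operands is all ones; the function names are hypothetical and nothing here is taken from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Scalar model of the carry flag after a 16-bit kortest:
   CF is set when (k1 | k2) has every bit set. */
bool kortest16_cf(uint16_t k1, uint16_t k2)
{
    return (uint16_t)(k1 | k2) == UINT16_MAX;
}

/* "mask == -1" becomes: kortest the mask with itself, then read CF. */
bool mask16_is_all_ones(uint16_t k)
{
    return kortest16_cf(k, k);
}
```

An NE comparison is the same test with the flag sense inverted.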
2024-06-03Daily bump.GCC Administrator3-1/+12
2024-06-02Fix PR c++/109958: ICE taking the address of bound static member function brought into derived class by using-declarationSimon Martin2-0/+10
We currently ICE upon the following because we don't properly handle the overload created for B::f through the using statement. === cut here === struct B { static int f(); }; struct D : B { using B::f; }; void f(D d) { &d.f; } === cut here === This patch makes build_class_member_access_expr and cp_build_addr_expr_1 handle such overloads, and fixes the PR. Successfully tested on x86_64-pc-linux-gnu. PR c++/109958 gcc/cp/ChangeLog: * typeck.cc (build_class_member_access_expr): Handle single OVERLOADs. (cp_build_addr_expr_1): Likewise. gcc/testsuite/ChangeLog: * g++.dg/overload/using6.C: New test.
2024-06-02Daily bump.GCC Administrator5-1/+135
2024-06-01analyzer: detect -Wanalyzer-allocation-size at call stmts [PR106203]David Malcolm15-42/+216
gcc/analyzer/ChangeLog: PR analyzer/106203 * checker-event.h: Include "analyzer/event-loc-info.h". (struct event_loc_info): Move to its own header file. * diagnostic-manager.cc (diagnostic_manager::emit_saved_diagnostic): Move creation of event_loc_info here from add_final_event, and if we have a stmt_finder, call its update_event_loc_info method. * engine.cc (leak_stmt_finder::update_event_loc_info): New. (exploded_node::detect_leaks): Likewise. (exploded_node::detect_leaks): Pass nullptr as call_stmt arg to region_model::pop_frame. * event-loc-info.h: New file, with content taken from checker-event.h. * exploded-graph.h (stmt_finder::update_event_loc_info): New pure virtual function. * infinite-loop.cc (infinite_loop_diagnostic::add_final_event): Update for change to vfunc signature. * infinite-recursion.cc (infinite_recursion_diagnostic::add_final_event): Likewise. * pending-diagnostic.cc (pending_diagnostic::add_final_event): Pass in the event_loc_info from the caller, rather than generating it from a gimple stmt and enode. * pending-diagnostic.h (pending_diagnostic::add_final_event): Likewise. * region-model.cc (region_model::on_longjmp): Pass nullptr as call_stmt arg to region_model::pop_frame. (region_model::update_for_return_gcall): Likewise, but pass call_stmt. (class caller_context): New. (region_model::pop_frame): Add "call_stmt" argument. Use it and the frame_region with a caller_context when setting result_dst_reg's value so that any diagnostic is reported at the call stmt in the caller. (selftest::test_stack_frames): Pass nullptr as call_stmt arg to region_model::pop_frame. (selftest::test_alloca): Likewise. * region-model.h (region_model::pop_frame): Add "call_stmt" argument. gcc/testsuite/ChangeLog: PR analyzer/106203 * c-c++-common/analyzer/allocation-size-1.c (test_9): Remove xfail. * c-c++-common/analyzer/allocation-size-2.c (test_8): Likewise. * gcc.dg/analyzer/allocation-size-multiline-4.c: New test. 
* gcc.dg/plugin/analyzer_cpython_plugin.c (refcnt_stmt_finder::update_event_loc_info): New. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-06-01AVR: target/115317 - Make isinf(-Inf) return -1.Georg-Johann Lay1-0/+55
PR target/115317 libgcc/config/avr/libf7/ * libf7-asm.sx (__isinf): Map -Inf to -1. gcc/testsuite/ * gcc.target/avr/torture/pr115317-isinf.c: New test.
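The behaviour this fix targets, a sign-aware infinity test, can be sketched in portable C. The name `isinf_sign` is hypothetical and this is not the libf7 assembly routine, only an illustration of the intended result mapping.

```c
#include <math.h>

/* Returns +1 for +Inf, -1 for -Inf, and 0 for everything else
   (finite values and NaN). */
int isinf_sign(double x)
{
    if (!isinf(x))
        return 0;
    return signbit(x) ? -1 : 1;
}
```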
2024-06-01AVR: tree-optimization/115307 - Work around isinf bloat from early passes.Georg-Johann Lay2-0/+37
PR tree-optimization/115307 gcc/ * config/avr/avr.md (SFDF): New mode iterator. (isinf<mode>2) [sf, df]: New expanders. gcc/testsuite/ * gcc.target/avr/torture/pr115307-isinf.c: New test.
2024-05-31[to-be-committed] [RISC-V] Use Zbkb for general 64 bit constants when profitableJeff Law2-16/+110
Basically this adds the ability to generate two independent constants during synthesis, then bring them together with a pack instruction. Thus we never need to go out to the constant pool when zbkb is enabled. The worst sequence we ever generate is lui+addi+lui+addi+pack. Obviously if either half can be synthesized with just a lui or just an addi, then we'll DTRT automagically. So for example: unsigned long foo_0xf857f2def857f2de(void) { return 0x1425000028000000; } The high and low halves are just a lui. So the final synthesis is: > li a5,671088640 # 15 [c=4 l=4] *movdi_64bit/1 > li a0,337969152 # 16 [c=4 l=4] *movdi_64bit/1 > pack a0,a5,a0 # 17 [c=12 l=4] riscv_xpack_di_si_2 On the implementation side, I think the bits I've put in here likely can be used to handle the repeating constant case for !zbkb. I think it likely could be used to help capture cases where the upper half can be derived from the lower half (say by turning a bit on or off, shifting or something similar). The key in both of these cases is we need a temporary register holding an intermediate value. Ventana's internal tester enables zbkb, but I don't think any of the other testers currently exercise zbkb. We'll probably want to change that at some point, but I don't think it's super-critical yet. While I can envision a few more cases where we could improve constant synthesis, I have no immediate plans to work in this space, but if someone is interested, some thoughts are recorded here: > https://wiki.riseproject.dev/display/HOME/CT_00_031+--+Additional+Constant+Synthesis+Improvements gcc/ * config/riscv/riscv.cc (riscv_integer_op): Add new field. (riscv_build_integer_1): Initialize the new field. (riscv_build_integer): Recognize more cases where Zbkb's pack instruction is profitable. (riscv_move_integer): Loop over all the codes. If requested, save the current constant into a temporary. Generate pack for more cases using the saved constant. gcc/testsuite * gcc.target/riscv/synthesis-10.c: New test.
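The split-and-pack idea from this entry can be modelled in C. This is a sketch under the assumption that RV64 pack places the low 32 bits of its first source in the low half of the result and the low 32 bits of its second source in the high half; `pack64` and `synthesize_with_pack` are hypothetical names, not functions from riscv.cc.

```c
#include <stdint.h>

/* Model of RV64 Zbkb "pack rd, rs1, rs2":
   rd = (low 32 bits of rs2) << 32 | (low 32 bits of rs1). */
uint64_t pack64(uint64_t rs1, uint64_t rs2)
{
    return (rs2 << 32) | (rs1 & 0xffffffffu);
}

/* Build a 64-bit constant from two independently synthesized halves. */
uint64_t synthesize_with_pack(uint64_t value)
{
    uint64_t lo = value & 0xffffffffu;  /* first li in the listing  */
    uint64_t hi = value >> 32;          /* second li in the listing */
    return pack64(lo, hi);
}
```

For the constant in the entry, 0x1425000028000000, the halves are 0x28000000 (671088640) and 0x14250000 (337969152), which match the two li immediates in the generated sequence.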
2024-06-01c++/modules: Fix revealing with using-decls [PR114867]Nathaniel Shead9-57/+161
This patch fixes a couple issues with the current handling of revealing declarations with using-decls. Firstly, doing 'remove_node' when handling function overload sets is not safe, because it not only mutates the OVERLOAD we're walking over but potentially any other references to this OVERLOAD that are cached from phase-1 template lookup. This causes the attached using-17 testcase to fail because the overload set in 'X::test()' no longer contains the 'ns::f(T)' template once instantiated at the end of the file. This patch works around this by simply not removing the old declaration. This does make the overload list potentially longer than it otherwise would have been, but only when re-exporting the same set of functions in a using-decl. Additionally, because 'ovl_insert' always prepends these newly inserted overloads, repeated exported using-decls won't continue to add declarations, as the first exported using-decl will be found before the original (unexported) declaration. Another, related, issue is that using-decls of GMF entities currently doesn't mark them as reachable unless they are also exported, and thus they may not be available in e.g. module implementation units. We solve this with a new flag on OVERLOADs set when they are declared within the module purview. This starts to run into the more general issue of handling using-decls of non-functions (see e.g. PR114863) but by just marking such GMF entities as purview we can work around this for now. This also allows us to get rid of the special-casing of exported using-decls in 'add_binding_entity', which was incorrect anyway: a non-exported using-decl still needs to be emitted anyway if it lives in the module purview, even if referring to a non-purview item. PR c++/114867 gcc/cp/ChangeLog: * cp-tree.h (OVL_PURVIEW_P): New. (ovl_iterator::purview_p): New. * module.cc (depset::hash::add_binding_entity): Only ignore entities not within module purview. 
Set OVL_PURVIEW_P on new OVERLOADs for emitted declarations. (module_state::read_cluster): Imported using-decls are always in purview, mark as OVL_PURVIEW_P. * name-lookup.h (enum WMB_Flags): New WMB_Purview flag. * name-lookup.cc (walk_module_binding): Set WMB_Purview as needed. (do_nonmember_using_decl): Don't remove from existing OVERLOADs. Also reveal non-exported decls. Also reveal 'extern "C"' decls. Add workaround to reveal non-function decls. * tree.cc (ovl_insert): Adjust to also set OVL_PURVIEW_P when needed. gcc/testsuite/ChangeLog: * g++.dg/modules/using-17_a.C: New test. * g++.dg/modules/using-17_b.C: New test. * g++.dg/modules/using-18_a.C: New test. * g++.dg/modules/using-18_b.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-06-01vect: Bind input vectype to lane-reducing operationFeng Xue1-10/+13
The input vectype is an attribute of a lane-reducing operation, rather than of the reduction PHI it is associated with, since there might be more than one lane-reducing operation with different types in a loop reduction chain. So bind each lane-reducing operation to its own input type. 2024-05-29 Feng Xue <fxue@os.amperecomputing.com> gcc/ * tree-vect-loop.cc (vect_is_emulated_mixed_dot_prod): Remove parameter loop_vinfo. Get input vectype from stmt_info instead of reduction PHI. (vect_model_reduction_cost): Remove loop_vinfo argument of call to vect_is_emulated_mixed_dot_prod. (vect_transform_reduction): Likewise. (vectorizable_reduction): Likewise, and bind input vectype to lane-reducing operation.
2024-06-01vect: Split out partial vect checking for reduction into a functionFeng Xue1-60/+77
Partial vectorization checking for vectorizable_reduction is a piece of relatively isolated code, which may be reused by other places. Move the code into a new function for sharing. 2024-05-29 Feng Xue <fxue@os.amperecomputing.com> gcc/ * tree-vect-loop.cc (vect_reduction_update_partial_vector_usage): New function. (vectorizable_reduction): Move partial vectorization checking code to vect_reduction_update_partial_vector_usage.
2024-06-01vect: Add a function to check lane-reducing codeFeng Xue3-20/+19
Checking whether an operation is lane-reducing requires comparing its code against three kinds (DOT_PROD_EXPR/WIDEN_SUM_EXPR/SAD_EXPR). Add a utility function to make the check handy and concise. 2024-05-29 Feng Xue <fxue@os.amperecomputing.com> gcc/ * tree-vectorizer.h (lane_reducing_op_p): New function. * tree-vect-slp.cc (vect_analyze_slp): Use new function lane_reducing_op_p to check statement code. * tree-vect-loop.cc (vect_transform_reduction): Likewise. (vectorizable_reduction): Likewise, and change name of a local variable that holds the result flag.
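The shape of such a predicate can be sketched as follows. The enum here is a stand-in invented for illustration (GCC's real tree_code has many more values and lives in its own headers), and `sketch_lane_reducing_op_p` is a hypothetical name, not the function added to tree-vectorizer.h.

```c
#include <stdbool.h>

/* Stand-in for GCC's tree_code, holding only the values this sketch needs. */
enum sketch_code {
    SK_PLUS_EXPR,
    SK_DOT_PROD_EXPR,
    SK_WIDEN_SUM_EXPR,
    SK_SAD_EXPR
};

/* True for the three operation kinds the entry calls lane-reducing. */
static inline bool sketch_lane_reducing_op_p(enum sketch_code code)
{
    return code == SK_DOT_PROD_EXPR
           || code == SK_WIDEN_SUM_EXPR
           || code == SK_SAD_EXPR;
}
```

Callers then write a single call instead of repeating the three-way comparison at each site.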
2024-06-01Daily bump.GCC Administrator5-1/+383
2024-05-31xtensa: Prepend "(use A0_REG)" to sibling call CALL_INSN_FUNCTION_USAGE instead of emitting it as insn at the end of epilogueTakayuki 'January June' Suwa3-14/+19
No functional changes. gcc/ChangeLog: * config/xtensa/xtensa-protos.h (xtensa_expand_call): Add the third argument as boolean. (xtensa_expand_epilogue): Remove the first argument. * config/xtensa/xtensa.cc (xtensa_expand_call): Add the third argument "sibcall_p", and modify in order to prepend "(use A0_REG)" to CALL_INSN_FUNCTION_USAGE if the argument is true. (xtensa_expand_epilogue): Remove the first argument "sibcall_p" and its conditional clause. * config/xtensa/xtensa.md (call, call_value, sibcall, sibcall_value): Append a boolean value to the argument of xtensa_expand_call() indicating whether it is sibling call or not. (epilogue): Remove the boolean argument from xtensa_expand_epilogue(), and then append emitting "(return)". (sibcall_epilogue): Remove the boolean argument from xtensa_expand_epilogue().
2024-05-31xtensa: Simplify several MD templatesTakayuki 'January June' Suwa4-100/+43
No functional changes. gcc/ChangeLog: * config/xtensa/predicates.md (subreg_HQI_lowpart_operator, xtensa_sminmax_operator): New operator predicates. * config/xtensa/xtensa-protos.h (xtensa_match_CLAMPS_imms_p): Remove. * config/xtensa/xtensa.cc (xtensa_match_CLAMPS_imms_p): Ditto. * config/xtensa/xtensa.md (*addsubx, *extzvsi-1bit_ashlsi3, *extzvsi-1bit_addsubx): Revise the output statements by conditional ternary operator rather than switch-case clause in order to avoid using gcc_unreachable(). (xtensa_clamps): Reduce to a single pattern definition using the predicate added above. (Some split patterns to assist *masktrue_const_bitcmpl): Ditto.
2024-05-31RISC-V: Remove dead perm series code and document.Robin Dapp1-22/+4
With the introduction of shuffle_series_patterns the explicit handler code for a perm series is dead. This patch removes it and also adds a function-level comment to shuffle_series_patterns. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Document. (shuffle_extract_and_slide1up_patterns): Remove.
2024-05-31RISC-V: Add vector popcount, clz, ctz.Robin Dapp15-81/+272
This patch adds the zvbb vcpop, vclz and vctz to the autovec machinery as well as tests for them. gcc/ChangeLog: * config/riscv/autovec.md (ctz<mode>2): New expander. (clz<mode>2): Ditto. * config/riscv/generic-vector-ooo.md: Add bitmanip ops to insn reservation. * config/riscv/vector-crypto.md: Add VLS modes to insns. * config/riscv/vector.md: Add bitmanip ops to mode_idx and other attributes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/unop/popcount-1.c: Adjust check for zvbb. * gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/popcount-2.c: Ditto. * gcc.target/riscv/rvv/autovec/unop/popcount-3.c: New test. * gcc.target/riscv/rvv/autovec/unop/popcount-template.h: New test. * gcc.target/riscv/rvv/autovec/unop/clz-1.c: New test. * gcc.target/riscv/rvv/autovec/unop/clz-run.c: New test. * gcc.target/riscv/rvv/autovec/unop/clz-template.h: New test. * gcc.target/riscv/rvv/autovec/unop/ctz-1.c: New test. * gcc.target/riscv/rvv/autovec/unop/ctz-run.c: New test. * gcc.target/riscv/rvv/autovec/unop/ctz-template.h: New test.
2024-05-31RISC-V: Add vandn combine helper.Robin Dapp5-1/+119
This patch adds a combine pattern for vandn as well as tests for it. gcc/ChangeLog: * config/riscv/autovec-opt.md (*vandn_<mode>): New pattern. * config/riscv/vector.md: Add vandn to mode_idx. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vandn-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vandn-run.c: New test. * gcc.target/riscv/rvv/autovec/binop/vandn-template.h: New test.
2024-05-31RISC-V: Use widening shift for scatter/gather if applicable.Robin Dapp5-18/+193
With the zvbb extension we can emit a widening shift for scatter/gather index preparation in case we need to multiply by 2 and zero extend. The patch also adds vwsll to the mode_idx attribute and removes the mode from shift-count operand of the insn pattern. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_gather_scatter): Use vwsll if applicable. * config/riscv/vector-crypto.md: Remove mode from vwsll shift count operator. * config/riscv/vector.md: Add vwsll to mode iterator. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add zvbb. * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-12-zvbb.c: New test.
2024-05-31RISC-V: Add vwsll combine helpers.Robin Dapp5-3/+251
This patch enables the usage of vwsll in autovec context by adding the necessary combine patterns and tests. gcc/ChangeLog: * config/riscv/autovec-opt.md (*vwsll_zext1_<mode>): New pattern. (*vwsll_zext2_<mode>): Ditto. (*vwsll_zext1_scalar_<mode>): Ditto. (*vwsll_zext1_trunc_<mode>): Ditto. (*vwsll_zext2_trunc_<mode>): Ditto. (*vwsll_zext1_trunc_scalar_<mode>): Ditto. * config/riscv/vector-crypto.md: Make pattern similar to other narrowing/widening patterns. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vwsll-1.c: New test. * gcc.target/riscv/rvv/autovec/binop/vwsll-run.c: New test. * gcc.target/riscv/rvv/autovec/binop/vwsll-template.h: New test.
2024-05-31RISC-V: Split vwadd.wx and vwsub.wx and add helpers.Robin Dapp4-32/+128
vwadd.wx and vwsub.wx have the same problem vfwadd.wf had. This patch splits the insn pattern in the same way vfwadd.wf was split. It also adds two patterns to recognize extended scalars. In practice those do not provide a lot of improvement over what we already have but in some instances we can get rid of redundant extensions. gcc/ChangeLog: * config/riscv/vector.md: Split vwadd.wx/vwsub.wx pattern and add extended_scalar patterns. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr115068.c: Add vwadd.wx/vwsub.wx tests. * gcc.target/riscv/rvv/base/pr115068-run.c: Include pr115068.c. * gcc.target/riscv/rvv/base/vwaddsub-1.c: New test.
2024-05-31RISC-V: Do not allow v0 as dest when merging [PR115068].Robin Dapp3-10/+67
This patch splits the vfw...wf pattern so we do not emit e.g. vfwadd.wf v0,v8,fa5,v0.t anymore. gcc/ChangeLog: PR target/115068 * config/riscv/vector.md: Split vfw<insn>.wf pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr115068-run.c: New test. * gcc.target/riscv/rvv/base/pr115068.c: New test.
2024-05-31aarch64: testsuite: Explicitly add -mlittle-endian to vget_low_2.cPengxuan Zheng1-1/+1
vget_low_2.c is a test case for little-endian, but we missed the -mlittle-endian flag in r15-697-ga2e4fe5a53cf75. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vget_low_2.c: Add -mlittle-endian. Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
2024-05-31Add the 6th argument to .ACCESS_WITH_SIZEQing Zhao4-9/+66
to carry the TYPE of the flexible array. Such information is needed during tree-object-size.cc. We cannot use the result type or the type of the 1st argument of the routine .ACCESS_WITH_SIZE to decide the element type of the original array due to possible type casting in the source code. gcc/c/ChangeLog: * c-typeck.cc (build_access_with_size_for_counted_by): Add the 6th argument to .ACCESS_WITH_SIZE. gcc/ChangeLog: * tree-object-size.cc (access_with_size_object_size): Use the type of the 6th argument for the type of the element. * internal-fn.cc (expand_ACCESS_WITH_SIZE): Update the comment with the 6th argument. gcc/testsuite/ChangeLog: * gcc.dg/flex-array-counted-by-6.c: New test.
2024-05-31Use the .ACCESS_WITH_SIZE in bound sanitizer.Qing Zhao5-0/+201
gcc/c-family/ChangeLog: * c-ubsan.cc (get_bound_from_access_with_size): New function. (ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE. gcc/testsuite/ChangeLog: * gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test. * gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test. * gcc.dg/ubsan/flex-array-counted-by-bounds-4.c: New test. * gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
2024-05-31Use the .ACCESS_WITH_SIZE in builtin object size.Qing Zhao5-0/+360
gcc/ChangeLog: * tree-object-size.cc (access_with_size_object_size): New function. (call_object_size): Call the new function. gcc/testsuite/ChangeLog: * gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT. * gcc.dg/flex-array-counted-by-3.c: New test. * gcc.dg/flex-array-counted-by-4.c: New test. * gcc.dg/flex-array-counted-by-5.c: New test.
2024-05-31Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.Qing Zhao8-7/+328
Including the following changes: * The definition of the new internal function .ACCESS_WITH_SIZE in internal-fn.def. * C FE converts every reference to a FAM with a "counted_by" attribute to a call to the internal function .ACCESS_WITH_SIZE. (build_component_ref in c-typeck.cc) This includes the case when the object is statically allocated and initialized. To make this work, the routine digest_init in c-typeck.cc is updated to fold calls to .ACCESS_WITH_SIZE to its first argument when require_constant is TRUE. However, for the reference inside "offsetof", the "counted_by" attribute is ignored since it's not useful at all. (c_parser_postfix_expression in c/c-parser.cc) In addition to "offsetof", for the reference inside operator "typeof" and "alignof", we ignore counted_by attribute too. When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE, replace the call with its first argument. * Convert every call to .ACCESS_WITH_SIZE to its first argument. (expand_ACCESS_WITH_SIZE in internal-fn.cc) * Provide the utility routines to check the call is .ACCESS_WITH_SIZE and get the reference from the call to .ACCESS_WITH_SIZE. (is_access_with_size_p and get_ref_from_access_with_size in tree.cc) gcc/c/ChangeLog: * c-parser.cc (c_parser_postfix_expression): Ignore the counted-by attribute when build_component_ref inside offsetof operator. * c-tree.h (build_component_ref): Add one more parameter. * c-typeck.cc (build_counted_by_ref): New function. (build_access_with_size_for_counted_by): New function. (build_component_ref): Check the counted-by attribute and build call to .ACCESS_WITH_SIZE. (build_unary_op): When building ADDR_EXPR for .ACCESS_WITH_SIZE, use its first argument. (lvalue_p): Accept call to .ACCESS_WITH_SIZE. (digest_init): Fold call to .ACCESS_WITH_SIZE to its first argument when require_constant is TRUE. gcc/ChangeLog: * internal-fn.cc (expand_ACCESS_WITH_SIZE): New function. * internal-fn.def (ACCESS_WITH_SIZE): New internal function. 
* tree.cc (is_access_with_size_p): New function. (get_ref_from_access_with_size): New function. * tree.h (is_access_with_size_p): New prototype. (get_ref_from_access_with_size): New prototype. gcc/testsuite/ChangeLog: * gcc.dg/flex-array-counted-by-2.c: New test.
2024-05-31Provide counted_by attribute to flexible array member fieldQing Zhao10-21/+444
'counted_by (COUNT)' The 'counted_by' attribute may be attached to the C99 flexible array member of a structure. It indicates that the number of the elements of the array is given by the field "COUNT" in the same structure as the flexible array member. GCC may use this information to improve detection of object size information for such structures and provide better results in compile-time diagnostics and runtime features like the array bound sanitizer and the '__builtin_dynamic_object_size'. For instance, the following code: struct P { size_t count; char other; char array[] __attribute__ ((counted_by (count))); } *p; specifies that the 'array' is a flexible array member whose number of elements is given by the field 'count' in the same structure. The field that represents the number of the elements should have an integer type. Otherwise, the compiler reports an error and ignores the attribute. When the field that represents the number of the elements is assigned a negative integer value, the compiler treats the value as zero. An explicit 'counted_by' annotation defines a relationship between two objects, 'p->array' and 'p->count', and there are the following requirements on the relationship between this pair: * 'p->count' must be initialized before the first reference to 'p->array'; * 'p->array' has _at least_ 'p->count' number of elements available all the time. This relationship must hold even after any of these related objects are updated during the program. It's the user's responsibility to make sure the above requirements are kept all the time. Otherwise the compiler reports warnings and, at the same time, the results of the array bound sanitizer and the '__builtin_dynamic_object_size' are undefined. One important feature of the attribute is that a reference to the flexible array member field uses the latest value assigned to the field that represents the number of the elements before that reference. 
For example: p->count = val1; p->array[20] = 0; // ref1 to p->array p->count = val2; p->array[30] = 0; // ref2 to p->array In the above, 'ref1' uses 'val1' as the number of the elements in 'p->array', and 'ref2' uses 'val2' as the number of elements in 'p->array'. gcc/c-family/ChangeLog: * c-attribs.cc (handle_counted_by_attribute): New function. (attribute_takes_identifier_p): Add counted_by attribute to the list. * c-common.cc (c_flexible_array_member_type_p): ...To this. * c-common.h (c_flexible_array_member_type_p): New prototype. gcc/c/ChangeLog: * c-decl.cc (flexible_array_member_type_p): Renamed and moved to... (add_flexible_array_elts_to_size): Use renamed function. (is_flexible_array_member_p): Use renamed function. (verify_counted_by_attribute): New function. (finish_struct): Use renamed function and verify counted_by attribute. * c-tree.h (lookup_field): New prototype. * c-typeck.cc (lookup_field): Expose as extern function. (tagged_types_tu_compatible_p): Check counted_by attribute for structure type. gcc/ChangeLog: * doc/extend.texi: Document attribute counted_by. gcc/testsuite/ChangeLog: * gcc.dg/flex-array-counted-by.c: New test. * gcc.dg/flex-array-counted-by-7.c: New test. * gcc.dg/flex-array-counted-by-8.c: New test.
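The two requirements on 'p->count' and 'p->array' can be illustrated with a small sketch. It reuses 'struct P' from the commit message; the 'COUNTED_BY' macro and the helpers 'p_alloc', 'p_resize' and 'p_set' are hypothetical names introduced only for this illustration and are not part of the patch. The macro expands to nothing on compilers that do not report the attribute, which is an assumption made here for portability.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical portability wrapper: use counted_by only where the
   compiler reports support for it.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(counted_by)
# define COUNTED_BY(member) __attribute__ ((counted_by (member)))
#else
# define COUNTED_BY(member)
#endif

struct P {
  size_t count;
  char other;
  char array[] COUNTED_BY (count);
};

/* Allocate room for N elements and set 'count' before 'array' is
   referenced for the first time.  */
struct P *p_alloc (size_t n)
{
  struct P *p = malloc (sizeof *p + n * sizeof p->array[0]);
  if (!p)
    return NULL;
  p->count = n;
  p->other = 0;
  return p;
}

/* Resize so that 'array' always has at least 'count' elements:
   lower 'count' before shrinking, raise it only after growing.  */
struct P *p_resize (struct P *p, size_t n)
{
  if (n < p->count)
    p->count = n;
  struct P *q = realloc (p, sizeof *q + n * sizeof q->array[0]);
  if (!q)
    return NULL;
  q->count = n;
  return q;
}

/* Store V at index I; the check uses the latest value of 'count'.
   Returns 0 on success and -1 if I is out of bounds.  */
int p_set (struct P *p, size_t i, char v)
{
  if (i >= p->count)
    return -1;
  p->array[i] = v;
  return 0;
}
```

With 'p_alloc (21)', a store to index 20 is in bounds and a store to index 30 is rejected; after 'p_resize (p, 31)' the store to index 30 succeeds, mirroring 'ref1' and 'ref2' above.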
2024-05-31alpha: Fix invalid RTX in divmodsi insn patterns [PR115297]Uros Bizjak3-10/+26
any_divmod instructions are modelled with invalid RTX: [(set (match_operand:DI 0 "register_operand" "=c") (sign_extend:DI (match_operator:SI 3 "divmod_operator" [(match_operand:DI 1 "register_operand" "a") (match_operand:DI 2 "register_operand" "b")]))) (clobber (reg:DI 23)) (clobber (reg:DI 28))] where SImode divmod_operator (div,mod,udiv,umod) has DImode operands. Wrap input operand with truncate:SI to make machine modes consistent. PR target/115297 gcc/ChangeLog: * config/alpha/alpha.md (<any_divmod:code>si3): Wrap DImode operands 3 and 4 with truncate:SI RTX. (*divmodsi_internal_er): Ditto for operands 1 and 2. (*divmodsi_internal_er_1): Ditto. (*divmodsi_internal): Ditto. * config/alpha/constraints.md ("b"): Correct register number in the description. gcc/testsuite/ChangeLog: * gcc.target/alpha/pr115297.c: New test.
2024-05-31nvptx target: Global constructor, destructor support, via nvptx-tools 'ld'Thomas Schwinge3-4/+15
The function attributes 'constructor', 'destructor', and 'init_priority' now work, as do the C++ features making use of this. Test cases with effective target 'global_constructor' and 'init_priority' now generally work, and 'check-gcc-c++' test results greatly improve; no more "sorry, unimplemented: global constructors not supported on this target". For proper execution test results, this depends on <https://github.com/SourceryTools/nvptx-tools/commit/96f8fc59a757767b9e98157d95c21e9fef22a93b> "ld: Global constructor/destructor support". gcc/ * config/nvptx/nvptx.h: Configure global constructor, destructor support. gcc/testsuite/ * gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is 'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may appear in identifiers. * lib/target-supports.exp (check_effective_target_global_constructor): Enable for nvptx. libgcc/ * config/nvptx/crt0.c (__gbl_ctors): New weak function. (__main): Invoke it. * config/nvptx/gbl-ctors.c: New. * config/nvptx/t-nvptx: Configure global constructor, destructor support.
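For readers unfamiliar with the attributes this commit enables, a minimal C sketch of 'constructor' and 'destructor' follows. The names 'init_log', 'init_seen', 'init_early', 'init_late' and 'fini' are hypothetical and not taken from the patch or its test cases. The sketch relies only on the documented GCC behavior that constructors with a lower priority number run first, and that priorities up to 100 are reserved for the implementation.

```c
#include <stdio.h>

/* Records the order in which the constructors below ran.  */
static int init_log[2];
static int init_seen;

/* Runs before main; priority 101 is the lowest user-available value,
   so this constructor runs before the one with priority 102.  */
__attribute__ ((constructor (101)))
static void init_early (void)
{
  init_log[init_seen++] = 101;
}

__attribute__ ((constructor (102)))
static void init_late (void)
{
  init_log[init_seen++] = 102;
}

/* Runs after main returns or exit is called.  */
__attribute__ ((destructor))
static void fini (void)
{
  fputs ("fini: ran at program exit\n", stderr);
}
```

By the time 'main' starts, 'init_seen' is 2 and 'init_log' holds 101 followed by 102. On nvptx, per the message above, seeing this at execution time also requires an nvptx-tools 'ld' that includes the referenced commit.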
2024-05-31tree-optimization/115278 - fix DSE in if-conversion wrt volatilesRichard Biener2-1/+41
The following adds the missing guard for volatile stores to the embedded DSE in the loop if-conversion pass. PR tree-optimization/115278 * tree-if-conv.cc (ifcvt_local_dce): Do not DSE volatile stores. * g++.dg/vect/pr115278.cc: New testcase.
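As a rough illustration of why the guard matters (this is not the test case from the PR, and 'write_status' is a hypothetical name), consider a loop in which the first store looks dead on the iterations where the condition holds. Because the destination is volatile, every store is an observable access and must be preserved, so a dead-store-elimination step may not remove the first one.

```c
#include <stddef.h>

/* Each iteration clears the volatile location, then conditionally
   overwrites it.  For a non-volatile destination the first store
   would be dead whenever the condition is true; for a volatile one
   both stores must be emitted.  */
void write_status (volatile unsigned *reg, const unsigned *in, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      *reg = 0;
      if (in[i] > 10)
        *reg = in[i];
    }
}
```

The final value alone does not show whether the intermediate stores happened; checking that requires inspecting the generated code or a dump, which is what a compile-time test would do.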
2024-05-31fix: valid compiler optimization may fail the testMarc Poulhiès1-0/+12
cxa4001 may fail with "Exception not raised" when the compiler omits the calls to To_Mapping, in accordance with 10.2.1(18/3): "If a library unit is declared pure, then the implementation is permitted to omit a call on a library-level subprogram of the library unit if the results are not needed after the call" Using the result of both To_Mapping calls prevents the compiler from omitting them. "The corrected test will be available on the ACAA web site (http://www.ada-auth.org/), and will be issued with the Modified Tests List version 2.6K, 3.1DD, and 4.1GG." gcc/testsuite/ChangeLog: * ada/acats/tests/cxa/cxa4001.a: Use function result.
2024-05-31build: Include minor version in config.gcc unsupported messageRainer Orth1-2/+2
It has been pointed out to me that when moving Solaris 11.3 from config.gcc's obsolete to unsupported list, I'd forgotten to also move the minor version info, leading to confusing *** Configuration i386-pc-solaris2.11 not supported instead of the correct *** Configuration i386-pc-solaris2.11.3 not supported This patch fixes this oversight. Tested on i386-pc-solaris2.11 (11.3 and 11.4). 2024-05-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc: * config.gcc: Move ${target_min} from obsolete to unsupported message.