aboutsummaryrefslogtreecommitdiff
path: root/gcc/config/riscv
AgeCommit message (Collapse)AuthorFilesLines
2024-11-19[RISC-V][PR target/117649] Fix branch on masked values splitterJeff Law1-1/+1
Andreas reported GCC mis-compiled GAS for risc-v Thankfully he also reduced it to a nice little testcase. So the whole point of the pattern in question is to "reduce" the constants by right shifting away common unnecessary bits in RTL expressions like this: > [(set (pc) > (if_then_else (any_eq > (and:ANYI (match_operand:ANYI 1 "register_operand" "r") > (match_operand 2 "shifted_const_arith_operand" "i")) > (match_operand 3 "shifted_const_arith_operand" "i")) > (label_ref (match_operand 0 "" "")) > (pc))) When applicable, the reduced constants in operands 2/3 fit into a simm12 and thus do not need multi-instruction synthesis. Note that we have to also shift operand 1. That shift should have been an arithmetic shift, but was incorrectly coded as a logical shift. Fixed with the obvious change on the right shift opcode. Expecting to push to the trunk once the pre-commit tester renders its verdict. I've already tested in this my tester for rv32 and rv64. PR target/117649 gcc/ * config/riscv/riscv.md (branch on masked/shifted operands): Use arithmetic rather than logical shift for operand 1. gcc/testsuite * gcc.target/riscv/branch-1.c: Update expected output. * gcc.target/riscv/pr117649.c: New test.
2024-11-19RISC-V: Tie MUL and DIV masks to the M extensionDimitar Dimitrov1-1/+5
When configuring GCC for RV32EC with: ./configure \ --target=riscv32-none-elf \ --with-multilib-generator="rv32ec-ilp32e--" \ --with-abi=ilp32e \ --with-arch=rv32ec Then the build fails because division is erroneously left enabled: cc1: error: '-mdiv' requires '-march' to subsume the 'M' extension -fself-test: 8412281 pass(es) in 0.647173 seconds Fix by disabling MASK_DIV if multiplication is not available and -mdiv option has not been explicitly passed. Tested the above RV32EC-only toolchain using the GNU simulator: === gcc Summary === # of expected passes 211635 # of unexpected failures 3004 # of expected failures 1061 # of unresolved testcases 5651 # of unsupported tests 18958 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_override_options_internal): Set division option's default to disabled if multiplication is not available. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2024-11-19RISC-V: Load VLS perm indices directly from memory.Robin Dapp1-2/+20
Instead of loading the permutation indices and using vmslt in order to determine which elements belong to which source vector we can compute the proper mask at compile time. That way we can emit vlm instead of vle + vmslt. gcc/ChangeLog: * config/riscv/riscv-v.cc (shuffle_merge_patterns): Load VLS indices directly. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/merge-1.c: Check for vlm and no vmsleu etc. * gcc.target/riscv/rvv/autovec/vls/merge-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls/merge-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls/merge-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls/merge-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls/merge-6.c: Ditto.
2024-11-18[committed][RISC-V][PR target/117595] Fix bogus use of simplify_gen_subregJeff Law2-2/+2
And stage3 begins... Zdenek's fuzzer caught this one. Essentially using simplify_gen_subreg directly with an offset of 0 when we just needed a lowpart. The offset of 0 works for little endian, but for big endian it's simply wrong. simplify_gen_subreg will return NULL_RTX because the case isn't representable. We then embed that NULL_RTX into an insn that's later scanned during mark_jump_label. Scanning the port I see a couple more instances of this incorrect idiom. One is pretty obvious to fix. The others look a bit goofy and I'll probably need to sync with Patrick on them. Anyway tested on riscv64-elf and riscv32-elf with no regressions. Pushing to the trunk. PR target/117595 gcc/ * config/riscv/sync.md (atomic_compare_and_swap<mode>): Use gen_lowpart rather than simplify_gen_subreg. * config/riscv/riscv.cc (riscv_legitimize_move): Similarly. gcc/testsuite/ * gcc.target/riscv/pr117595.c: New test.
2024-11-18RISC-V: Add VLS modes to strided loads.Robin Dapp3-13/+256
This patch adds VLS modes to the strided load expanders. gcc/ChangeLog: * config/riscv/autovec.md: Add VLS modes. * config/riscv/vector-iterators.md: Ditto. * config/riscv/vector.md: Ditto.
2024-11-18RISC-V: Add else operand to masked loads [PR115336].Robin Dapp3-30/+53
This patch adds else operands to masked loads. Currently the default else operand predicate just accepts "undefined" (i.e. SCRATCH) values. PR middle-end/115336 PR middle-end/116059 gcc/ChangeLog: * config/riscv/autovec.md: Add else operand. * config/riscv/predicates.md (maskload_else_operand): New predicate. * config/riscv/riscv-v.cc (get_else_operand): Remove static. (expand_load_store): Use get_else_operand and adjust index. (expand_gather_scatter): Ditto. (expand_lanes_load_store): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr115336.c: New test. * gcc.target/riscv/rvv/autovec/pr116059.c: New test.
2024-11-14[RISC-V][V2] Fix type on vector move patternsJeff Law1-2/+8
Updated version of my prior patch to fix type attributes on the pre-allocation vector move pattern. This version just adds a suitable set of attributes to a second pattern that was obviously wrong. Passed on my tester for rv64 and rv32 crosses. Bootstrapped and regression tested on riscv64-linux-gnu as well. -- So I was looking into a horrific schedule for SAD a week or so ago and came across this gem. Basically we were treating a vector load as a vector move from a scheduling standpoint during sched1. Naturally we didn't expose much ILP during sched1. That in turn caused the register allocator to pack the pseudos onto the physical vector registers tightly. regrename didn't do anything useful and the resulting code had too many false dependencies for sched2 to do anything useful. As a result we were taking many load->use stalls in x264's SAD routine. I'm confident the types are fine, but I'm a lot less sure about the other attributes (mode, avl_type_index, mode_idx). If someone could take a look at that, it'd be greatly appreciated. There's other cases that may need similar treatment. But I didn't want to muck with them until I understood those other attributes and how they need adjustments. In particular mov<VLS_AVL_REG:mode><P:mode>_lra appears to have the same problem. -- gcc/ * config/riscv/vector.md (mov<mode> pattern/splitter): Fix type and other attributes. (mov<VLS_AVL_REG:mode><P:mode>_lra): Likewise.
2024-11-13[PATCH] RISC-V: Bugfix for unrecognizable insn for XTheadVectorJin Ma1-2/+2
error: unrecognizable insn: (insn 35 34 36 2 (set (subreg:RVVM1SF (reg/v:RVVM1x4SF 142 [ _r ]) 0) (unspec:RVVM1SF [ (const_vector:RVVM1SF repeat [ (const_double:SF 0.0 [0x0.0p+0]) ]) (reg:DI 0 zero) (const_int 1 [0x1]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_TH_VWLDST)) -1 (nil)) during RTL pass: mode_sw PR target/116591 gcc/ChangeLog: * config/riscv/vector.md: Add restriction to call pred_th_whole_mov. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/xtheadvector/pr116591.c: New test.
2024-11-13RISC-V: Implement TARGET_GENERATE_VERSION_DISPATCHER_BODY and ↵Yangyu Chen1-0/+587
TARGET_GET_FUNCTION_VERSIONS_DISPATCHER This patch implements the TARGET_GENERATE_VERSION_DISPATCHER_BODY and TARGET_GET_FUNCTION_VERSIONS_DISPATCHER for RISC-V. This is used to generate the dispatcher function and get the dispatcher function for function multiversioning. This patch copies many codes from commit 0cfde688e213 ("[aarch64] Add function multiversioning support") and modifies them to fit the RISC-V port. A key difference is the data structure of feature bits in RISC-V C-API is a array of unsigned long long, while in AArch64 is not a array. So we need to generate the array reference for each feature bits element in the dispatcher function. Signed-off-by: Yangyu Chen <cyy@cyyself.name> gcc/ChangeLog: * config/riscv/riscv.cc (add_condition_to_bb): New function. (dispatch_function_versions): New function. (get_suffixed_assembler_name): New function. (make_resolver_func): New function. (riscv_generate_version_dispatcher_body): New function. (riscv_get_function_versions_dispatcher): New function. (TARGET_GENERATE_VERSION_DISPATCHER_BODY): Implement it. (TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): Implement it.
2024-11-13RISC-V: Implement TARGET_MANGLE_DECL_ASSEMBLER_NAMEYangyu Chen1-0/+39
This patch implements the TARGET_MANGLE_DECL_ASSEMBLER_NAME for RISC-V. This is used to add function multiversioning suffixes to the assembler name. Signed-off-by: Yangyu Chen <cyy@cyyself.name> gcc/ChangeLog: * config/riscv/riscv.cc (riscv_mangle_decl_assembler_name): New function. (TARGET_MANGLE_DECL_ASSEMBLER_NAME): Define.
2024-11-13RISC-V: Implement TARGET_COMPARE_VERSION_PRIORITY and ↵Yangyu Chen1-0/+127
TARGET_OPTION_FUNCTION_VERSIONS This patch implements TARGET_COMPARE_VERSION_PRIORITY and TARGET_OPTION_FUNCTION_VERSIONS for RISC-V. The TARGET_COMPARE_VERSION_PRIORITY is implemented to compare the priority of two function versions based on the rules defined in the RISC-V C-API Doc PR #85: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85/files#diff-79a93ca266139524b8b642e582ac20999357542001f1f4666fbb62b6fb7a5824R721 If multiple versions have equal priority, we select the function with the most number of feature bits generated by riscv_minimal_hwprobe_feature_bits. When it comes to the same number of feature bits, we diff two versions and select the one with the least significant bit set. Since a feature appears earlier in the feature_bits might be more important to performance. The TARGET_OPTION_FUNCTION_VERSIONS is implemented to check whether the two function versions are the same. This Implementation reuses the code in TARGET_COMPARE_VERSION_PRIORITY and check it returns 0, which means the equal priority. Co-Developed-by: Hank Chang <hank.chang@sifive.com> Signed-off-by: Yangyu Chen <cyy@cyyself.name> gcc/ChangeLog: * config/riscv/riscv.cc (parse_features_for_version): New function. (compare_fmv_features): New function. (riscv_compare_version_priority): New function. (riscv_common_function_versions): New function. (TARGET_COMPARE_VERSION_PRIORITY): Implement it. (TARGET_OPTION_FUNCTION_VERSIONS): Implement it.
2024-11-13RISC-V: Implement TARGET_OPTION_VALID_VERSION_ATTRIBUTE_PYangyu Chen4-13/+115
This patch implements the TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P for RISC-V. This hook is used to process attribute ((target_version ("..."))). As it is the first patch which introduces the target_version attribute, we also set TARGET_HAS_FMV_TARGET_ATTRIBUTE to 0 to use "target_version" for function versioning. Co-Developed-by: Hank Chang <hank.chang@sifive.com> Signed-off-by: Yangyu Chen <cyy@cyyself.name> gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_process_target_attr): Remove as it is not used. (riscv_option_valid_version_attribute_p): Declare. (riscv_process_target_version_attr): Declare. * config/riscv/riscv-target-attr.cc (riscv_target_attrs): Renamed from riscv_attributes. (riscv_target_version_attrs): New attributes for target_version. (riscv_process_one_target_attr): New arguments to select attrs. (riscv_process_target_attr): Likewise. (riscv_option_valid_attribute_p): Likewise. (riscv_process_target_version_attr): New function. (riscv_option_valid_version_attribute_p): New function. * config/riscv/riscv.cc (TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): Implement it. * config/riscv/riscv.h (TARGET_HAS_FMV_TARGET_ATTRIBUTE): Define it to 0 to use "target_version" for function versioning.
2024-11-13RISC-V: Implement riscv_minimal_hwprobe_feature_bitsYangyu Chen2-0/+49
This patch implements the riscv_minimal_hwprobe_feature_bits feature for the RISC-V target. The feature bits are defined in the libgcc/config/riscv/feature_bits.c to provide bitmasks of ISA extensions that defined in RISC-V C-API. Thus, we need a function to generate the feature bits for IFUNC resolver to dispatch between different functions based on the hardware features. The minimal feature bits means to use the earliest extension appeard in the Linux hwprobe to cover the given ISA string. To allow older kernels without some implied extensions probe to run the FMV dispatcher correctly. For example, V implies Zve32x, but Zve32x appears in the Linux kernel since v6.11. If we use isa string directly to generate FMV dispatcher with functions with "arch=+v" extension, since we have V implied the Zve32x, FMV dispatcher will check if the Zve32x extension is supported by the host. If the Linux kernel is older than v6.11, the FMV dispatcher will fail to detect the Zve32x extension even it already implies by the V extension, thus making the FMV dispatcher fail to dispatch the correct function. Thus, we need to generate the minimal feature bits to cover the given ISA string to allow the FMV dispatcher to work correctly on older kernels. Signed-off-by: Yangyu Chen <cyy@cyyself.name> gcc/ChangeLog: * common/config/riscv/riscv-common.cc (RISCV_EXT_BITMASK): New macro. (struct riscv_ext_bitmask_table_t): New struct. (riscv_minimal_hwprobe_feature_bits): New function. * common/config/riscv/riscv-ext-bitmask.def: New file. * config/riscv/riscv-subset.h (GCC_RISCV_SUBSET_H): Include riscv-feature-bits.h. (riscv_minimal_hwprobe_feature_bits): Declare the function. * config/riscv/riscv-feature-bits.h: New file.
2024-11-13RISC-V: Implement Priority syntax parser for Function Multi-VersioningYangyu Chen2-0/+27
This patch adds the priority syntax parser to support the Function Multi-Versioning (FMV) feature in RISC-V. This feature allows users to specify the priority of the function version in the attribute syntax. Chnages based on RISC-V C-API PR: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85 Signed-off-by: Yangyu Chen <cyy@cyyself.name> gcc/ChangeLog: * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::handle_priority): New function. (riscv_target_attr_parser::update_settings): Update priority attribute. * config/riscv/riscv.opt: Add TargetVariable riscv_fmv_priority.
2024-11-13Introduce TARGET_CLONES_ATTR_SEPARATOR for RISC-VYangyu Chen1-0/+5
Some architectures may use ',' in the attribute string, but it is not used as the separator for different targets. To avoid conflict, we introduce a new macro TARGET_CLONES_ATTR_SEPARATOR to separate different clones. As an example, according to RISC-V C-API Specification [1], RISC-V allows ',' in the attribute string in the "arch=" option to specify one more ISA extensions in the same target function, which conflict with the default separator to separate different clones. This patch introduces TARGET_CLONES_ATTR_SEPARATOR for RISC-V and choose '#' as the separator, since '#' is not allowed in the target_clones option string. [1] https://github.com/riscv-non-isa/riscv-c-api-doc/blob/c6c5d6d9cf96b342293315a5dff3d25e96ef8191/src/c-api.adoc#__attribute__targetattr-string Signed-off-by: Yangyu Chen <cyy@cyyself.name> gcc/ChangeLog: * defaults.h (TARGET_CLONES_ATTR_SEPARATOR): Define new macro. * multiple_target.cc (get_attr_str): Use TARGET_CLONES_ATTR_SEPARATOR to separate attributes. (separate_attrs): Likewise. (expand_target_clones): Likewise. * attribs.cc (attr_strcmp): Likewise. (sorted_attr_string): Likewise. * tree.cc (get_target_clone_attr_len): Likewise. * config/riscv/riscv.h (TARGET_CLONES_ATTR_SEPARATOR): Define TARGET_CLONES_ATTR_SEPARATOR for RISC-V. * doc/tm.texi: Document TARGET_CLONES_ATTR_SEPARATOR. * doc/tm.texi.in: Likewise.
2024-11-13RISC-V: Bugfix for max_sew_overlap_and_next_ratio_valid_for_prev_sew_p[pr117483]xuli1-2/+9
This patch fixs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117483 If prev and next satisfy the following rules, we should forbid the case (next.get_sew() < prev.get_sew() && (!next.get_ta() || !next.get_ma())) in the compatible function max_sew_overlap_and_next_ratio_valid_for_prev_sew_p. Otherwise, the tail elements of next will be polluted. DEF_SEW_LMUL_RULE (ge_sew, ratio_and_ge_sew, ratio_and_ge_sew, max_sew_overlap_and_next_ratio_valid_for_prev_sew_p, always_false, use_max_sew_and_lmul_with_next_ratio) Passed the rv64gcv full regression test. Signed-off-by: Li Xu <xuli1@eswincomputing.com> PR target/117483 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc: Fix bug. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr117483.c: New test.
2024-11-12[RISC-V] Fix costing of LO_SUM expressionsXianmiao Qu1-1/+2
This is a rewrite of a patch originally from Xianmiao Qu. Xianmiao noticed that the costs we compute for LO_SUM expressions was incorrect. Essentially we costed based solely on the first input to the LO_SUM. In a LO_SUM, the first input is almost always going to be a REG and thus isn't interesting. The second argument is almost always going to be some kind of symbolic operand, which is much more interesting from a costing standpoint. The right way to fix this is to sum the cost of the two operands. I've verified this produces the same code as Xianmiao's Qu's original patch. This has been tested on rv32 and rv64 in my tester. It missed today's bootstrap of riscv64 though :( Naturally I'll wait on the pre-commit CI tester to render a verdict, but I don't expect any problems. -- From Xianmiao Qu's original submission -- Currently, the cost of the LO_SUM expression is based on the cost of calculating the first subexpression. When the first subexpression is a register, the cost result will be zero. It seems a bit unreasonable for a SET expression to have a zero cost when its source is LO_SUM. Moreover, having a cost of zero for the expression will lead the loop invariant pass to calculate its benefits of being moved outside the loop as zero, thus preventing the out-of-loop placement of the loop invariant. As an example, consider the following test case: long a; long b[]; long *c; foo () { for (;;) *c = b[a]; } When compiling with -march=rv64gc -mabi=lp64d -Os, the following code is generated: .cfi_startproc lui a5,%hi(c) ld a4,%lo(c)(a5) lui a2,%hi(b) lui a1,%hi(a) .L2: ld a5,%lo(a)(a1) addi a3,a2,%lo(b) slli a5,a5,3 add a5,a5,a3 ld a5,0(a5) sd a5,0(a4) j .L2 After adjust the cost of the LO_SUM expression, the instruction addi will be moved outside the loop: .cfi_startproc lui a5,%hi(c) ld a3,%lo(c)(a5) lui a4,%hi(b) lui a2,%hi(a) addi a4,a4,%lo(b) .L2: ld a5,%lo(a)(a2) slli a5,a5,3 add a5,a5,a4 ld a5,0(a5) sd a5,0(a3) j .L2 gcc/ * config/riscv/riscv.cc (riscv_rtx_costs): Correct costing of LO_SUM expressions. Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
2024-11-12RISC-V: Add norelax function attributeyulong1-16/+28
This patch adds norelax function attribute that be discussed in riscv-c-api-doc PR#94. URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull/94 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_declare_function_name): Add new attribute.
2024-11-12[RISC-V] Drop undesirable two instruction macc alternativesJeff Law1-170/+140
So I was looking at sub_dct a little while ago and was surprised to see us emit two instructions out of a single pattern. We generally try to avoid that -- it's not always possible, but as a general rule of thumb it should be avoided. Specifically I saw: > vmv1r.v v4,v2 # 138 [c=4 l=4] *pred_mul_plusrvvm1hi_undef/5 > vmacc.vv v4,v8,v1 When we emit multiple instructions out of a single pattern we can't build a good schedule as we can't really describe the two instructions well and we can't split them up -- they move as an atomic unit. These cases can also raise correctness issues if the pattern doesn't properly account for both instructions in its length computation. Note the length, 4 bytes. So this is both a performance and latent correctness issue. It appears that these alternatives are meant to deal with the case when we have three source inputs and a non-matching output. The author did put in "?" to slightly disparage these alternatives, but a "!" would have been better. The best solution is to just remove those alternatives and let the allocator manage the matching operand issue. That's precisely what this patch does. For the various integer multiply-add/multiply-accumulate patterns we drop the alternatives which don't require a match between the output and one of the inputs. That fixes the correctness issue and should shave a cycle or two off our sub_dct code. Essentially the move bubbles up into an empty slot and we can schedule around the vmacc sensibly. Interestingly enough this fixes a scan-assembler test in my tester for both rv32 and rv64. > Tests that now work, but didn't before (10 tests): > > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 My BPI is already in a bootstrap test, so this patch won't hit the BPI for bootstrapping until Wednesday, meaning no data until Thursday. Will wait for the pre-commit tester though. gcc/ * config/riscv/vector.md (pred_mul_plus<mode>_undef): Drop alternatives where output doesn't have to match input. (pred_madd<mode>, pred_macc<mode>): Likewise. (pred_madd<mode>_scalar, pred_macc<mode>_scalar): Likewise. (pred_madd<mode>_exended_scalar): Likewise. (pred_macc<mode>_exended_scalar): Likewise. (pred_minus_mul<mode>_undef): Likewise. (pred_nmsub<mode>, pred_nmsac<mode>): Likewise. (pred_nmsub<mode>_scalar, pred_nmsac<mode>_scalar): Likewise. (pred_nmsub<mode>_exended_scalar): Likewise. (pred_nmsac<mode>_exended_scalar): Likewise.
2024-11-11RISC-V: Fix one nit indent issue of ustrunc pattern [NFC]Pan Li1-1/+1
Just notice the indent is not that right for ustrunc pattern from the md files. Thus, make it correct. It is somehow very obvious and will commit it after next 48H if no more comments. gcc/ChangeLog: * config/riscv/autovec.md: Fix indent format issue. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-11-04[PATCH v2 2/2] RISC-V: Disable by pieces for vector setmem length > ↵Craig Blackmore1-0/+19
UNITS_PER_WORD For fast unaligned access targets, by pieces uses up to UNITS_PER_WORD size pieces resulting in more store instructions than needed. For example gcc.target/riscv/rvv/base/setmem-2.c:f1 built with `-O3 -march=rv64gcv -mtune=thead-c906`: ``` f1: vsetivli zero,8,e8,mf2,ta,ma vmv.v.x v1,a1 vsetivli zero,0,e32,mf2,ta,ma sb a1,14(a0) vmv.x.s a4,v1 vsetivli zero,8,e16,m1,ta,ma vmv.x.s a5,v1 vse8.v v1,0(a0) sw a4,8(a0) sh a5,12(a0) ret ``` The slow unaligned access version built with `-O3 -march=rv64gcv` used 15 sb instructions: ``` f1: sb a1,0(a0) sb a1,1(a0) sb a1,2(a0) sb a1,3(a0) sb a1,4(a0) sb a1,5(a0) sb a1,6(a0) sb a1,7(a0) sb a1,8(a0) sb a1,9(a0) sb a1,10(a0) sb a1,11(a0) sb a1,12(a0) sb a1,13(a0) sb a1,14(a0) ret ``` After this patch, the following is generated in both cases: ``` f1: vsetivli zero,15,e8,m1,ta,ma vmv.v.x v1,a1 vse8.v v1,0(a0) ret ``` gcc/ChangeLog: * config/riscv/riscv.cc (riscv_use_by_pieces_infrastructure_p): New function. (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Define. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr113469.c: Expect mf2 setmem. * gcc.target/riscv/rvv/base/setmem-2.c: Update f1 to expect straight-line vector memset. * gcc.target/riscv/rvv/base/setmem-3.c: Likewise.
2024-11-04[PATCH v2 1/2] RISC-V: Make vectorized memset handle more casesCraig Blackmore1-18/+19
`expand_vec_setmem` only generated vectorized memset if it fitted into a single vector store of at least (TARGET_MIN_VLEN / 8) bytes. Also, without dynamic LMUL the operation was always TARGET_MAX_LMUL even if it would have fitted a smaller LMUL. Allow vectorized memset to be generated for smaller lengths and smaller LMUL by switching to using use_vector_string_op. Smaller LMUL can be seen in setmem-3.c:f3. Smaller lengths will be seen after the second patch in this series which selectively disables by pieces. gcc/ChangeLog: * config/riscv/riscv-string.cc (use_vector_stringop_p): Add comment. (expand_vec_setmem): Use use_vector_stringop_p instead of check_vectorise_memory_operation. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/setmem-3.c: Expect smaller lmul.
2024-10-31RISC-V: fix const interleaved stepped vector with a scalar patternVineet Gupta1-3/+3
When bisecting for ICE in PR/117353, commit 771256bcb9dd ("RISC-V: Emit costs for bool and stepped const vectors") uncovered yet another latent issue (first noted [1]) [1] https://github.com/patrick-rivos/gcc-postcommit-ci/issues/1625 This patch fixes some of the fortran regressions from that report. Fixes 71a5ac6703d1 ("RISC-V: Support interleave vector with different step sequence") rv64imafdcv_zvl256b_zba_zbb_zbs_zicond/lp64d/medlow | # of unexpected case / # of unique unexpected case | gcc | g++ | gfortran | | 392 / 108 | 7 / 3 | 91 / 24 | | 392 / 108 | 7 / 3 | 67 / 12 | gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Use IOR op. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/slp-interleave-5.c: New test. Tested-by: Edwin Lu <ewlu@rivosinc.com> # Pre-commit CU #2503 Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
2024-10-31RISC-V: Do not inline when callee is versioned but caller is notYangyu Chen1-0/+4
When the callee is versioned but the caller is not, we should not inline the callee into the caller, to prevent the default version of the callee from being inlined into a not versioned caller. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_can_inline_p): Refuse to inline when callee is versioned but caller is not.
2024-10-31RISC-V: Split riscv_process_target_attr with const char *args argumentYangyu Chen2-28/+39
This patch splits static bool riscv_process_target_attr (tree args, location_t loc) into two functions: - bool riscv_process_target_attr (const char *args, location_t loc) - static bool riscv_process_target_attr (tree args, location_t loc) Thus, we can call `riscv_process_target_attr` with a `const char *` argument. This is useful for implementation of `target_version` attribute. gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_process_target_attr): New. * config/riscv/riscv-target-attr.cc (riscv_process_target_attr): Split into two functions with const char *args argument
2024-10-31RISC-V: allow -fno-plt to disable PLTYangyu Chen2-3/+3
Currently, the RISC-V target uses the target specific mplt option to control PLT generation. This patch deprecates the target specific mplt option and uses the common fplt option instead. This allows users to use the same option for most targets. Co-Developed-by: Liao Shihua <shihua@iscas.ac.cn> Signed-off-by: Yangyu Chen <cyy@cyyself.name> gcc/ChangeLog: * config/riscv/predicates.md: Use flag_plt instead of TARGET_PLT. * config/riscv/riscv.opt: alias common option fplt to mplt.
2024-10-30[RISC-V] Aggressively hoist VXRM assignmentsJeff Law1-0/+69
So a while back I was looking at pixel_avg for RISC-V where we try to use vaaddu for the halfword-ceiling-average step. The problem with vaaddu is that you must set VXRM to a suitable rounding mode as it has an undetermined state at function entry or after a function call. It turns out some designs will fully flush their pipelines on a write to VXRM which you can imagine is incredibly expensive. VXRM assignments are handled by an LCM based algorithm to find "optimal" placement points based on what insns in the stream need VXRM assignments and the particular mode they need. Unfortunately in pixel_avg an LCM algorithm only allows hoisting out of the innermost loop, but not the outer loop. The core issue is that LCM does not allow any speculation and there are paths which would bypass the inner loop (which don't actually trigger at runtime IIRC). The expectation is that VXRM assignments should be exceedingly rare and needing more than one mode even rarer. So hoisting more aggressively seems like a reasonable thing to do, but we don't want to burn too much time trying to do something fancy. So what this patch does is scan the IL once collecting any VXRM needs. If the current function has precisely one VXRM mode needed, then we pretend (for the sake of LCM) that the first instruction in the function also has that need. By doing so the VXRM assignment is essentially anticipated everywhere in the function. The standard LCM algorithm is run and has enough information to hoist the VXRM assignment more aggressively, most often to the prologue. This helps the BPI in a measurable way (IIRC it was 2-3%). It probably helps some of the SiFive designs, but I've been told they still benefit from the longer sequence of shifts & adds, hoisting just isn't enough for those designs. The Ventana design basically doesn't care where the VXRM assignment is. Point is we may want to have a tuning knob for the patterns which need VXRM (vaadd[u], vasub[u]) at some point in the near future. Bootstrapped and regression tested on riscv64 and regression tested on riscv32-elf and riscv64-elf. We've been using this internally for a while a while on spec as well. Obviously I'll wait for the pre-commit tester to do its thing. gcc/ * config/riscv/riscv.cc (singleton_vxrm_need): New function. (riscv_mode_needed): See if there is a singleton need and if so, claim it happens on the first insn in the chain.
2024-10-29[PATCH 1/2] RISC-V:Add intrinsic support for the CMOs extensionsyulong1-0/+84
gcc/ChangeLog: * config.gcc: Add riscv_cmo.h. * config/riscv/riscv_cmo.h: New file.
2024-10-29RISC-V: Implement the MASK_LEN_STRIDED_LOAD{STORE}Pan Li3-0/+83
This patch would like to implment the MASK_LEN_STRIDED_LOAD{STORE} in the RISC-V backend by leveraging the vector strided load/store insn. For example: void foo (int * __restrict a, int * __restrict b, int stride, int n) { for (int i = 0; i < n; i++) a[i*stride] = b[i*stride] + 100; } Before this patch: 38 │ vsetvli a5,a3,e32,m1,ta,ma 39 │ vluxei64.v v1,(a1),v4 40 │ mul a4,a2,a5 41 │ sub a3,a3,a5 42 │ vadd.vv v1,v1,v2 43 │ vsuxei64.v v1,(a0),v4 44 │ add a1,a1,a4 45 │ add a0,a0,a4 After this patch: 33 │ vsetvli a5,a3,e32,m1,ta,ma 34 │ vlse32.v v1,0(a1),a2 35 │ mul a4,a2,a5 36 │ sub a3,a3,a5 37 │ vadd.vv v1,v1,v2 38 │ vsse32.v v1,0(a0),a2 39 │ add a1,a1,a4 40 │ add a0,a0,a4 The below test suites are passed for this patch: * The riscv fully regression test. gcc/ChangeLog: * config/riscv/autovec.md (mask_len_strided_load_<mode>): Add new pattern for MASK_LEN_STRIDED_LOAD. (mask_len_strided_store_<mode>): Ditto but for store. * config/riscv/riscv-protos.h (expand_strided_load): Add new func decl to expand strided load. (expand_strided_store): Ditto but for store. * config/riscv/riscv-v.cc (expand_strided_load): Add new func impl to expand strided load. (expand_strided_store): Ditto but for store. Signed-off-by: Pan Li <pan2.li@intel.com> Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
2024-10-28[target/117316] Fix initializer for riscv code alignment handlingJeff Law1-3/+27
The construct used for initializing the code alignments in a recent change is causing bootstrap problems on riscv64 as seen in the referenced bugzilla. This patch adjusts the initializer by pushing the NULL down into each uarch clause. Bootstrapped on riscv64, regression test in flight, but given bootstrap is broken it seemed advisable to move this forward now. I'm so much looking forward to the day when we have performant hardware for bootstrap testing... Sigh. Anyway, bootstrapped and installing on the trunk. PR target/117316 gcc/ * config/riscv/riscv.cc (riscv_tune_param): Drop initializer. (*_tune_info): Add initializers for code alignments.
2024-10-28RISC-V:Bugfix for vlmul_ext and vlmul_trunc with NULL return value[pr117286]xuli1-0/+4
This patch fixes following ICE: test.c: In function 'func': test.c:37:24: internal compiler error: Segmentation fault 37 | vfloat16mf2_t vc = __riscv_vlmul_trunc_v_f16m1_f16mf2(vb); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The root cause is that vlmul_trunc has a null return value. gimple_call <__riscv_vlmul_trunc_v_f16m1_f16mf2, NULL, vb_13> ^^^ Passed the rv64gcv_zvfh regression test. Singed-off-by: Li Xu <xuli1@eswincomputing.com> PR target/117286 gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Do not expand NULL return. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr117286.c: New test.
2024-10-25gcc: Remove trailing whitespaceJakub Jelinek2-2/+2
I've tried to build stage3 with -Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank added to STRICT_WARN and that expectably resulted in about 2744 unique trailing whitespace warnings and 124837 leading whitespace warnings when excluding *.md files (which obviously is in big part a generator issue). Others from that are generator related, I think those need to be solved later. The following patch just fixes up the easy case (trailing whitespace), which could be easily automated: for i in `find . -name \*.h -o -name \*.cc -o -name \*.c | xargs grep -l '[ ]$' | grep -v testsuite/`; do sed -i -e 's/[ ]*$//' $i; done I've excluded files which I knew are obviously generated or go FE. Is there anything else we'd want to avoid the changes? Due to patch size, I've split it between gcc/ part (this patch) and rest (include/, libiberty/, libgcc/, libcpp/, libstdc++-v3/). 2024-10-24 Jakub Jelinek <jakub@redhat.com> gcc/ * lra-assigns.cc: Remove trailing whitespace. * symtab.cc: Likewise. * stmt.cc: Likewise. * cgraphbuild.cc: Likewise. * cfgcleanup.cc: Likewise. * loop-init.cc: Likewise. * df-problems.cc: Likewise. * diagnostic-macro-unwinding.cc: Likewise. * langhooks.h: Likewise. * except.cc: Likewise. * tree-vect-loop.cc: Likewise. * coverage.cc: Likewise. * hash-table.cc: Likewise. * ggc-page.cc: Likewise. * gimple-ssa-strength-reduction.cc: Likewise. * tree-parloops.cc: Likewise. * internal-fn.cc: Likewise. * ipa-split.cc: Likewise. * calls.cc: Likewise. * reorg.cc: Likewise. * sbitmap.h: Likewise. * omp-offload.cc: Likewise. * cfgrtl.cc: Likewise. * reginfo.cc: Likewise. * gengtype.h: Likewise. * omp-general.h: Likewise. * ipa-comdats.cc: Likewise. * gimple-range-edge.h: Likewise. * tree-ssa-structalias.cc: Likewise. * target.def: Likewise. * basic-block.h: Likewise. * graphite-isl-ast-to-gimple.cc: Likewise. * auto-profile.cc: Likewise. * optabs.cc: Likewise. * gengtype-lex.l: Likewise. * optabs.def: Likewise. * ira-build.cc: Likewise. * ira.cc: Likewise. * function.h: Likewise. * tree-ssa-propagate.cc: Likewise. * gcov-io.cc: Likewise. * builtin-types.def: Likewise. * ddg.cc: Likewise. * lra-spills.cc: Likewise. * cfg.cc: Likewise. * bitmap.cc: Likewise. * gimple-range-gori.h: Likewise. * tree-ssa-loop-im.cc: Likewise. * cfghooks.h: Likewise. * genmatch.cc: Likewise. * explow.cc: Likewise. * lto-streamer-in.cc: Likewise. * graphite-scop-detection.cc: Likewise. * ipa-prop.cc: Likewise. * gcc.cc: Likewise. * vec.h: Likewise. * cfgexpand.cc: Likewise. * config/alpha/vms.h: Likewise. * config/alpha/alpha.cc: Likewise. * config/alpha/driver-alpha.cc: Likewise. * config/alpha/elf.h: Likewise. * config/iq2000/iq2000.h: Likewise. * config/iq2000/iq2000.cc: Likewise. * config/pa/pa-64.h: Likewise. * config/pa/som.h: Likewise. * config/pa/pa.cc: Likewise. * config/pa/pa.h: Likewise. * config/pa/pa32-regs.h: Likewise. * config/c6x/c6x.cc: Likewise. * config/openbsd-stdint.h: Likewise. * config/elfos.h: Likewise. * config/lm32/lm32.cc: Likewise. * config/lm32/lm32.h: Likewise. * config/lm32/lm32-protos.h: Likewise. * config/darwin-c.cc: Likewise. * config/rx/rx.cc: Likewise. * config/host-darwin.h: Likewise. * config/netbsd.h: Likewise. * config/ia64/ia64.cc: Likewise. * config/ia64/freebsd.h: Likewise. * config/avr/avr-c.cc: Likewise. * config/avr/avr.cc: Likewise. * config/avr/avr-arch.h: Likewise. * config/avr/avr.h: Likewise. * config/avr/stdfix.h: Likewise. * config/avr/gen-avr-mmcu-specs.cc: Likewise. * config/avr/avr-log.cc: Likewise. * config/avr/elf.h: Likewise. * config/avr/gen-avr-mmcu-texi.cc: Likewise. * config/avr/avr-devices.cc: Likewise. * config/nvptx/nvptx.cc: Likewise. * config/vx-common.h: Likewise. * config/sol2.cc: Likewise. * config/rl78/rl78.cc: Likewise. * config/cris/cris.cc: Likewise. * config/arm/symbian.h: Likewise. * config/arm/unknown-elf.h: Likewise. * config/arm/linux-eabi.h: Likewise. * config/arm/arm.cc: Likewise. * config/arm/arm-mve-builtins.h: Likewise. * config/arm/bpabi.h: Likewise. * config/arm/vxworks.h: Likewise. * config/arm/arm.h: Likewise. * config/arm/aout.h: Likewise. * config/arm/elf.h: Likewise. * config/host-linux.cc: Likewise. * config/sh/sh_treg_combine.cc: Likewise. * config/sh/vxworks.h: Likewise. * config/sh/elf.h: Likewise. * config/sh/netbsd-elf.h: Likewise. * config/sh/sh.cc: Likewise. * config/sh/embed-elf.h: Likewise. * config/sh/sh.h: Likewise. * config/darwin-driver.cc: Likewise. * config/m32c/m32c.cc: Likewise. * config/frv/frv.cc: Likewise. * config/openbsd.h: Likewise. * config/aarch64/aarch64-protos.h: Likewise. * config/aarch64/aarch64-builtins.cc: Likewise. * config/aarch64/aarch64-cost-tables.h: Likewise. * config/aarch64/aarch64.cc: Likewise. * config/bfin/bfin.cc: Likewise. * config/bfin/bfin.h: Likewise. * config/bfin/bfin-protos.h: Likewise. * config/i386/gmm_malloc.h: Likewise. * config/i386/djgpp.h: Likewise. * config/i386/sol2.h: Likewise. * config/i386/stringop.def: Likewise. * config/i386/i386-features.cc: Likewise. * config/i386/openbsdelf.h: Likewise. * config/i386/cpuid.h: Likewise. * config/i386/i386.h: Likewise. * config/i386/smmintrin.h: Likewise. * config/i386/avx10_2-512convertintrin.h: Likewise. * config/i386/i386-options.cc: Likewise. * config/i386/i386-opts.h: Likewise. * config/i386/i386-expand.cc: Likewise. * config/i386/avx512dqintrin.h: Likewise. * config/i386/wmmintrin.h: Likewise. * config/i386/gnu-user.h: Likewise. * config/i386/host-mingw32.cc: Likewise. * config/i386/avx10_2bf16intrin.h: Likewise. * config/i386/cygwin.h: Likewise. * config/i386/driver-i386.cc: Likewise. * config/i386/biarch64.h: Likewise. * config/i386/host-cygwin.cc: Likewise. * config/i386/cygming.h: Likewise. * config/i386/i386-builtins.cc: Likewise. * config/i386/avx10_2convertintrin.h: Likewise. * config/i386/i386.cc: Likewise. * config/i386/gas.h: Likewise. * config/i386/freebsd.h: Likewise. * config/mingw/winnt-cxx.cc: Likewise. * config/mingw/winnt.cc: Likewise. * config/h8300/h8300.cc: Likewise. * config/host-solaris.cc: Likewise. * config/m32r/m32r.h: Likewise. * config/m32r/m32r.cc: Likewise. * config/darwin.h: Likewise. * config/sparc/linux64.h: Likewise. * config/sparc/sparc-protos.h: Likewise. * config/sparc/sysv4.h: Likewise. * config/sparc/sparc.h: Likewise. * config/sparc/linux.h: Likewise. * config/sparc/freebsd.h: Likewise. * config/sparc/sparc.cc: Likewise. * config/gcn/gcn-run.cc: Likewise. * config/gcn/gcn.cc: Likewise. * config/gcn/gcn-tree.cc: Likewise. * config/kopensolaris-gnu.h: Likewise. * config/nios2/nios2.h: Likewise. * config/nios2/elf.h: Likewise. * config/nios2/nios2.cc: Likewise. * config/host-netbsd.cc: Likewise. * config/rtems.h: Likewise. * config/pdp11/pdp11.cc: Likewise. * config/pdp11/pdp11.h: Likewise. * config/mn10300/mn10300.cc: Likewise. * config/mn10300/linux.h: Likewise. * config/moxie/moxie.h: Likewise. * config/moxie/moxie.cc: Likewise. * config/rs6000/aix71.h: Likewise. * config/rs6000/vec_types.h: Likewise. * config/rs6000/xcoff.h: Likewise. * config/rs6000/rs6000.cc: Likewise. * config/rs6000/rs6000-internal.h: Likewise. * config/rs6000/rs6000-p8swap.cc: Likewise. * config/rs6000/rs6000-c.cc: Likewise. * config/rs6000/aix.h: Likewise. * config/rs6000/rs6000-logue.cc: Likewise. * config/rs6000/rs6000-string.cc: Likewise. * config/rs6000/rs6000-call.cc: Likewise. * config/rs6000/ppu_intrinsics.h: Likewise. * config/rs6000/altivec.h: Likewise. * config/rs6000/darwin.h: Likewise. * config/rs6000/host-darwin.cc: Likewise. * config/rs6000/freebsd64.h: Likewise. * config/rs6000/spu2vmx.h: Likewise. * config/rs6000/linux.h: Likewise. * config/rs6000/si2vmx.h: Likewise. * config/rs6000/driver-rs6000.cc: Likewise. * config/rs6000/freebsd.h: Likewise. * config/vxworksae.h: Likewise. * config/mips/frame-header-opt.cc: Likewise. * config/mips/mips.h: Likewise. * config/mips/mips.cc: Likewise. * config/mips/sde.h: Likewise. * config/darwin-protos.h: Likewise. * config/mcore/mcore-elf.h: Likewise. * config/mcore/mcore.h: Likewise. * config/mcore/mcore.cc: Likewise. * config/epiphany/epiphany.cc: Likewise. * config/fr30/fr30.h: Likewise. * config/fr30/fr30.cc: Likewise. * config/riscv/riscv-vector-builtins-shapes.cc: Likewise. * config/riscv/riscv-vector-builtins-bases.cc: Likewise. * config/visium/visium.h: Likewise. * config/mmix/mmix.cc: Likewise. * config/v850/v850.cc: Likewise. * config/v850/v850-c.cc: Likewise. * config/v850/v850.h: Likewise. * config/stormy16/stormy16.cc: Likewise. * config/stormy16/stormy16-protos.h: Likewise. * config/stormy16/stormy16.h: Likewise. * config/arc/arc.cc: Likewise. * config/vxworks.cc: Likewise. * config/microblaze/microblaze-c.cc: Likewise. * config/microblaze/microblaze-protos.h: Likewise. * config/microblaze/microblaze.h: Likewise. * config/microblaze/microblaze.cc: Likewise. * config/freebsd-spec.h: Likewise. * config/m68k/m68kelf.h: Likewise. * config/m68k/m68k.cc: Likewise. * config/m68k/netbsd-elf.h: Likewise. * config/m68k/linux.h: Likewise. * config/freebsd.h: Likewise. * config/host-openbsd.cc: Likewise. * regcprop.cc: Likewise. * dumpfile.cc: Likewise. * combine.cc: Likewise. * tree-ssa-forwprop.cc: Likewise. * ipa-profile.cc: Likewise. * hw-doloop.cc: Likewise. * opts.cc: Likewise. * gcc-ar.cc: Likewise. * tree-cfg.cc: Likewise. * incpath.cc: Likewise. * tree-ssa-sccvn.cc: Likewise. * function.cc: Likewise. * genattrtab.cc: Likewise. * rtl.def: Likewise. * genchecksum.cc: Likewise. * profile.cc: Likewise. * df-core.cc: Likewise. * tree-pretty-print.cc: Likewise. * tree.h: Likewise. * plugin.cc: Likewise. * tree-ssa-loop-ch.cc: Likewise. * emit-rtl.cc: Likewise. * haifa-sched.cc: Likewise. * gimple-range-edge.cc: Likewise. * range-op.cc: Likewise. * tree-ssa-ccp.cc: Likewise. * dwarf2cfi.cc: Likewise. * recog.cc: Likewise. * vtable-verify.cc: Likewise. * system.h: Likewise. * regrename.cc: Likewise. * tree-ssa-dom.cc: Likewise. * loop-unroll.cc: Likewise. * lra-constraints.cc: Likewise. * pretty-print.cc: Likewise. * ifcvt.cc: Likewise. * ipa.cc: Likewise. * alloc-pool.h: Likewise. * collect2.cc: Likewise. * pointer-query.cc: Likewise. * cfgloop.cc: Likewise. * toplev.cc: Likewise. * sese.cc: Likewise. * gengtype.cc: Likewise. * gimplify-me.cc: Likewise. * double-int.cc: Likewise. * bb-reorder.cc: Likewise. * dwarf2out.cc: Likewise. * tree-ssa-loop-ivcanon.cc: Likewise. * tree-ssa-reassoc.cc: Likewise. * cgraph.cc: Likewise. * sel-sched.cc: Likewise. * attribs.cc: Likewise. * expr.cc: Likewise. * tree-ssa-scopedtables.h: Likewise. * gimple-range-cache.cc: Likewise. * ipa-pure-const.cc: Likewise. * tree-inline.cc: Likewise. * genhooks.cc: Likewise. * gimple-range-phi.h: Likewise. * shrink-wrap.cc: Likewise. * tree.cc: Likewise. * gimple.cc: Likewise. * backend.h: Likewise. * opts-common.cc: Likewise. * cfg-flags.def: Likewise. * gcse-common.cc: Likewise. * tree-ssa-scopedtables.cc: Likewise. * ccmp.cc: Likewise. * builtins.def: Likewise. * builtin-attrs.def: Likewise. * postreload.cc: Likewise. * sched-deps.cc: Likewise. * ipa-inline-transform.cc: Likewise. * tree-vect-generic.cc: Likewise. * ipa-polymorphic-call.cc: Likewise. * builtins.cc: Likewise. * sel-sched-ir.cc: Likewise. * trans-mem.cc: Likewise. * ipa-visibility.cc: Likewise. * cgraph.h: Likewise. * tree-ssa-phiopt.cc: Likewise. * genopinit.cc: Likewise. * ipa-inline.cc: Likewise. * omp-low.cc: Likewise. * ipa-utils.cc: Likewise. * tree-ssa-math-opts.cc: Likewise. * tree-ssa-ifcombine.cc: Likewise. * gimple-range.cc: Likewise. * ipa-fnsummary.cc: Likewise. * ira-color.cc: Likewise. * value-prof.cc: Likewise. * varasm.cc: Likewise. * ipa-icf.cc: Likewise. * ira-emit.cc: Likewise. * lto-streamer.h: Likewise. * lto-wrapper.cc: Likewise. * regs.h: Likewise. * gengtype-parse.cc: Likewise. * alias.cc: Likewise. * lto-streamer.cc: Likewise. * real.h: Likewise. * wide-int.h: Likewise. * targhooks.cc: Likewise. * gimple-ssa-warn-access.cc: Likewise. * real.cc: Likewise. * ipa-reference.cc: Likewise. * bitmap.h: Likewise. * ginclude/float.h: Likewise. * ginclude/stddef.h: Likewise. * ginclude/stdarg.h: Likewise. * ginclude/stdatomic.h: Likewise. * optabs.h: Likewise. * sel-sched-ir.h: Likewise. * convert.cc: Likewise. * cgraphunit.cc: Likewise. * lra-remat.cc: Likewise. * tree-if-conv.cc: Likewise. * gcov-dump.cc: Likewise. * tree-predcom.cc: Likewise. * dominance.cc: Likewise. * gimple-range-cache.h: Likewise. * ipa-devirt.cc: Likewise. * rtl.h: Likewise. * ubsan.cc: Likewise. * tree-ssa.cc: Likewise. * ssa.h: Likewise. * cse.cc: Likewise. * jump.cc: Likewise. * hwint.h: Likewise. * caller-save.cc: Likewise. * coretypes.h: Likewise. * ipa-fnsummary.h: Likewise. * tree-ssa-strlen.cc: Likewise. * modulo-sched.cc: Likewise. * cgraphclones.cc: Likewise. * lto-cgraph.cc: Likewise. * hw-doloop.h: Likewise. * data-streamer.h: Likewise. * compare-elim.cc: Likewise. * profile-count.h: Likewise. * tree-vect-loop-manip.cc: Likewise. * ree.cc: Likewise. * reload.cc: Likewise. * tree-ssa-loop-split.cc: Likewise. * tree-into-ssa.cc: Likewise. * gcse.cc: Likewise. * cfgloopmanip.cc: Likewise. * df.h: Likewise. * fold-const.cc: Likewise. * wide-int.cc: Likewise. * gengtype-state.cc: Likewise. * sanitizer.def: Likewise. * tree-ssa-sink.cc: Likewise. * target-hooks-macros.h: Likewise. * tree-ssa-pre.cc: Likewise. * gimple-pretty-print.cc: Likewise. * ipa-utils.h: Likewise. * tree-outof-ssa.cc: Likewise. * tree-ssa-coalesce.cc: Likewise. * gimple-match.h: Likewise. * tree-ssa-loop-niter.cc: Likewise. * tree-loop-distribution.cc: Likewise. * tree-emutls.cc: Likewise. * tree-eh.cc: Likewise. * varpool.cc: Likewise. * ssa-iterators.h: Likewise. * asan.cc: Likewise. * reload1.cc: Likewise. * cfgloopanal.cc: Likewise. * tree-vectorizer.cc: Likewise. * simplify-rtx.cc: Likewise. * opts-global.cc: Likewise. * gimple-ssa-store-merging.cc: Likewise. * expmed.cc: Likewise. * tree-ssa-loop-prefetch.cc: Likewise. * tree-ssa-dse.h: Likewise. * tree-vect-stmts.cc: Likewise. * gimple-fold.cc: Likewise. * lra-coalesce.cc: Likewise. * data-streamer-out.cc: Likewise. * diagnostic.cc: Likewise. * tree-ssa-alias.cc: Likewise. * tree-vect-patterns.cc: Likewise. * common/common-target.def: Likewise. * common/config/rx/rx-common.cc: Likewise. * common/config/msp430/msp430-common.cc: Likewise. * common/config/avr/avr-common.cc: Likewise. * common/config/i386/i386-common.cc: Likewise. * common/config/pdp11/pdp11-common.cc: Likewise. * common/config/rs6000/rs6000-common.cc: Likewise. * common/config/mcore/mcore-common.cc: Likewise. * graphite.cc: Likewise. * gimple-low.cc: Likewise. * genmodes.cc: Likewise. * gimple-loop-jam.cc: Likewise. * lto-streamer-out.cc: Likewise. * predict.cc: Likewise. * omp-expand.cc: Likewise. * gimple-array-bounds.cc: Likewise. * predict.def: Likewise. * opts.h: Likewise. * tree-stdarg.cc: Likewise. * gimplify.cc: Likewise. * ira-lives.cc: Likewise. * loop-doloop.cc: Likewise. * lra.cc: Likewise. * gimple-iterator.h: Likewise. * tree-sra.cc: Likewise. gcc/fortran/ * trans-openmp.cc: Remove trailing whitespace. * trans-common.cc: Likewise. * match.h: Likewise. * scanner.cc: Likewise. * gfortranspec.cc: Likewise. * io.cc: Likewise. * iso-c-binding.def: Likewise. * iso-fortran-env.def: Likewise. * types.def: Likewise. * openmp.cc: Likewise. * f95-lang.cc: Likewise. gcc/analyzer/ * state-purge.cc: Remove trailing whitespace. * region-model.h: Likewise. * region-model.cc: Likewise. * program-point.cc: Likewise. * exploded-graph.h: Likewise. * program-state.cc: Likewise. * supergraph.cc: Likewise. gcc/c-family/ * c-ubsan.cc: Remove trailing whitespace. * stub-objc.cc: Likewise. * c-pragma.cc: Likewise. * c-ppoutput.cc: Likewise. * c-indentation.cc: Likewise. * c-ada-spec.cc: Likewise. * c-opts.cc: Likewise. * c-common.cc: Likewise. * c-format.cc: Likewise. * c-omp.cc: Likewise. * c-objc.h: Likewise. * c-cppbuiltin.cc: Likewise. * c-attribs.cc: Likewise. * c-target.def: Likewise. * c-common.h: Likewise. gcc/c/ * c-typeck.cc: Remove trailing whitespace. * gimple-parser.cc: Likewise. * c-parser.cc: Likewise. * c-decl.cc: Likewise. gcc/cp/ * vtable-class-hierarchy.cc: Remove trailing whitespace. * typeck2.cc: Likewise. * decl.cc: Likewise. * init.cc: Likewise. * semantics.cc: Likewise. * module.cc: Likewise. * rtti.cc: Likewise. * cxx-pretty-print.cc: Likewise. * cvt.cc: Likewise. * mangle.cc: Likewise. * name-lookup.h: Likewise. * coroutines.cc: Likewise. * error.cc: Likewise. * lambda.cc: Likewise. * tree.cc: Likewise. * g++spec.cc: Likewise. * decl2.cc: Likewise. * cp-tree.h: Likewise. * parser.cc: Likewise. * pt.cc: Likewise. * call.cc: Likewise. * lex.cc: Likewise. * cp-lang.cc: Likewise. * cp-tree.def: Likewise. * constexpr.cc: Likewise. * typeck.cc: Likewise. * name-lookup.cc: Likewise. * optimize.cc: Likewise. * search.cc: Likewise. * mapper-client.cc: Likewise. * ptree.cc: Likewise. * class.cc: Likewise. gcc/jit/ * docs/examples/tut04-toyvm/toyvm.cc: Remove trailing whitespace. gcc/lto/ * lto-object.cc: Remove trailing whitespace. * lto-symtab.cc: Likewise. * lto-partition.cc: Likewise. * lang-specs.h: Likewise. * lto-lang.cc: Likewise. gcc/objc/ * objc-encoding.cc: Remove trailing whitespace. * objc-map.h: Likewise. * objc-next-runtime-abi-01.cc: Likewise. * objc-act.cc: Likewise. * objc-map.cc: Likewise. gcc/objcp/ * objcp-decl.cc: Remove trailing whitespace. * objcp-lang.cc: Likewise. * objcp-decl.h: Likewise. gcc/rust/ * util/optional.h: Remove trailing whitespace. * util/expected.h: Likewise. * util/rust-unicode-data.h: Likewise. gcc/m2/ * mc-boot/GFpuIO.cc: Remove trailing whitespace. * mc-boot/GFIO.cc: Likewise. * mc-boot/GFormatStrings.cc: Likewise. * mc-boot/GCmdArgs.cc: Likewise. * mc-boot/GDebug.h: Likewise. * mc-boot/GM2Dependent.cc: Likewise. * mc-boot/GRTint.cc: Likewise. * mc-boot/GDebug.cc: Likewise. * mc-boot/GmcError.cc: Likewise. * mc-boot/Gmcp4.cc: Likewise. * mc-boot/GM2RTS.cc: Likewise. * mc-boot/GIO.cc: Likewise. * mc-boot/Gmcp5.cc: Likewise. * mc-boot/GDynamicStrings.cc: Likewise. * mc-boot/Gmcp1.cc: Likewise. * mc-boot/GFormatStrings.h: Likewise. * mc-boot/Gmcp2.cc: Likewise. * mc-boot/Gmcp3.cc: Likewise. * pge-boot/GFIO.cc: Likewise. * pge-boot/GDebug.h: Likewise. * pge-boot/GM2Dependent.cc: Likewise. * pge-boot/GDebug.cc: Likewise. * pge-boot/GM2RTS.cc: Likewise. * pge-boot/GSymbolKey.cc: Likewise. * pge-boot/GIO.cc: Likewise. * pge-boot/GIndexing.cc: Likewise. * pge-boot/GDynamicStrings.cc: Likewise. * pge-boot/GFormatStrings.h: Likewise. gcc/go/ * go-gcc.cc: Remove trailing whitespace. * gospec.cc: Likewise.
2024-10-24Use unique_ptr in more places in pretty_printer/diagnostics [PR116613]David Malcolm3-0/+3
My forthcoming patches for PR other/116613 make much more use of cloning of pretty_printers than before, so it makes sense as a preliminary patch for the result of pretty_printer::clone to be a std::unique_ptr, rather than add more manual uses of "delete". On doing so, I noticed various other places where naked new/delete is used for run-time configuration of diagnostics: * the output format (text vs SARIF) * client data hooks * the option manager * the URLifier Hence this patch also makes use of std::unique_ptr and ::make_unique for managing such client policy classes, and also for diagnostic_buffer's per-format implementations. Unfortunately we can't directly include <memory> in our internal headers but instead any of our TUs that make use of std::unique_ptr must #define INCLUDE_MEMORY before including system.h. Hence the bulk of this patch is taken up with adding a define of INCLUDE_MEMORY to hundreds of source files: everything that includes diagnostic.h or pretty-print.h (and thus anything transitively such as includers of lto-wrapper.h, c-tree.h, cp-tree.h and rtl-ssa.h). Thanks to Gaius Mulley for the parts of the patch that regenerated the m2 files. gcc/ada/ChangeLog: PR other/116613 * gcc-interface/misc.cc: Add #define INCLUDE_MEMORY * gcc-interface/trans.cc: Likewise. * gcc-interface/utils.cc: Likewise. gcc/analyzer/ChangeLog: PR other/116613 * analyzer-logging.cc: Add #define INCLUDE_MEMORY (logger::logger): Update for m_pp becoming a unique_ptr. (logger::~logger): Likewise. (logger::log_va_partial): Likewise. (logger::end_log_line): Likewise. * analyzer-logging.h (logger::get_printer): Likewise. (logger::m_pp): Convert to a unique_ptr. * analyzer.cc (make_label_text): Use diagnostic_context::clone_printer and use unique_ptr. (make_label_text_n): Likewise. * bar-chart.cc: Add #define INCLUDE_MEMORY * pending-diagnostic.cc (evdesc::event_desc::formatted_print): Use diagnostic_context::clone_printer and use unique_ptr. * sm-malloc.cc (sufficiently_similar_p): Likewise. * supergraph.cc (supergraph::dump_dot_to_file): Likewise. gcc/c-family/ChangeLog: PR other/116613 * c-ada-spec.cc: Add #define INCLUDE_MEMORY. * c-attribs.cc: Likewise. * c-common.cc: Likewise. * c-format.cc: Likewise. * c-gimplify.cc: Likewise. * c-indentation.cc: Likewise. * c-opts.cc: Likewise. * c-pch.cc: Likewise. * c-pragma.cc: Likewise. * c-pretty-print.cc: Likewise. Add #include "make-unique.h". (c_pretty_printer::clone): Use std::unique_ptr and ::make_unique. * c-pretty-print.h (c_pretty_printer::clone): Use std::unique_ptr. * c-type-mismatch.cc: Add #define INCLUDE_MEMORY. * c-warn.cc: Likewise. gcc/c/ChangeLog: PR other/116613 * c-aux-info.cc: Add #define INCLUDE_MEMORY. * c-convert.cc: Likewise. * c-errors.cc: Likewise. * c-fold.cc: Likewise. * c-lang.cc: Likewise. * c-objc-common.cc: Likewise. (pp_markup::element_quoted_type::print_type): Use unique_ptr. * c-typeck.cc: Add #define INCLUDE_MEMORY. * gimple-parser.cc: Likewise. gcc/cp/ChangeLog: PR other/116613 * call.cc: Add #define INCLUDE_MEMORY. * class.cc: Likewise. * constexpr.cc: Likewise. * constraint.cc: Likewise. * contracts.cc: Likewise. * coroutines.cc: Likewise. * cp-gimplify.cc: Likewise. * cp-lang.cc: Likewise. * cp-objcp-common.cc: Likewise. * cp-ubsan.cc: Likewise. * cvt.cc: Likewise. * cxx-pretty-print.cc: Likewise. Add #include "cp-tree.h". (cxx_pretty_printer::clone): Use std::unique_ptr and ::make_unique. * cxx-pretty-print.h (cxx_pretty_printer::clone): Use std::unique_ptr. * decl2.cc: Add #define INCLUDE_MEMORY. * dump.cc: Likewise. * except.cc: Likewise. * expr.cc: Likewise. * friend.cc: Likewise. * init.cc: Likewise. * lambda.cc: Likewise. * logic.cc: Likewise. * mangle.cc: Likewise. * method.cc: Likewise. * optimize.cc: Likewise. * pt.cc: Likewise. * ptree.cc: Likewise. * rtti.cc: Likewise. * search.cc: Likewise. * semantics.cc: Likewise. * tree.cc: Likewise. * typeck.cc: Likewise. * typeck2.cc: Likewise. * vtable-class-hierarchy.cc: Likewise. gcc/d/ChangeLog: PR other/116613 * d-attribs.cc: Add #define INCLUDE_MEMORY. * d-builtins.cc: Likewise. * d-codegen.cc: Likewise. * d-convert.cc: Likewise. * d-diagnostic.cc: Likewise. * d-frontend.cc: Likewise. * d-lang.cc: Likewise. * d-longdouble.cc: Likewise. * d-target.cc: Likewise. * decl.cc: Likewise. * expr.cc: Likewise. * intrinsics.cc: Likewise. * modules.cc: Likewise. * toir.cc: Likewise. * typeinfo.cc: Likewise. * types.cc: Likewise. gcc/fortran/ChangeLog: PR other/116613 * arith.cc: Add #define INCLUDE_MEMORY. * array.cc: Likewise. * bbt.cc: Likewise. * check.cc: Likewise. * class.cc: Likewise. * constructor.cc: Likewise. * convert.cc: Likewise. * cpp.cc: Likewise. * data.cc: Likewise. * decl.cc: Likewise. * dependency.cc: Likewise. * dump-parse-tree.cc: Likewise. * error.cc: Likewise. * expr.cc: Likewise. * f95-lang.cc: Likewise. * frontend-passes.cc: Likewise. * interface.cc: Likewise. * intrinsic.cc: Likewise. * io.cc: Likewise. * iresolve.cc: Likewise. * match.cc: Likewise. * matchexp.cc: Likewise. * misc.cc: Likewise. * module.cc: Likewise. * openmp.cc: Likewise. * options.cc: Likewise. * parse.cc: Likewise. * primary.cc: Likewise. * resolve.cc: Likewise. * scanner.cc: Likewise. * simplify.cc: Likewise. * st.cc: Likewise. * symbol.cc: Likewise. * target-memory.cc: Likewise. * trans-array.cc: Likewise. * trans-common.cc: Likewise. * trans-const.cc: Likewise. * trans-decl.cc: Likewise. * trans-expr.cc: Likewise. * trans-intrinsic.cc: Likewise. * trans-io.cc: Likewise. * trans-openmp.cc: Likewise. * trans-stmt.cc: Likewise. * trans-types.cc: Likewise. * trans.cc: Likewise. gcc/go/ChangeLog: PR other/116613 * go-backend.cc: Add #define INCLUDE_MEMORY. * go-lang.cc: Likewise. gcc/jit/ChangeLog: PR other/116613 * dummy-frontend.cc: Add #define INCLUDE_MEMORY. * jit-playback.cc: Likewise. * jit-recording.cc: Likewise. gcc/lto/ChangeLog: PR other/116613 * lto-common.cc: Add #define INCLUDE_MEMORY. * lto-dump.cc: Likewise. * lto-partition.cc: Likewise. * lto-symtab.cc: Likewise. * lto.cc: Likewise. gcc/m2/ChangeLog: PR other/116613 * gm2-gcc/gcc-consolidation.h: Add #define INCLUDE_MEMORY. * gm2-gcc/m2configure.cc: Likewise. * mc-boot/GASCII.cc: Regenerate. * mc-boot/GASCII.h: Ditto. * mc-boot/GArgs.cc: Ditto. * mc-boot/GArgs.h: Ditto. * mc-boot/GAssertion.cc: Ditto. * mc-boot/GAssertion.h: Ditto. * mc-boot/GBreak.cc: Ditto. * mc-boot/GBreak.h: Ditto. * mc-boot/GCOROUTINES.h: Ditto. * mc-boot/GCmdArgs.cc: Ditto. * mc-boot/GCmdArgs.h: Ditto. * mc-boot/GDebug.cc: Ditto. * mc-boot/GDebug.h: Ditto. * mc-boot/GDynamicStrings.cc: Ditto. * mc-boot/GDynamicStrings.h: Ditto. * mc-boot/GEnvironment.cc: Ditto. * mc-boot/GEnvironment.h: Ditto. * mc-boot/GFIO.cc: Ditto. * mc-boot/GFIO.h: Ditto. * mc-boot/GFormatStrings.cc: Ditto. * mc-boot/GFormatStrings.h: Ditto. * mc-boot/GFpuIO.cc: Ditto. * mc-boot/GFpuIO.h: Ditto. * mc-boot/GIO.cc: Ditto. * mc-boot/GIO.h: Ditto. * mc-boot/GIndexing.cc: Ditto. * mc-boot/GIndexing.h: Ditto. * mc-boot/GM2Dependent.cc: Ditto. * mc-boot/GM2Dependent.h: Ditto. * mc-boot/GM2EXCEPTION.cc: Ditto. * mc-boot/GM2EXCEPTION.h: Ditto. * mc-boot/GM2RTS.cc: Ditto. * mc-boot/GM2RTS.h: Ditto. * mc-boot/GMemUtils.cc: Ditto. * mc-boot/GMemUtils.h: Ditto. * mc-boot/GNumberIO.cc: Ditto. * mc-boot/GNumberIO.h: Ditto. * mc-boot/GPushBackInput.cc: Ditto. * mc-boot/GPushBackInput.h: Ditto. * mc-boot/GRTExceptions.cc: Ditto. * mc-boot/GRTExceptions.h: Ditto. * mc-boot/GRTco.h: Ditto. * mc-boot/GRTentity.h: Ditto. * mc-boot/GRTint.cc: Ditto. * mc-boot/GRTint.h: Ditto. * mc-boot/GSArgs.cc: Ditto. * mc-boot/GSArgs.h: Ditto. * mc-boot/GSFIO.cc: Ditto. * mc-boot/GSFIO.h: Ditto. * mc-boot/GSYSTEM.h: Ditto. * mc-boot/GSelective.h: Ditto. * mc-boot/GStdIO.cc: Ditto. * mc-boot/GStdIO.h: Ditto. * mc-boot/GStorage.cc: Ditto. * mc-boot/GStorage.h: Ditto. * mc-boot/GStrCase.cc: Ditto. * mc-boot/GStrCase.h: Ditto. * mc-boot/GStrIO.cc: Ditto. * mc-boot/GStrIO.h: Ditto. * mc-boot/GStrLib.cc: Ditto. * mc-boot/GStrLib.h: Ditto. * mc-boot/GStringConvert.cc: Ditto. * mc-boot/GStringConvert.h: Ditto. * mc-boot/GSysExceptions.h: Ditto. * mc-boot/GSysStorage.cc: Ditto. * mc-boot/GSysStorage.h: Ditto. * mc-boot/GTimeString.cc: Ditto. * mc-boot/GTimeString.h: Ditto. * mc-boot/GUnixArgs.h: Ditto. * mc-boot/Galists.cc: Ditto. * mc-boot/Galists.h: Ditto. * mc-boot/Gdecl.cc: Ditto. * mc-boot/Gdecl.h: Ditto. * mc-boot/Gdtoa.h: Ditto. * mc-boot/Gerrno.h: Ditto. * mc-boot/Gkeyc.cc: Ditto. * mc-boot/Gkeyc.h: Ditto. * mc-boot/Gldtoa.h: Ditto. * mc-boot/Glibc.h: Ditto. * mc-boot/Glibm.h: Ditto. * mc-boot/Glists.cc: Ditto. * mc-boot/Glists.h: Ditto. * mc-boot/GmcComment.cc: Ditto. * mc-boot/GmcComment.h: Ditto. * mc-boot/GmcComp.cc: Ditto. * mc-boot/GmcComp.h: Ditto. * mc-boot/GmcDebug.cc: Ditto. * mc-boot/GmcDebug.h: Ditto. * mc-boot/GmcError.cc: Ditto. * mc-boot/GmcError.h: Ditto. * mc-boot/GmcFileName.cc: Ditto. * mc-boot/GmcFileName.h: Ditto. * mc-boot/GmcLexBuf.cc: Ditto. * mc-boot/GmcLexBuf.h: Ditto. * mc-boot/GmcMetaError.cc: Ditto. * mc-boot/GmcMetaError.h: Ditto. * mc-boot/GmcOptions.cc: Ditto. * mc-boot/GmcOptions.h: Ditto. * mc-boot/GmcPreprocess.cc: Ditto. * mc-boot/GmcPreprocess.h: Ditto. * mc-boot/GmcPretty.cc: Ditto. * mc-boot/GmcPretty.h: Ditto. * mc-boot/GmcPrintf.cc: Ditto. * mc-boot/GmcPrintf.h: Ditto. * mc-boot/GmcQuiet.cc: Ditto. * mc-boot/GmcQuiet.h: Ditto. * mc-boot/GmcReserved.cc: Ditto. * mc-boot/GmcReserved.h: Ditto. * mc-boot/GmcSearch.cc: Ditto. * mc-boot/GmcSearch.h: Ditto. * mc-boot/GmcStack.cc: Ditto. * mc-boot/GmcStack.h: Ditto. * mc-boot/GmcStream.cc: Ditto. * mc-boot/GmcStream.h: Ditto. * mc-boot/Gmcflex.h: Ditto. * mc-boot/Gmcp1.cc: Ditto. * mc-boot/Gmcp1.h: Ditto. * mc-boot/Gmcp2.cc: Ditto. * mc-boot/Gmcp2.h: Ditto. * mc-boot/Gmcp3.cc: Ditto. * mc-boot/Gmcp3.h: Ditto. * mc-boot/Gmcp4.cc: Ditto. * mc-boot/Gmcp4.h: Ditto. * mc-boot/Gmcp5.cc: Ditto. * mc-boot/Gmcp5.h: Ditto. * mc-boot/GnameKey.cc: Ditto. * mc-boot/GnameKey.h: Ditto. * mc-boot/GsymbolKey.cc: Ditto. * mc-boot/GsymbolKey.h: Ditto. * mc-boot/Gtermios.h: Ditto. * mc-boot/Gtop.cc: Ditto. * mc-boot/Gvarargs.cc: Ditto. * mc-boot/Gvarargs.h: Ditto. * mc-boot/Gwlists.cc: Ditto. * mc-boot/Gwlists.h: Ditto. * mc-boot/Gwrapc.h: Ditto. * mc/keyc.mod (checkGccConfigSystem): Add #define INCLUDE_MEMORY. * pge-boot/GASCII.cc: Regenerate. * pge-boot/GASCII.h: Ditto. * pge-boot/GArgs.cc: Ditto. * pge-boot/GArgs.h: Ditto. * pge-boot/GAssertion.cc: Ditto. * pge-boot/GAssertion.h: Ditto. * pge-boot/GBreak.h: Ditto. * pge-boot/GCmdArgs.h: Ditto. * pge-boot/GDebug.cc: Ditto. * pge-boot/GDebug.h: Ditto. * pge-boot/GDynamicStrings.cc: Ditto. * pge-boot/GDynamicStrings.h: Ditto. * pge-boot/GEnvironment.h: Ditto. * pge-boot/GFIO.cc: Ditto. * pge-boot/GFIO.h: Ditto. * pge-boot/GFormatStrings.h: Ditto. * pge-boot/GFpuIO.h: Ditto. * pge-boot/GIO.cc: Ditto. * pge-boot/GIO.h: Ditto. * pge-boot/GIndexing.cc: Ditto. * pge-boot/GIndexing.h: Ditto. * pge-boot/GLists.cc: Ditto. * pge-boot/GLists.h: Ditto. * pge-boot/GM2Dependent.cc: Ditto. * pge-boot/GM2Dependent.h: Ditto. * pge-boot/GM2EXCEPTION.cc: Ditto. * pge-boot/GM2EXCEPTION.h: Ditto. * pge-boot/GM2RTS.cc: Ditto. * pge-boot/GM2RTS.h: Ditto. * pge-boot/GNameKey.cc: Ditto. * pge-boot/GNameKey.h: Ditto. * pge-boot/GNumberIO.cc: Ditto. * pge-boot/GNumberIO.h: Ditto. * pge-boot/GOutput.cc: Ditto. * pge-boot/GOutput.h: Ditto. * pge-boot/GPushBackInput.cc: Ditto. * pge-boot/GPushBackInput.h: Ditto. * pge-boot/GRTExceptions.cc: Ditto. * pge-boot/GRTExceptions.h: Ditto. * pge-boot/GSArgs.h: Ditto. * pge-boot/GSEnvironment.h: Ditto. * pge-boot/GSFIO.cc: Ditto. * pge-boot/GSFIO.h: Ditto. * pge-boot/GSYSTEM.h: Ditto. * pge-boot/GScan.h: Ditto. * pge-boot/GStdIO.cc: Ditto. * pge-boot/GStdIO.h: Ditto. * pge-boot/GStorage.cc: Ditto. * pge-boot/GStorage.h: Ditto. * pge-boot/GStrCase.cc: Ditto. * pge-boot/GStrCase.h: Ditto. * pge-boot/GStrIO.cc: Ditto. * pge-boot/GStrIO.h: Ditto. * pge-boot/GStrLib.cc: Ditto. * pge-boot/GStrLib.h: Ditto. * pge-boot/GStringConvert.h: Ditto. * pge-boot/GSymbolKey.cc: Ditto. * pge-boot/GSymbolKey.h: Ditto. * pge-boot/GSysExceptions.h: Ditto. * pge-boot/GSysStorage.cc: Ditto. * pge-boot/GSysStorage.h: Ditto. * pge-boot/GTimeString.h: Ditto. * pge-boot/GUnixArgs.h: Ditto. * pge-boot/Gbnflex.cc: Ditto. * pge-boot/Gbnflex.h: Ditto. * pge-boot/Gdtoa.h: Ditto. * pge-boot/Gerrno.h: Ditto. * pge-boot/Gldtoa.h: Ditto. * pge-boot/Glibc.h: Ditto. * pge-boot/Glibm.h: Ditto. * pge-boot/Gpge.cc: Ditto. * pge-boot/Gtermios.h: Ditto. * pge-boot/Gwrapc.h: Ditto. gcc/objc/ChangeLog: PR other/116613 * objc-act.cc: Add #define INCLUDE_MEMORY. * objc-encoding.cc: Likewise. * objc-gnu-runtime-abi-01.cc: Likewise. * objc-lang.cc: Likewise. * objc-next-runtime-abi-01.cc: Likewise. * objc-next-runtime-abi-02.cc: Likewise. * objc-runtime-shared-support.cc: Likewise. gcc/objcp/ChangeLog:: Add #define INCLUDE_MEMORY. PR other/116613 * objcp-decl.cc * objcp-lang.cc: Likewise. gcc/rust/ChangeLog: PR other/116613 * resolve/rust-ast-resolve-expr.cc: Add #define INCLUDE_MEMORY. * rust-attribs.cc: Likewise. * rust-system.h: Likewise. gcc/ChangeLog: PR other/116613 * asan.cc: Add #define INCLUDE_MEMORY. * attribs.cc: Likewise. (attr_access::array_as_string): Use diagnostic_context::clone_printer and use unique_ptr. * auto-profile.cc: Add #define INCLUDE_MEMORY. * calls.cc: Likewise. * cfganal.cc: Likewise. * cfgexpand.cc: Likewise. * cfghooks.cc: Likewise. * cfgloop.cc: Likewise. * cgraph.cc: Likewise. * cgraphclones.cc: Likewise. * cgraphunit.cc: Likewise. * collect-utils.cc: Likewise. * collect2.cc: Likewise. * common/config/aarch64/aarch64-common.cc: Likewise. * common/config/arm/arm-common.cc: Likewise. * common/config/avr/avr-common.cc: Likewise. * config/aarch64/aarch64-cc-fusion.cc: Likewise. * config/aarch64/aarch64-early-ra.cc: Likewise. * config/aarch64/aarch64-sve-builtins.cc: Likewise. * config/arc/arc.cc: Likewise. * config/arm/aarch-common.cc: Likewise. * config/arm/arm-mve-builtins.cc: Likewise. * config/avr/avr-devices.cc: Likewise. * config/avr/driver-avr.cc: Likewise. * config/bpf/bpf.cc: Likewise. * config/bpf/btfext-out.cc: Likewise. * config/bpf/core-builtins.cc: Likewise. * config/darwin.cc: Likewise. * config/i386/driver-i386.cc: Likewise. * config/i386/i386-builtins.cc: Likewise. * config/i386/i386-expand.cc: Likewise. * config/i386/i386-features.cc: Likewise. * config/i386/i386-options.cc: Likewise. * config/loongarch/loongarch-builtins.cc: Likewise. * config/mingw/winnt-cxx.cc: Likewise. * config/mingw/winnt.cc: Likewise. * config/mips/mips.cc: Likewise. * config/msp430/driver-msp430.cc: Likewise. * config/nvptx/mkoffload.cc: Likewise. * config/nvptx/nvptx.cc: Likewise. * config/riscv/riscv-avlprop.cc: Likewise. * config/riscv/riscv-vector-builtins.cc: Likewise. * config/riscv/riscv-vsetvl.cc: Likewise. * config/rs6000/driver-rs6000.cc: Likewise. * config/rs6000/host-darwin.cc: Likewise. * config/rs6000/rs6000-c.cc: Likewise. * config/s390/s390-c.cc: Likewise. * config/s390/s390.cc: Likewise. * config/sol2-cxx.cc: Likewise. * config/vms/vms-c.cc: Likewise. * config/xtensa/xtensa-dynconfig.cc: Likewise. * coroutine-passes.cc: Likewise. * coverage.cc: Likewise. * data-streamer-in.cc: Likewise. * data-streamer-out.cc: Likewise. * data-streamer.cc: Likewise. * diagnostic-buffer.h (diagnostic_buffer::~diagnostic_buffer): Delete. (diagnostic_buffer::m_per_format_buffer): Use std::unique_ptr. * diagnostic-client-data-hooks.h (make_compiler_data_hooks): Use std::unique_ptr for return type. * diagnostic-format-json.cc (json_output_format::make_per_format_buffer): Likewise. (diagnostic_output_format_init_json): Update for usage of std::unique_ptr in set_output_format. * diagnostic-format-sarif.cc (sarif_output_format::make_per_format_buffer): Use std::unique_ptr for return type. (diagnostic_output_format_init_sarif): Update for usage of std::unique_ptr. (test_message_with_embedded_link): Likewise for set_urlifier. * diagnostic-format-text.cc: Add #define INCLUDE_MEMORY. Include "make-unique.h". (diagnostic_text_output_format::set_buffer): Use std::unique_ptr. * diagnostic-format-text.h (diagnostic_text_output_format::set_buffer): Likewise. * diagnostic-format.h (diagnostic_output_format::make_per_format_buffer): Likewise. * diagnostic-global-context.cc: * diagnostic-macro-unwinding.cc: Likewise. * diagnostic-show-locus.cc: Likewise. * diagnostic-spec.cc: Likewise. * diagnostic.cc (diagnostic_context::set_output_format): Use std::unique_ptr for input. (diagnostic_context::set_client_data_hooks): Likewise. (diagnostic_context::set_option_manager): Likewise. (diagnostic_context::set_urlifier): Likewise. (diagnostic_context::set_diagnostic_buffer): Update for use of std::unique_ptr. (diagnostic_buffer::diagnostic_buffer): Likewise. (diagnostic_buffer::~diagnostic_buffer): Delete. * diagnostic.h: Complain if INCLUDE_MEMORY was not defined. (diagnostic_context::set_output_format): Use std::unique_ptr for input. (diagnostic_context::set_client_data_hooks): Likewise. (diagnostic_context::set_option_manager): Likewise. (diagnostic_context::set_urlifier): Likewise. (diagnostic_context::clone_printer): New. (diagnostic_context::m_printer): Update comment. (diagnostic_context::m_option_mgr): Likewise. (diagnostic_context::m_urlifier): Likewise. (diagnostic_context::m_edit_context_ptr): Likewise. (diagnostic_context::m_output_format): Likewise. (diagnostic_context::m_client_data_hooks): Likewise. (diagnostic_context::m_theme): Likewise. * digraph.cc: Add #define INCLUDE_MEMORY. * dwarf2out.cc: Likewise. * edit-context.cc: Likewise. * except.cc: Likewise. * expr.cc: Likewise. * file-prefix-map.cc: Likewise. * final.cc: Likewise. * fwprop.cc: Likewise. * gcc-plugin.h: Likewise. * gcc-rich-location.cc: Likewise. * gcc-urlifier.cc: Likewise. Add #include "make-unique.h". (make_gcc_urlifier): Use std::unique_ptr and ::make_unique. * gcc-urlifier.h (make_gcc_urlifier): Use std::unique_ptr. * gcc.cc: Add #define INCLUDE_MEMORY. Include "pretty-print-urlifier.h". * gcov-dump.cc: Add #define INCLUDE_MEMORY. * gcov-tool.cc: Likewise. * gengtype.cc (open_base_files): Likewise to output. * genmatch.cc: Likewise. * gimple-fold.cc: Likewise. * gimple-harden-conditionals.cc: Likewise. * gimple-harden-control-flow.cc: Likewise. * gimple-if-to-switch.cc: Likewise. * gimple-lower-bitint.cc: Likewise. * gimple-predicate-analysis.cc: Likewise. * gimple-pretty-print.cc: Likewise. * gimple-range-cache.cc: Likewise. * gimple-range-edge.cc: Likewise. * gimple-range-fold.cc: Likewise. * gimple-range-gori.cc: Likewise. * gimple-range-infer.cc: Likewise. * gimple-range-op.cc: Likewise. * gimple-range-path.cc: Likewise. * gimple-range-phi.cc: Likewise. * gimple-range-trace.cc: Likewise. * gimple-range.cc: Likewise. * gimple-ssa-backprop.cc: Likewise. * gimple-ssa-sprintf.cc: Likewise. * gimple-ssa-store-merging.cc: Likewise. * gimple-ssa-strength-reduction.cc: Likewise. * gimple-ssa-warn-access.cc: Likewise. * gimple-ssa-warn-alloca.cc: Likewise. * gimple-ssa-warn-restrict.cc: Likewise. * gimple-streamer-in.cc: Likewise. * gimple-streamer-out.cc: Likewise. * gimple.cc: Likewise. * gimplify.cc: Likewise. * graph.cc: Likewise. * graphviz.cc: Likewise. * input.cc: Likewise. * ipa-cp.cc: Likewise. * ipa-devirt.cc: Likewise. * ipa-fnsummary.cc: Likewise. * ipa-free-lang-data.cc: Likewise. * ipa-icf-gimple.cc: Likewise. * ipa-icf.cc: Likewise. * ipa-inline-analysis.cc: Likewise. * ipa-inline.cc: Likewise. * ipa-modref-tree.cc: Likewise. * ipa-modref.cc: Likewise. * ipa-param-manipulation.cc: Likewise. * ipa-polymorphic-call.cc: Likewise. * ipa-predicate.cc: Likewise. * ipa-profile.cc: Likewise. * ipa-prop.cc: Likewise. * ipa-pure-const.cc: Likewise. * ipa-reference.cc: Likewise. * ipa-split.cc: Likewise. * ipa-sra.cc: Likewise. * ipa-strub.cc: Likewise. * ipa-utils.cc: Likewise. * langhooks.cc: Likewise. * late-combine.cc: Likewise. * lto-cgraph.cc: Likewise. * lto-compress.cc: Likewise. * lto-opts.cc: Likewise. * lto-section-in.cc: Likewise. * lto-section-out.cc: Likewise. * lto-streamer-in.cc: Likewise. * lto-streamer-out.cc: Likewise. * lto-streamer.cc: Likewise. * lto-wrapper.cc: Likewise. Include "make-unique.h". (main): Use ::make_unique when creating option manager. * multiple_target.cc: Likewise. * omp-expand.cc: Likewise. * omp-general.cc: Likewise. * omp-low.cc: Likewise. * omp-oacc-neuter-broadcast.cc: Likewise. * omp-offload.cc: Likewise. * omp-simd-clone.cc: Likewise. * optc-gen.awk: Likewise in output. * optc-save-gen.awk: Likewise in output. * options-urls-cc-gen.awk: Likewise in output. * opts-common.cc: Likewise. * opts-global.cc: Likewise. * opts.cc: Likewise. * pair-fusion.cc: Likewise. * passes.cc: Likewise. * pointer-query.cc: Likewise. * predict.cc: Likewise. * pretty-print.cc (pretty_printer::clone): Use std::unique_ptr and ::make_unique. * pretty-print.h: Complain if INCLUDE_MEMORY is not defined. (pretty_printer::clone): Use std::unique_ptr. * print-rtl.cc: Add #define INCLUDE_MEMORY. * print-tree.cc: Likewise. * profile-count.cc: Likewise. * range-op-float.cc: Likewise. * range-op-ptr.cc: Likewise. * range-op.cc: Likewise. * range.cc: Likewise. * read-rtl-function.cc: Likewise. * rtl-error.cc: Likewise. * rtl-ssa/accesses.cc: Likewise. * rtl-ssa/blocks.cc: Likewise. * rtl-ssa/changes.cc: Likewise. * rtl-ssa/functions.cc: Likewise. * rtl-ssa/insns.cc: Likewise. * rtl-ssa/movement.cc: Likewise. * rtl-tests.cc: Likewise. * sanopt.cc: Likewise. * sched-rgn.cc: Likewise. * selftest-diagnostic-path.cc: Likewise. * selftest-diagnostic.cc: Likewise. * splay-tree-utils.cc: Likewise. * sreal.cc: Likewise. * stmt.cc: Likewise. * substring-locations.cc: Likewise. * symtab-clones.cc: Likewise. * symtab-thunks.cc: Likewise. * symtab.cc: Likewise. * text-art/box-drawing.cc: Likewise. * text-art/canvas.cc: Likewise. * text-art/ruler.cc: Likewise. * text-art/selftests.cc: Likewise. * text-art/theme.cc: Likewise. * toplev.cc: Likewise. Include "make-unique.h". (general_init): Use ::make_unique when setting option_manager. * trans-mem.cc: Add #define INCLUDE_MEMORY. * tree-affine.cc: Likewise. * tree-call-cdce.cc: Likewise. * tree-cfg.cc: Likewise. * tree-chrec.cc: Likewise. * tree-dfa.cc: Likewise. * tree-diagnostic-client-data-hooks.cc: Include "make-unique.h". (make_compiler_data_hooks): Use std::unique_ptr and ::make_unique. * tree-diagnostic.cc: Add #define INCLUDE_MEMORY. * tree-dump.cc: Likewise. * tree-inline.cc: Likewise. * tree-into-ssa.cc: Likewise. * tree-logical-location.cc: Likewise. * tree-nested.cc: Likewise. * tree-nrv.cc: Likewise. * tree-object-size.cc: Likewise. * tree-outof-ssa.cc: Likewise. * tree-pretty-print.cc: Likewise. * tree-profile.cc: Likewise. * tree-scalar-evolution.cc: Likewise. * tree-sra.cc: Likewise. * tree-ssa-address.cc: Likewise. * tree-ssa-alias.cc: Likewise. * tree-ssa-ccp.cc: Likewise. * tree-ssa-coalesce.cc: Likewise. * tree-ssa-copy.cc: Likewise. * tree-ssa-dce.cc: Likewise. * tree-ssa-dom.cc: Likewise. * tree-ssa-forwprop.cc: Likewise. * tree-ssa-ifcombine.cc: Likewise. * tree-ssa-loop-ch.cc: Likewise. * tree-ssa-loop-im.cc: Likewise. * tree-ssa-loop-manip.cc: Likewise. * tree-ssa-loop-niter.cc: Likewise. * tree-ssa-loop-split.cc: Likewise. * tree-ssa-math-opts.cc: Likewise. * tree-ssa-operands.cc: Likewise. * tree-ssa-phiprop.cc: Likewise. * tree-ssa-pre.cc: Likewise. * tree-ssa-propagate.cc: Likewise. * tree-ssa-reassoc.cc: Likewise. * tree-ssa-sccvn.cc: Likewise. * tree-ssa-scopedtables.cc: Likewise. * tree-ssa-sink.cc: Likewise. * tree-ssa-strlen.cc: Likewise. * tree-ssa-structalias.cc: Likewise. * tree-ssa-ter.cc: Likewise. * tree-ssa-uninit.cc: Likewise. * tree-ssa.cc: Likewise. * tree-ssanames.cc: Likewise. * tree-stdarg.cc: Likewise. * tree-streamer-in.cc: Likewise. * tree-streamer-out.cc: Likewise. * tree-streamer.cc: Likewise. * tree-switch-conversion.cc: Likewise. * tree-tailcall.cc: Likewise. * tree-vrp.cc: Likewise. * tree.cc: Likewise. * ubsan.cc: Likewise. * value-pointer-equiv.cc: Likewise. * value-prof.cc: Likewise. * value-query.cc: Likewise. * value-range-pretty-print.cc: Likewise. * value-range-storage.cc: Likewise. * value-range.cc: Likewise. * value-relation.cc: Likewise. * var-tracking.cc: Likewise. * varpool.cc: Likewise. * vr-values.cc: Likewise. * wide-int-print.cc: Likewise. gcc/testsuite/ChangeLog: PR other/116613 * gcc.dg/plugin/diagnostic_group_plugin.c: Update for use of std::unique_ptr. * gcc.dg/plugin/diagnostic_plugin_xhtml_format.c: Likewise. * gcc.dg/plugin/ggcplug.c: Likewise. libgcc/ChangeLog: PR other/116613 * libgcov-util.c: Add #define INCLUDE_MEMORY. Signed-off-by: David Malcolm <dmalcolm@redhat.com> Co-authored-by: Gaius Mulley <gaiusmod2@gmail.com> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-10-23[PATCH] RISC-V: override alignment of function/jump/loopWang Pengcheng1-0/+15
Just like what AArch64 has done. Signed-off-by: Wang Pengcheng <wangpengcheng.pp@bytedance.com> gcc/ChangeLog: * config/riscv/riscv.cc (struct riscv_tune_param): Add new tune options. (riscv_override_options_internal): Override the default alignment when not optimizing for size.
2024-10-21RISC-V: Implement vector SAT_TRUNC for signed integerPan Li3-0/+84
This patch would like to implement the sstrunc for vector signed integer. Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline)) \ vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ WT x = in[i]; \ NT trunc = (NT)x; \ out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX \ ? trunc \ : x < 0 ? NT_MIN : NT_MAX; \ } \ } DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX) Before this patch: 27 │ vsetvli a5,a2,e64,m1,ta,ma 28 │ vle64.v v1,0(a1) 29 │ slli a3,a5,3 30 │ slli a4,a5,2 31 │ sub a2,a2,a5 32 │ add a1,a1,a3 33 │ vadd.vv v0,v1,v5 34 │ vsetvli zero,zero,e32,mf2,ta,ma 35 │ vnsrl.wx v2,v1,a6 36 │ vncvt.x.x.w v1,v1 37 │ vsetvli zero,zero,e64,m1,ta,ma 38 │ vmsgtu.vv v0,v0,v4 39 │ vsetvli zero,zero,e32,mf2,ta,mu 40 │ vneg.v v2,v2 41 │ vxor.vv v1,v2,v3,v0.t 42 │ vse32.v v1,0(a0) 43 │ add a0,a0,a4 44 │ bne a2,zero,.L3 After this patch: 16 │ vsetvli a5,a2,e32,mf2,ta,ma 17 │ vle64.v v1,0(a1) 18 │ slli a3,a5,3 19 │ slli a4,a5,2 20 │ sub a2,a2,a5 21 │ add a1,a1,a3 22 │ vnclip.wi v1,v1,0 23 │ vse32.v v1,0(a0) 24 │ add a0,a0,a4 25 │ bne a2,zero,.L3 The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/ChangeLog: * config/riscv/autovec.md (sstrunc<mode><v_double_trunc>2): Add new pattern sstrunc for double trunc. (sstrunc<mode><v_quad_trunc>2): Ditto but for quad trunc. (sstrunc<mode><v_oct_trunc>2): Ditto but for oct trunc. * config/riscv/riscv-protos.h (expand_vec_double_sstrunc): Add new func decl to expand double trunc. (expand_vec_quad_sstrunc): Ditto but for quad trunc. (expand_vec_oct_sstrunc): Ditto but for oct trunc. * config/riscv/riscv-v.cc (expand_vec_double_sstrunc): Add new func to expand double trunc. (expand_vec_quad_sstrunc): Ditto but for quad trunc. (expand_vec_oct_sstrunc): Ditto but for oct trunc. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-20Revert "[PATCH 7/7] RISC-V: Disable by pieces for vector setmem length > ↵Jeff Law1-19/+0
UNITS_PER_WORD" This reverts commit 72ceddbfb78dbb95f0808c3eca1765e8cd48b023.
2024-10-19[PATCH][v5] RISC-V: add option -m(no-)autovec-segmentGreg McGary3-2/+11
Add option -m(no-)autovec-segment to enable/disable autovectorizer from emitting vector segment load/store instructions. This is useful for performance experiments. gcc/ChangeLog: * config/riscv/autovec.md (vec_mask_len_load_lanes, vec_mask_len_store_lanes): Predicate with TARGET_VECTOR_AUTOVEC_SEGMENT * config/riscv/riscv-opts.h (TARGET_VECTOR_AUTOVEC_SEGMENT): New macro. * config/riscv/riscv.opt (-m(no-)autovec-segment): New option. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-1.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-2.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-3.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-4.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-5.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-6.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-7.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-1.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-2.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-3.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-4.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-5.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-6.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-7.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-1.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-2.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-3.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-4.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-5.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-6.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-7.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-1.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-2.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-3.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-4.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-5.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-6.c: New test. * gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-7.c: New test. * gcc.target/riscv/rvv/autovec/no-segment.c: New test.
2024-10-19[PATCH 7/7] RISC-V: Disable by pieces for vector setmem length > UNITS_PER_WORDCraig Blackmore1-0/+19
For fast unaligned access targets, by pieces uses up to UNITS_PER_WORD size pieces resulting in more store instructions than needed. For example gcc.target/riscv/rvv/base/setmem-1.c:f1 built with `-O3 -march=rv64gcv -mtune=thead-c906`: ``` f1: vsetivli zero,8,e8,mf2,ta,ma vmv.v.x v1,a1 vsetivli zero,0,e32,mf2,ta,ma sb a1,14(a0) vmv.x.s a4,v1 vsetivli zero,8,e16,m1,ta,ma vmv.x.s a5,v1 vse8.v v1,0(a0) sw a4,8(a0) sh a5,12(a0) ret ``` The slow unaligned access version built with `-O3 -march=rv64gcv` used 15 sb instructions: ``` f1: sb a1,0(a0) sb a1,1(a0) sb a1,2(a0) sb a1,3(a0) sb a1,4(a0) sb a1,5(a0) sb a1,6(a0) sb a1,7(a0) sb a1,8(a0) sb a1,9(a0) sb a1,10(a0) sb a1,11(a0) sb a1,12(a0) sb a1,13(a0) sb a1,14(a0) ret ``` After this patch, the following is generated in both cases: ``` f1: vsetivli zero,15,e8,m1,ta,ma vmv.v.x v1,a1 vse8.v v1,0(a0) ret ``` gcc/ChangeLog: * config/riscv/riscv.cc (riscv_use_by_pieces_infrastructure_p): New function. (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Define. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr113469.c: Expect mf2 setmem. * gcc.target/riscv/rvv/base/setmem-2.c: Update f1 to expect straight-line vector memset. * gcc.target/riscv/rvv/base/setmem-3.c: Likewise.
2024-10-19[PATCH 5/7] RISC-V: Move vector memcpy decision making to separate function ↵Craig Blackmore1-56/+87
[NFC] This moves the code for deciding whether to generate a vectorized memcpy, what vector mode to use and whether a loop is needed out of riscv_vector::expand_block_move and into a new function riscv_vector::use_stringop_p so that it can be reused for other string operations. gcc/ChangeLog: * config/riscv/riscv-string.cc (struct stringop_info): New. (expand_block_move): Move decision making code to... (use_vector_stringop_p): ...here.
2024-10-19[PATCH 4/7] RISC-V: Honour -mrvv-max-lmul in riscv_vector::expand_block_moveCraig Blackmore4-38/+54
Unlike the other vector string ops, expand_block_move was using max LMUL m8 regardless of TARGET_MAX_LMUL. The check for whether to generate inline vector code for movmem has been moved from movmem<mode> to riscv_vector::expand_block_move to avoid maintaining multiple versions of similar logic. They already differed on the minimum length for which they would generate vector code. Now that the expand_block_move value is used, movmem will be generated for smaller lengths. Limiting memcpy to m1 caused some memcpy loops to be generated in the calling convention tests which makes it awkward to add suitable scan assembler tests checking the return value being set, so -mrvv-max-lmul=m8 has been added to these tests. Other tests have been adjusted to expect the new memcpy m1 generation where reasonably straight-forward, otherwise -mrvv-max-lmul=m8 has been added. pr111720-[0-9].c regressed because a memcpy loop is generated instead of straight-line. This reveals an existing issue where a redundant straight-line memcpy gets eliminated but a memcpy loop does not (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117205). For example, on pr111720-0.c after this patch: -mrvv-max-lmul=m8: test: lui a5,%hi(.LANCHOR0) li a4,32 addi sp,sp,-32 addi a5,a5,%lo(.LANCHOR0) vsetvli zero,a4,e8,m1,ta,ma vle8.v v8,0(a5) addi sp,sp,32 jr ra -mrvv-max-lmul=m1: test: addi sp,sp,-32 lui a5,%hi(.LANCHOR0) addi a5,a5,%lo(.LANCHOR0) mv a2,sp li a3,32 .L2: vsetvli a4,a3,e8,m1,ta,ma vle8.v v8,0(a5) sub a3,a3,a4 add a5,a5,a4 vse8.v v8,0(a2) add a2,a2,a4 bne a3,zero,.L2 li a5,32 vsetvli zero,a5,e8,m1,ta,ma vle8.v v8,0(sp) addi sp,sp,32 jr ra I have added -mrvv-max-lmul=m8 to pr111720-[0-9].c so that we continue to test the elimination of straight-line memcpy. gcc/ChangeLog: * config/riscv/riscv-protos.h (get_lmul_mode): New prototype. (expand_block_move): Add bool parameter for movmem_p. * config/riscv/riscv-string.cc (riscv_expand_block_move_scalar): Pass movmem_p as false to riscv_vector::expand_block_move. (expand_block_move): Add movmem_p parameter. Return false if loop needed and movmem_p is true. Respect TARGET_MAX_LMUL. * config/riscv/riscv-v.cc (get_lmul_mode): New function. * config/riscv/riscv.md (movmem<mode>): Move checking for whether to generate inline vector code to riscv_vector::expand_block_move by passing movmem_p as true. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr113206-1.c: Add -mrvv-max-lmul=m8. * gcc.target/riscv/rvv/autovec/pr113206-2.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: Add -mrvv-max-lmul=m8 and adjust assembly scans. * gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c: Likewise. * gcc.target/riscv/rvv/autovec/vls/spill-4.c: Add -mrvv-max-lmul=m8. * gcc.target/riscv/rvv/autovec/vls/spill-7.c: Likewise. * gcc.target/riscv/rvv/base/cpymem-1.c: Expect m1 in f1 and f2. * gcc.target/riscv/rvv/base/cpymem-2.c: Add -mrvv-max-lmul=m8. * gcc.target/riscv/rvv/base/movmem-1.c: Adjust f1 to a length that will not get vectorized. * gcc.target/riscv/rvv/base/pr111720-0.c: Add -mrvv-max-lmul=m8. * gcc.target/riscv/rvv/base/pr111720-1.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-2.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-3.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-4.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-5.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-6.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-7.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-8.c: Likewise. * gcc.target/riscv/rvv/base/pr111720-9.c: Likewise. * gcc.target/riscv/rvv/vsetvl/pr112929-1.c: Expect memcpy m1 loops. * gcc.target/riscv/rvv/vsetvl/pr112988-1.c: Likewise.
2024-10-18[PATCH 3/7] RISC-V: Fix vector memcpy smaller LMUL generationCraig Blackmore1-3/+5
If riscv_vector::expand_block_move is generating a straight-line memcpy using a predicated store, it tries to use a smaller LMUL to reduce register pressure if it still allows an entire transfer. This happens in the inner loop of riscv_vector::expand_block_move, however, the vmode chosen by this loop gets overwritten later in the function, so I have added the missing break from the outer loop. I have also addressed a couple of issues with the conditions of the if statement within the inner loop. The first condition did not make sense to me: ``` TARGET_MIN_VLEN * lmul <= nunits * BITS_PER_UNIT ``` I think this was supposed to be checking that the length fits within the given LMUL, so I have changed it to do that. The second condition: ``` /* Avoid loosing the option of using vsetivli . */ && (nunits <= 31 * lmul || nunits > 31 * 8) ``` seems to imply that lmul affects the range of AVL immediate that vsetivli can take but I don't think that is correct. Anyway, I don't think this condition is necessary because if we find a suitable mode we should stick with it, regardless of whether it allowed vsetivli, rather than continuing to try larger lmul which would increase register pressure or smaller potential_ew which would increase AVL. I have removed this condition. gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Fix condition for using smaller LMUL. Break outer loop if a suitable vmode has been found. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr112929-1.c: Expect smaller lmul. * gcc.target/riscv/rvv/vsetvl/pr112988-1.c: Likewise. * gcc.target/riscv/rvv/base/cpymem-3.c: New test.
2024-10-18[PATCH 2/7] RISC-V: Fix uninitialized reg in memcpyCraig Blackmore1-2/+1
gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Replace `end` with `length_rtx` in gen_rtx_NE.
2024-10-18[PATCH 1/7] RISC-V: Fix indentation in riscv_vector::expand_block_move [NFC]Craig Blackmore1-16/+16
gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Fix indentation.
2024-10-16Ternary operator formatting fixesJakub Jelinek2-4/+4
While working on PR117028 C2Y changes, I've noticed weird ternary operator formatting (operand1 ? operand2: operand3). The usual formatting is operand1 ? operand2 : operand3 where we have around 18000+ cases of that (counting only what fits on one line) and indent -nbad -bap -nbc -bbo -bl -bli2 -bls -ncdb -nce -cp1 -cs -di2 -ndj \ -nfc1 -nfca -hnl -i2 -ip5 -lp -pcs -psl -nsc -nsob documented in https://www.gnu.org/prep/standards/html_node/Formatting.html#Formatting does the same. Some code was even trying to save space as much as possible and used operand1?operand2:operand3 or operand1 ? operand2:operand3 Today I've grepped for such cases (the grep was '?.*[^ ]:' and I had to skim through various false positives with that where the : matched e.g. stuff inside of strings, or *.md pattern macros or :: scope) and the following patch is a fix for what I found. 2024-10-16 Jakub Jelinek <jakub@redhat.com> gcc/ * attribs.cc (lookup_scoped_attribute_spec): ?: operator formatting fixes. * basic-block.h (FOR_BB_INSNS_SAFE): Likewise. * cfgcleanup.cc (outgoing_edges_match): Likewise. * cgraph.cc (cgraph_node::dump): Likewise. * config/arc/arc.cc (gen_acc1, gen_acc2): Likewise. * config/arc/arc.h (CLASS_MAX_NREGS, CONSTANT_ADDRESS_P): Likewise. * config/arm/arm.cc (arm_print_operand): Likewise. * config/cris/cris.md (*b<rnzcond:code><mode>): Likewise. * config/darwin.cc (darwin_asm_declare_object_name, darwin_emit_common): Likewise. * config/darwin-driver.cc (darwin_driver_init): Likewise. * config/epiphany/epiphany.md (call, sibcall, call_value, sibcall_value): Likewise. * config/i386/i386.cc (gen_push2): Likewise. * config/i386/i386.h (ix86_cur_cost): Likewise. * config/i386/openbsdelf.h (FUNCTION_PROFILER): Likewise. * config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Likewise. * config/loongarch/loongarch-cpu.cc (fill_native_cpu_config): Likewise. * config/riscv/riscv.cc (riscv_union_memmodels): Likewise. * config/riscv/zc.md (*mva01s<X:mode>, *mvsa01<X:mode>): Likewise. * config/rs6000/mmintrin.h (_mm_cmpeq_pi8, _mm_cmpgt_pi8, _mm_cmpeq_pi16, _mm_cmpgt_pi16, _mm_cmpeq_pi32, _mm_cmpgt_pi32): Likewise. * config/v850/predicates.md (pattern_is_ok_for_prologue): Likewise. * config/xtensa/constraints.md (d, C, W): Likewise. * coverage.cc (coverage_begin_function, build_init_ctor, build_gcov_exit_decl): Likewise. * df-problems.cc (df_create_unused_note): Likewise. * diagnostic.cc (diagnostic_set_caret_max_width): Likewise. * diagnostic-path.cc (path_summary::path_summary): Likewise. * expr.cc (expand_expr_divmod): Likewise. * gcov.cc (format_gcov): Likewise. * gcov-dump.cc (dump_gcov_file): Likewise. * genmatch.cc (main): Likewise. * incpath.cc (remove_duplicates, register_include_chains): Likewise. * ipa-devirt.cc (dump_odr_type): Likewise. * ipa-icf.cc (sem_item_optimizer::merge_classes): Likewise. * ipa-inline.cc (inline_small_functions): Likewise. * ipa-polymorphic-call.cc (ipa_polymorphic_call_context::dump): Likewise. * ipa-sra.cc (create_parameter_descriptors): Likewise. * ipa-utils.cc (find_always_executed_bbs): Likewise. * predict.cc (predict_loops): Likewise. * selftest.cc (read_file): Likewise. * sreal.h (SREAL_SIGN, SREAL_ABS): Likewise. * tree-dump.cc (dequeue_and_dump): Likewise. * tree-ssa-ccp.cc (bit_value_binop): Likewise. gcc/c-family/ * c-opts.cc (c_common_init_options, c_common_handle_option, c_common_finish, set_std_c89, set_std_c99, set_std_c11, set_std_c17, set_std_c23, set_std_cxx98, set_std_cxx11, set_std_cxx14, set_std_cxx17, set_std_cxx20, set_std_cxx23, set_std_cxx26): ?: operator formatting fixes. gcc/cp/ * search.cc (lookup_member): ?: operator formatting fixes. * typeck.cc (cp_build_modify_expr): Likewise. libcpp/ * expr.cc (interpret_float_suffix): ?: operator formatting fixes.
2024-10-16RISC-V: Use biggest_mode as mode for constants.Robin Dapp1-4/+10
In compute_nregs_for_mode we expect that the current variable's mode is at most as large as the biggest mode to be used for vectorization. This might not be true for constants as they don't actually have a mode. In that case, just use the biggest mode so max_number_of_live_regs returns 1. This fixes several test cases in the test suite. gcc/ChangeLog: PR target/116655 * config/riscv/riscv-vector-costs.cc (max_number_of_live_regs): Use biggest mode instead of constant's saved mode. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr116655.c: New test.
2024-10-12[RISC-V] Avoid unnecessary extensions when value is already extendedJivan Hakobyan1-2/+18
This is a minor patch from Jivan from roughly a year ago. The basic idea here is similar to what we do when extending values for the sake of comparisons. Specifically if the value is already known to be properly extended, then an extension is just a copy. The original idea was to use a similar patch, but which aborted to identify cases where these unnecessary promotions where emitted. All that showed up when doing a testsuite run with that abort was the promotions created by the arithmetic with overflow patterns such as addv. Things like addv aren't *that* common so this never got high on my todo list, even after a minor issue in this space was raised in bugzilla. But with stage1 closing soon and no good reason not to go forward, I'm submitting this into the pre-commit tester now. My tester has been using it since roughly Feb :-) Plan would be to commit after the pre-commit tester renders its verdict. * config/riscv/riscv.md (zero_extendsidi2): If RHS is already zero extended, then this is just a copy. (extendsidi2): Similarly, but for sign extension.
2024-10-12RISC-V] Slightly improve broadcasting small constants into vectorsJeff Law2-6/+21
I probably spent way more time on this than it's worth... I was looking at the code we generate for vector SAD and noticed that we were being a bit silly. Specifically: li a4,0 # 272 [c=4 l=4] *movsi_internal/1 Followed shortly by: vmv.s.x v3,a4 # 261 [c=4 l=4] *pred_broadcastrvvm1si/6 And no other uses of a4. We could have used x0 trivially. First we adjust the expander so that it doesn't force the constant into a register. In the matching pattern we change the appropriate source constraints from "r" to "rJ" and the output template is changed to use %z for the operand. The net is we drop the li completely and emit vmv.s.x,v3,x0. But wait, there's more. If we're broadcasting a constant in the range [-16..15] into a vector, we currently load the constant into a register and use vmv.v.r. We can instead use vmv.v.i, which avoids loading the constant into a GPR. For that case we again avoid forcing the constant into a register in the expander and adjust the output template to emit vmv.v.x or vmv.v.i based on whether or not the appropriate operand is a constant or general purpose register. So again, we'll drop a load immediate into a scalar for this case. Whether or not we should use vmv.v.i vs vmv.s.x for loading [-16..15] into the 0th element is probably uarch dependent. The tradeoff is loading the GPR vs the broadcast in the vector unit. I didn't bother with this case. Tested in my tester (which tests rv64gcv as a default codegen option). Will wait for the pre-commit tester to render a verdict. gcc/ * config/riscv/constraints.md (P): New constraint. * config/riscv/vector.md (pred_broadcast<mode> expander): Do not force small integers into GPRs so aggressively. (pred_broadcast<mode> insn & splitter): Allow splatting small constants across the vector register directly. Allow splatting (const_int 0) into element 0 directly.
2024-10-12RISC-V: Implement vector SAT_SUB for signed integerPan Li3-0/+21
This patch would like to implement the sssub for vector signed integer. Form 1: #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \ void __attribute__((noinline)) \ vec_sat_s_add_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \ { \ T x = op_1[i]; \ T y = op_2[i]; \ T minus = (UT)x - (UT)y; \ out[i] = (x ^ y) >= 0 \ ? minus \ : (minus ^ x) >= 0 \ ? minus \ : x < 0 ? MIN : MAX; \ } \ } DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX) Before this patch: 28 │ vle8.v v1,0(a1) 29 │ vle8.v v2,0(a2) 30 │ sub a3,a3,a5 31 │ add a1,a1,a5 32 │ add a2,a2,a5 33 │ vsra.vi v4,v1,7 34 │ vsub.vv v3,v1,v2 35 │ vxor.vv v2,v1,v2 36 │ vxor.vv v0,v1,v3 37 │ vmslt.vi v2,v2,0 38 │ vmslt.vi v0,v0,0 39 │ vmand.mm v0,v0,v2 40 │ vxor.vv v3,v4,v5,v0.t 41 │ vse8.v v3,0(a0) 42 │ add a0,a0,a5 After this patch: 25 │ vle8.v v1,0(a1) 26 │ vle8.v v2,0(a2) 27 │ sub a3,a3,a5 28 │ add a1,a1,a5 29 │ add a2,a2,a5 30 │ vssub.vv v1,v1,v2 31 │ vse8.v v1,0(a0) 32 │ add a0,a0,a5 The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/ChangeLog: * config/riscv/autovec.md (sssub<mode>3): Add new pattern for signed SAT_SUB. * config/riscv/riscv-protos.h (expand_vec_sssub): Add new func decl to expand sssub to vssub. * config/riscv/riscv-v.cc (expand_vec_sssub): Add new func impl to expand sssub to vssub. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-10-10RISC-V:Bugfix for C++ code compilation failure with rv32imafc_zve32f[pr116883]Li Xu1-1/+6
From: xuli <xuli1@eswincomputing.com> Example as follows: int main() { unsigned long arraya[128], arrayb[128], arrayc[128]; for (int i = 0; i < 128; i++) { arraya[i] = arrayb[i] + arrayc[i]; } return 0; } Compiled with -march=rv32imafc_zve32f -mabi=ilp32f, it will cause a compilation issue: riscv_vector.h:40:25: error: ambiguating new declaration of 'vint64m4_t __riscv_vle64(vbool16_t, const long long int*, unsigned int)' 40 | #pragma riscv intrinsic "vector" | ^~~~~~~~ riscv_vector.h:40:25: note: old declaration 'vint64m1_t __riscv_vle64(vbool64_t, const long long int*, unsigned int)' With zvl=32b, vbool16_t is registered in init_builtins() with type_common.precision=0x101 (nunits=2), mode_nunits[E_RVVMF16BI]=[2,2]. Normally, vbool64_t is only valid when TARGET_MIN_VLEN > 32, so vbool64_t is not registered in init_builtins(), meaning vbool64_t=null. In order to implement __attribute__((target("arch=+v"))), we must register all vector types and all RVV intrinsics. Therefore, vbool64_t will be registered by default with zvl=128b in reinit_builtins(), resulting in type_common.precision=0x101 (nunits=2) and mode_nunits[E_RVVMF64BI]=[2,2]. We then get TYPE_VECTOR_SUBPARTS(vbool16_t) == TYPE_VECTOR_SUBPARTS(vbool64_t), calculated using type_common.precision, resulting in 2. Since vbool16_t and vbool64_t have the same element type (boolean_type), the compiler treats them as the same type, leading to a re-declaration conflict. After all types and intrinsics have been registered, processing __attribute__((target("arch=+v"))) will update the parameters option and init_adjust_machine_modes. Therefore, to avoid conflicts, we can choose zvl=4096b for the null type reinit_builtins(). command option zvl=32b type nunits vbool64_t => null vbool32_t=> [1,1] vbool16_t=> [2,2] vbool8_t=> [4,4] vbool4_t=> [8,8] vbool2_t=> [16,16] vbool1_t=> [32,32] reinit zvl=128b vbool64_t => [2,2] conflict with zvl32b vbool16_t=> [2,2] reinit zvl=256b vbool64_t => [4,4] conflict with zvl32b vbool8_t=> [4,4] reinit zvl=512b vbool64_t => [8,8] conflict with zvl32b vbool4_t=> [8,8] reinit zvl=1024b vbool64_t => [16,16] conflict with zvl32b vbool2_t=> [16,16] reinit zvl=2048b vbool64_t => [32,32] conflict with zvl32b vbool1_t=> [32,32] reinit zvl=4096b vbool64_t => [64,64] zvl=4096b is ok Signed-off-by: xuli <xuli1@eswincomputing.com> PR target/116883 gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_pragma_intrinsic_flags_pollute): Choose zvl4096b to initialize null type. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/pr116883.C: New test.
2024-10-09RISC-V: Optimize branches with shifted immediate operandsJovan Vukic3-0/+48
After the valuable feedback I received, it’s clear to me that the oversight was in the tests showing the benefits of the patch. In the test file, I added functions f5 and f6, which now generate more efficient code with fewer instructions. Before the patch: f5: li a4,2097152 addi a4,a4,-2048 li a5,1167360 and a0,a0,a4 addi a5,a5,-2048 beq a0,a5,.L4 f6: li a5,3407872 addi a5,a5,-2048 and a0,a0,a5 li a5,1114112 beq a0,a5,.L7 After the patch: f5: srli a5,a0,11 andi a5,a5,1023 li a4,569 beq a5,a4,.L5 f6: srli a5,a0,11 andi a5,a5,1663 li a4,544 beq a5,a4,.L9 PR target/115921 gcc/ChangeLog: * config/riscv/iterators.md (any_eq): New code iterator. * config/riscv/riscv.h (COMMON_TRAILING_ZEROS): New macro. (SMALL_AFTER_COMMON_TRAILING_SHIFT): Ditto. * config/riscv/riscv.md (*branch<ANYI:mode>_shiftedarith_<optab>_shifted): New pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/branch-1.c: Additional tests.