|
Andreas reported GCC mis-compiled GAS for risc-v. Thankfully he also reduced it
to a nice little testcase.
So the whole point of the pattern in question is to "reduce" the constants by
right shifting away common unnecessary bits in RTL expressions like this:
> [(set (pc)
> (if_then_else (any_eq
> (and:ANYI (match_operand:ANYI 1 "register_operand" "r")
> (match_operand 2 "shifted_const_arith_operand" "i"))
> (match_operand 3 "shifted_const_arith_operand" "i"))
> (label_ref (match_operand 0 "" ""))
> (pc)))
When applicable, the reduced constants in operands 2/3 fit into a simm12 and
thus do not need multi-instruction synthesis. Note that we have to also shift
operand 1.
That shift should have been an arithmetic shift, but was incorrectly coded as a
logical shift.
Fixed with the obvious change on the right shift opcode.
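To see why the shift kind matters, here is a minimal C sketch of the reduction. It is not the code from riscv.md; the helper names and the idea of modelling the constants as sign-extended 64-bit values are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Arithmetic right shift that avoids implementation-defined behaviour
   for negative operands.  */
static int64_t
asr (int64_t v, unsigned n)
{
  return v < 0 ? ~(~v >> n) : v >> n;
}

/* Logical right shift: zero-fills from the top.  Intended for n >= 1.  */
static int64_t
lsr (int64_t v, unsigned n)
{
  return (int64_t) ((uint64_t) v >> n);
}

/* The unreduced branch condition: (x & mask) == cmp.  */
static bool
branch_original (int64_t x, int64_t mask, int64_t cmp)
{
  return (x & mask) == cmp;
}

/* Reduced form with the register shifted arithmetically, matching the
   arithmetic shift applied to the constants.  */
static bool
branch_reduced_asr (int64_t x, int64_t mask, int64_t cmp, unsigned n)
{
  return (asr (x, n) & asr (mask, n)) == asr (cmp, n);
}

/* Reduced form with the register shifted logically (the buggy variant).  */
static bool
branch_reduced_lsr (int64_t x, int64_t mask, int64_t cmp, unsigned n)
{
  return (lsr (x, n) & asr (mask, n)) == asr (cmp, n);
}
```

With x = -4091, mask = -4096, cmp = -4096 and a shift of 12, the original and the arithmetic variant agree, while the logical variant gives a different answer because the upper bits of the shifted register are zero-filled.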
Expecting to push to the trunk once the pre-commit tester renders its verdict.
I've already tested this in my tester for rv32 and rv64.
PR target/117649
gcc/
* config/riscv/riscv.md (branch on masked/shifted operands): Use
arithmetic rather than logical shift for operand 1.
gcc/testsuite
* gcc.target/riscv/branch-1.c: Update expected output.
* gcc.target/riscv/pr117649.c: New test.
|
|
When configuring GCC for RV32EC with:
./configure \
--target=riscv32-none-elf \
--with-multilib-generator="rv32ec-ilp32e--" \
--with-abi=ilp32e \
--with-arch=rv32ec
Then the build fails because division is erroneously left enabled:
cc1: error: '-mdiv' requires '-march' to subsume the 'M' extension
-fself-test: 8412281 pass(es) in 0.647173 seconds
Fix by disabling MASK_DIV if multiplication is not available and -mdiv
option has not been explicitly passed.
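The defaulting rule can be sketched as follows. This is only an illustration: the struct and field names are hypothetical and do not correspond to the real option variables in riscv.cc.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant target option state.  */
struct target_opts
{
  bool has_mul;       /* Multiplication is available.  */
  bool div_explicit;  /* The user passed -mdiv or -mno-div explicitly.  */
  bool div_enabled;   /* Current state of the division option.  */
};

/* Default division to disabled when multiplication is missing and the
   user did not ask for a specific setting.  */
static void
default_division (struct target_opts *o)
{
  if (!o->has_mul && !o->div_explicit)
    o->div_enabled = false;
}
```

An explicit request is left untouched, so the existing diagnostic can still fire when the user really does ask for division without multiplication.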
Tested the above RV32EC-only toolchain using the GNU simulator:
=== gcc Summary ===
# of expected passes 211635
# of unexpected failures 3004
# of expected failures 1061
# of unresolved testcases 5651
# of unsupported tests 18958
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_override_options_internal):
Set division option's default to disabled if multiplication
is not available.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
Instead of loading the permutation indices and using vmslt in order to
determine which elements belong to which source vector we can compute
the proper mask at compile time. That way we can emit vlm instead of
vle + vmslt.
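A rough sketch of computing such a mask ahead of time, in plain C. The function name, the bit packing (least-significant bit first) and the polarity (bit set when the element comes from the first source) are assumptions for illustration, not taken from riscv-v.cc.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a packed merge mask from permutation indices.  Bit i is set
   when sel[i] picks from the first source vector, i.e. sel[i] < nelts.
   MASK must have room for (nelts + 7) / 8 bytes.  */
static void
build_merge_mask (const unsigned *sel, size_t nelts, uint8_t *mask)
{
  for (size_t i = 0; i < (nelts + 7) / 8; i++)
    mask[i] = 0;
  for (size_t i = 0; i < nelts; i++)
    if (sel[i] < nelts)
      mask[i / 8] |= (uint8_t) (1u << (i % 8));
}
```

The resulting bytes are what a constant-pool mask load would fetch, instead of loading the indices and comparing them at run time.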
gcc/ChangeLog:
* config/riscv/riscv-v.cc (shuffle_merge_patterns): Load VLS
indices directly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/merge-1.c: Check for vlm and
no vmsleu etc.
* gcc.target/riscv/rvv/autovec/vls/merge-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/merge-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/merge-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/merge-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/merge-6.c: Ditto.
|
|
And stage3 begins...
Zdenek's fuzzer caught this one. Essentially using simplify_gen_subreg
directly with an offset of 0 when we just needed a lowpart.
The offset of 0 works for little endian, but for big endian it's simply wrong.
simplify_gen_subreg will return NULL_RTX because the case isn't representable.
We then embed that NULL_RTX into an insn that's later scanned during
mark_jump_label.
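The endianness dependence can be illustrated with a simplified model of where the low part of a value lives. This is a sketch only: it assumes outer_bytes <= inner_bytes and ignores the distinction between word and byte endianness that the real helpers handle.

```c
#include <stdbool.h>

/* Byte offset of the low-order OUTER_BYTES inside a value that is
   INNER_BYTES wide.  Simplified model, see the assumptions above.  */
static unsigned
lowpart_byte_offset (unsigned inner_bytes, unsigned outer_bytes,
                     bool big_endian)
{
  return big_endian ? inner_bytes - outer_bytes : 0;
}
```

For a 4-byte low part of an 8-byte value the offset is 0 on little endian but 4 on big endian, which is why hard-coding 0 only happens to work on little endian.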
Scanning the port I see a couple more instances of this incorrect idiom. One
is pretty obvious to fix. The others look a bit goofy and I'll probably need
to sync with Patrick on them.
Anyway tested on riscv64-elf and riscv32-elf with no regressions. Pushing to
the trunk.
PR target/117595
gcc/
* config/riscv/sync.md (atomic_compare_and_swap<mode>): Use gen_lowpart
rather than simplify_gen_subreg.
* config/riscv/riscv.cc (riscv_legitimize_move): Similarly.
gcc/testsuite/
* gcc.target/riscv/pr117595.c: New test.
|
|
This patch adds VLS modes to the strided load expanders.
gcc/ChangeLog:
* config/riscv/autovec.md: Add VLS modes.
* config/riscv/vector-iterators.md: Ditto.
* config/riscv/vector.md: Ditto.
|
|
This patch adds else operands to masked loads. Currently the default
else operand predicate just accepts "undefined" (i.e. SCRATCH) values.
PR middle-end/115336
PR middle-end/116059
gcc/ChangeLog:
* config/riscv/autovec.md: Add else operand.
* config/riscv/predicates.md (maskload_else_operand): New
predicate.
* config/riscv/riscv-v.cc (get_else_operand): Remove static.
(expand_load_store): Use get_else_operand and adjust index.
(expand_gather_scatter): Ditto.
(expand_lanes_load_store): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr115336.c: New test.
* gcc.target/riscv/rvv/autovec/pr116059.c: New test.
|
|
Updated version of my prior patch to fix type attributes on the
pre-allocation vector move pattern. This version just adds a suitable
set of attributes to a second pattern that was obviously wrong.
Passed on my tester for rv64 and rv32 crosses. Bootstrapped and
regression tested on riscv64-linux-gnu as well.
--
So I was looking into a horrific schedule for SAD a week or so ago and
came across this gem.
Basically we were treating a vector load as a vector move from a
scheduling standpoint during sched1. Naturally we didn't expose much
ILP during sched1. That in turn caused the register allocator to pack
the pseudos onto the physical vector registers tightly. regrename
didn't do anything useful and the resulting code had too many false
dependencies for sched2 to do anything useful.
As a result we were taking many load->use stalls in x264's SAD routine.
I'm confident the types are fine, but I'm a lot less sure about the
other attributes (mode, avl_type_index, mode_idx). If someone could
take a look at that, it'd be greatly appreciated.
There's other cases that may need similar treatment. But I didn't want
to muck with them until I understood those other attributes and how they
need adjustments.
In particular mov<VLS_AVL_REG:mode><P:mode>_lra appears to have the same
problem.
--
gcc/
* config/riscv/vector.md (mov<mode> pattern/splitter): Fix type and
other attributes.
(mov<VLS_AVL_REG:mode><P:mode>_lra): Likewise.
|
|
error: unrecognizable insn:
(insn 35 34 36 2 (set (subreg:RVVM1SF (reg/v:RVVM1x4SF 142 [ _r ]) 0)
(unspec:RVVM1SF [
(const_vector:RVVM1SF repeat [
(const_double:SF 0.0 [0x0.0p+0])
])
(reg:DI 0 zero)
(const_int 1 [0x1])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_TH_VWLDST)) -1
(nil))
during RTL pass: mode_sw
PR target/116591
gcc/ChangeLog:
* config/riscv/vector.md: Add restriction to call pred_th_whole_mov.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/pr116591.c: New test.
|
|
This patch implements the TARGET_GENERATE_VERSION_DISPATCHER_BODY and
TARGET_GET_FUNCTION_VERSIONS_DISPATCHER for RISC-V. This is used to
generate the dispatcher function and get the dispatcher function for
function multiversioning.
This patch copies much of its code from commit 0cfde688e213 ("[aarch64]
Add function multiversioning support") and modifies it to fit the
RISC-V port. A key difference is that the feature bits in the RISC-V
C-API are stored as an array of unsigned long long, while in AArch64
they are not an array. So we need to generate an array reference for
each feature bits element in the dispatcher function.
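The per-element check the dispatcher has to perform can be pictured like this. It is a hand-written sketch, not the GIMPLE the patch generates; FEATURE_GROUPS and the function name are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define FEATURE_GROUPS 2  /* Hypothetical length of the feature array.  */

/* Return true when every required bit is present in the host's
   feature bits, checking each array element in turn.  */
static bool
version_supported (const unsigned long long *host,
                   const unsigned long long *required)
{
  for (size_t i = 0; i < FEATURE_GROUPS; i++)
    if ((host[i] & required[i]) != required[i])
      return false;
  return true;
}
```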
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* config/riscv/riscv.cc (add_condition_to_bb): New function.
(dispatch_function_versions): New function.
(get_suffixed_assembler_name): New function.
(make_resolver_func): New function.
(riscv_generate_version_dispatcher_body): New function.
(riscv_get_function_versions_dispatcher): New function.
(TARGET_GENERATE_VERSION_DISPATCHER_BODY): Implement it.
(TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): Implement it.
|
|
This patch implements the TARGET_MANGLE_DECL_ASSEMBLER_NAME for RISC-V.
This is used to add function multiversioning suffixes to the assembler
name.
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* config/riscv/riscv.cc
(riscv_mangle_decl_assembler_name): New function.
(TARGET_MANGLE_DECL_ASSEMBLER_NAME): Define.
|
|
This patch implements TARGET_COMPARE_VERSION_PRIORITY and
TARGET_OPTION_FUNCTION_VERSIONS for RISC-V.
The TARGET_COMPARE_VERSION_PRIORITY is implemented to compare the
priority of two function versions based on the rules defined in the
RISC-V C-API Doc PR #85:
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85/files#diff-79a93ca266139524b8b642e582ac20999357542001f1f4666fbb62b6fb7a5824R721
If multiple versions have equal priority, we select the function with
the largest number of feature bits generated by
riscv_minimal_hwprobe_feature_bits. When the number of feature bits is
also the same, we diff the two versions and select the one with the
least significant bit set, since a feature that appears earlier in
feature_bits might be more important to performance.
The TARGET_OPTION_FUNCTION_VERSIONS is implemented to check whether the
two function versions are the same. This implementation reuses the code
in TARGET_COMPARE_VERSION_PRIORITY and checks whether it returns 0,
which means equal priority.
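The three-step ordering can be sketched in C as below. The struct layout, FEATURE_GROUPS, the assumption that a numerically larger priority wins, and the choice to walk the array from element 0 are all illustrative assumptions rather than details confirmed by the patch.

```c
#include <stddef.h>

#define FEATURE_GROUPS 2  /* Hypothetical length of the feature array.  */

/* Hypothetical description of one function version.  */
struct fmv_version
{
  int priority;
  unsigned long long bits[FEATURE_GROUPS];
};

/* Count the feature bits set across all array elements.  */
static int
count_bits (const struct fmv_version *v)
{
  int n = 0;
  for (size_t i = 0; i < FEATURE_GROUPS; i++)
    for (unsigned long long b = v->bits[i]; b != 0; b &= b - 1)
      n++;
  return n;
}

/* Return > 0 if A is preferred, < 0 if B is preferred, 0 if equal.  */
static int
compare_versions (const struct fmv_version *a, const struct fmv_version *b)
{
  if (a->priority != b->priority)
    return a->priority > b->priority ? 1 : -1;

  int ca = count_bits (a), cb = count_bits (b);
  if (ca != cb)
    return ca > cb ? 1 : -1;

  for (size_t i = 0; i < FEATURE_GROUPS; i++)
    {
      unsigned long long diff = a->bits[i] ^ b->bits[i];
      if (diff != 0)
        {
          /* Isolate the least significant differing bit.  */
          unsigned long long lowest = diff & (~diff + 1);
          return (a->bits[i] & lowest) ? 1 : -1;
        }
    }
  return 0;
}
```

A result of 0 is the "same version" case that the equality hook relies on.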
Co-Developed-by: Hank Chang <hank.chang@sifive.com>
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* config/riscv/riscv.cc
(parse_features_for_version): New function.
(compare_fmv_features): New function.
(riscv_compare_version_priority): New function.
(riscv_common_function_versions): New function.
(TARGET_COMPARE_VERSION_PRIORITY): Implement it.
(TARGET_OPTION_FUNCTION_VERSIONS): Implement it.
|
|
This patch implements the TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P for
RISC-V. This hook is used to process attribute
((target_version ("..."))).
As it is the first patch which introduces the target_version attribute,
we also set TARGET_HAS_FMV_TARGET_ATTRIBUTE to 0 to use "target_version"
for function versioning.
Co-Developed-by: Hank Chang <hank.chang@sifive.com>
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* config/riscv/riscv-protos.h
(riscv_process_target_attr): Remove as it is not used.
(riscv_option_valid_version_attribute_p): Declare.
(riscv_process_target_version_attr): Declare.
* config/riscv/riscv-target-attr.cc
(riscv_target_attrs): Renamed from riscv_attributes.
(riscv_target_version_attrs): New attributes for target_version.
(riscv_process_one_target_attr): New arguments to select attrs.
(riscv_process_target_attr): Likewise.
(riscv_option_valid_attribute_p): Likewise.
(riscv_process_target_version_attr): New function.
(riscv_option_valid_version_attribute_p): New function.
* config/riscv/riscv.cc
(TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): Implement it.
* config/riscv/riscv.h (TARGET_HAS_FMV_TARGET_ATTRIBUTE): Define
it to 0 to use "target_version" for function versioning.
|
|
This patch implements the riscv_minimal_hwprobe_feature_bits feature
for the RISC-V target. The feature bits are defined in
libgcc/config/riscv/feature_bits.c to provide bitmasks of the ISA
extensions that are defined in the RISC-V C-API. Thus, we need a
function to generate the feature bits for the IFUNC resolver to dispatch
between different functions based on the hardware features.
Minimal feature bits means using the earliest extension that appeared in
Linux hwprobe to cover the given ISA string. This allows older kernels
that cannot probe some implied extensions to run the FMV dispatcher
correctly.
For example, V implies Zve32x, but Zve32x has appeared in the Linux
kernel only since v6.11. If we use the ISA string directly to generate
the FMV dispatcher for functions with the "arch=+v" extension, then
because V implies Zve32x the FMV dispatcher will check whether the
Zve32x extension is supported by the host. If the Linux kernel is older
than v6.11, the FMV dispatcher will fail to detect the Zve32x extension
even though it is already implied by the V extension, and will therefore
fail to dispatch to the correct function.
Thus, we need to generate the minimal feature bits to cover the given
ISA string to allow the FMV dispatcher to work correctly on older
kernels.
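One way to picture the "minimal cover" idea is the toy model below. The extension enum, its ordering, the implication table and the use of 1 << index as the feature bit are all hypothetical; the real table lives in riscv-ext-bitmask.def and the real algorithm may differ.

```c
#include <stddef.h>

/* Hypothetical extensions, listed in the order they are assumed to
   have appeared in hwprobe.  */
enum { EXT_V, EXT_ZVE32X, EXT_ZBB, EXT_COUNT };

/* implies[i] is the set of extensions (as 1 << EXT_*) implied by i.  */
static const unsigned implies[EXT_COUNT] = {
  [EXT_V] = 1u << EXT_ZVE32X,
  [EXT_ZVE32X] = 0,
  [EXT_ZBB] = 0,
};

/* ISA is a set of enabled extensions (1 << EXT_*).  Return the subset
   that still has to be probed once implied extensions are dropped.  */
static unsigned
minimal_feature_bits (unsigned isa)
{
  unsigned covered = 0, bits = 0;
  for (size_t i = 0; i < EXT_COUNT; i++)
    {
      if (!(isa & (1u << i)) || (covered & (1u << i)))
        continue;
      bits |= 1u << i;
      covered |= (1u << i) | implies[i];
    }
  return bits;
}
```

In this model an ISA string containing both V and Zve32x yields only the V bit, so a kernel that cannot report Zve32x is still matched.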
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(RISCV_EXT_BITMASK): New macro.
(struct riscv_ext_bitmask_table_t): New struct.
(riscv_minimal_hwprobe_feature_bits): New function.
* common/config/riscv/riscv-ext-bitmask.def: New file.
* config/riscv/riscv-subset.h (GCC_RISCV_SUBSET_H): Include
riscv-feature-bits.h.
(riscv_minimal_hwprobe_feature_bits): Declare the function.
* config/riscv/riscv-feature-bits.h: New file.
|
|
This patch adds the priority syntax parser to support the Function
Multi-Versioning (FMV) feature in RISC-V. This feature allows users to
specify the priority of the function version in the attribute syntax.
Changes based on RISC-V C-API PR:
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* config/riscv/riscv-target-attr.cc
(riscv_target_attr_parser::handle_priority): New function.
(riscv_target_attr_parser::update_settings): Update priority
attribute.
* config/riscv/riscv.opt: Add TargetVariable riscv_fmv_priority.
|
|
Some architectures may use ',' in the attribute string, but it is not
used as the separator for different targets. To avoid conflict, we
introduce a new macro TARGET_CLONES_ATTR_SEPARATOR to separate different
clones.
As an example, according to the RISC-V C-API Specification [1], RISC-V allows
',' in the attribute string in the "arch=" option to specify one or more
ISA extensions in the same target function, which conflicts with the
default separator used to separate different clones. This patch introduces
TARGET_CLONES_ATTR_SEPARATOR for RISC-V and chooses '#' as the separator,
since '#' is not allowed in the target_clones option string.
[1] https://github.com/riscv-non-isa/riscv-c-api-doc/blob/c6c5d6d9cf96b342293315a5dff3d25e96ef8191/src/c-api.adoc#__attribute__targetattr-string
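The conflict is easy to see by counting clones in a joined attribute string with each separator. This is a sketch only: count_clones is a hypothetical helper and the sample string in the note below is made up for illustration.

```c
#include <stddef.h>

/* Count how many clones a joined attribute string splits into when
   SEP is used as the separator.  */
static size_t
count_clones (const char *attrs, char sep)
{
  if (*attrs == '\0')
    return 0;
  size_t n = 1;
  for (const char *p = attrs; *p != '\0'; p++)
    if (*p == sep)
      n++;
  return n;
}
```

For the string "arch=+zba,+zbb,+zbs#default", splitting on '#' gives the intended two clones, while splitting on ',' gives three pieces and cuts the "arch=" list apart.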
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* defaults.h (TARGET_CLONES_ATTR_SEPARATOR): Define new macro.
* multiple_target.cc (get_attr_str): Use
TARGET_CLONES_ATTR_SEPARATOR to separate attributes.
(separate_attrs): Likewise.
(expand_target_clones): Likewise.
* attribs.cc (attr_strcmp): Likewise.
(sorted_attr_string): Likewise.
* tree.cc (get_target_clone_attr_len): Likewise.
* config/riscv/riscv.h (TARGET_CLONES_ATTR_SEPARATOR): Define
TARGET_CLONES_ATTR_SEPARATOR for RISC-V.
* doc/tm.texi: Document TARGET_CLONES_ATTR_SEPARATOR.
* doc/tm.texi.in: Likewise.
|
|
This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117483
If prev and next satisfy the following rules, we should forbid the case
(next.get_sew() < prev.get_sew() && (!next.get_ta() || !next.get_ma()))
in the compatible function max_sew_overlap_and_next_ratio_valid_for_prev_sew_p.
Otherwise, the tail elements of next will be polluted.
DEF_SEW_LMUL_RULE (ge_sew, ratio_and_ge_sew, ratio_and_ge_sew,
max_sew_overlap_and_next_ratio_valid_for_prev_sew_p,
always_false, use_max_sew_and_lmul_with_next_ratio)
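The forbidden case can be written as a standalone predicate. The struct below is a hypothetical stand-in for the vsetvl info objects; only the condition itself comes from the description above.

```c
#include <stdbool.h>

/* Hypothetical reduced view of a vsetvl configuration.  */
struct vsetvl_state
{
  unsigned sew;  /* Selected element width in bits.  */
  bool ta;       /* Tail agnostic.  */
  bool ma;       /* Mask agnostic.  */
};

/* Return true for the case that must not be treated as compatible:
   NEXT uses a smaller SEW than PREV and is not fully agnostic.  */
static bool
must_forbid (const struct vsetvl_state *prev,
             const struct vsetvl_state *next)
{
  return next->sew < prev->sew && (!next->ta || !next->ma);
}
```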
Passed the rv64gcv full regression test.
Signed-off-by: Li Xu <xuli1@eswincomputing.com>
PR target/117483
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc: Fix bug.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr117483.c: New test.
|
|
This is a rewrite of a patch originally from Xianmiao Qu. Xianmiao
noticed that the costs we compute for LO_SUM expressions were incorrect.
Essentially we costed based solely on the first input to the LO_SUM.
In a LO_SUM, the first input is almost always going to be a REG and thus
isn't interesting. The second argument is almost always going to be
some kind of symbolic operand, which is much more interesting from a
costing standpoint.
The right way to fix this is to sum the cost of the two operands. I've
verified this produces the same code as Xianmiao Qu's original patch.
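As a toy model of the costing change, consider the sketch below. The operand kinds and the cost values (0 for a register, 1 for a symbol) are made up for illustration and are not the numbers riscv_rtx_costs uses.

```c
/* Hypothetical operand kinds for the two inputs of a LO_SUM.  */
enum operand_kind { OP_REG, OP_SYMBOL };

/* Made-up per-operand cost: registers are free, symbols are not.  */
static int
operand_cost (enum operand_kind k)
{
  return k == OP_REG ? 0 : 1;
}

/* Old behaviour: only the first operand contributes.  */
static int
lo_sum_cost_old (enum operand_kind op0, enum operand_kind op1)
{
  (void) op1;
  return operand_cost (op0);
}

/* New behaviour: sum the cost of both operands.  */
static int
lo_sum_cost_new (enum operand_kind op0, enum operand_kind op1)
{
  return operand_cost (op0) + operand_cost (op1);
}
```

With a register as the first operand and a symbol as the second, the old rule yields zero while the summed rule yields a nonzero cost.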
This has been tested on rv32 and rv64 in my tester. It missed today's
bootstrap of riscv64 though :( Naturally I'll wait on the pre-commit CI
tester to render a verdict, but I don't expect any problems.
-- From Xianmiao Qu's original submission --
Currently, the cost of the LO_SUM expression is based on
the cost of calculating the first subexpression. When the
first subexpression is a register, the cost result will
be zero. It seems a bit unreasonable for a SET expression
to have a zero cost when its source is LO_SUM. Moreover,
having a cost of zero for the expression will lead the
loop invariant pass to calculate its benefits of being
moved outside the loop as zero, thus preventing the
out-of-loop placement of the loop invariant.
As an example, consider the following test case:
long a;
long b[];
long *c;
foo () {
for (;;)
*c = b[a];
}
When compiling with -march=rv64gc -mabi=lp64d -Os, the following code is
generated:
.cfi_startproc
lui a5,%hi(c)
ld a4,%lo(c)(a5)
lui a2,%hi(b)
lui a1,%hi(a)
.L2:
ld a5,%lo(a)(a1)
addi a3,a2,%lo(b)
slli a5,a5,3
add a5,a5,a3
ld a5,0(a5)
sd a5,0(a4)
j .L2
After adjusting the cost of the LO_SUM expression, the instruction addi will be
moved outside the loop:
.cfi_startproc
lui a5,%hi(c)
ld a3,%lo(c)(a5)
lui a4,%hi(b)
lui a2,%hi(a)
addi a4,a4,%lo(b)
.L2:
ld a5,%lo(a)(a2)
slli a5,a5,3
add a5,a5,a4
ld a5,0(a5)
sd a5,0(a3)
j .L2
gcc/
* config/riscv/riscv.cc (riscv_rtx_costs): Correct costing of LO_SUM
expressions.
Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
|
|
This patch adds the norelax function attribute that was discussed in riscv-c-api-doc PR#94.
URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull/94
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_declare_function_name): Add new
attribute.
|
|
So I was looking at sub_dct a little while ago and was surprised to see
us emit two instructions out of a single pattern. We generally try to
avoid that -- it's not always possible, but as a general rule of thumb
it should be avoided. Specifically I saw:
> vmv1r.v v4,v2 # 138 [c=4 l=4] *pred_mul_plusrvvm1hi_undef/5
> vmacc.vv v4,v8,v1
When we emit multiple instructions out of a single pattern we can't
build a good schedule as we can't really describe the two instructions
well and we can't split them up -- they move as an atomic unit.
These cases can also raise correctness issues if the pattern doesn't
properly account for both instructions in its length computation.
Note the length, 4 bytes. So this is both a performance and latent
correctness issue.
It appears that these alternatives are meant to deal with the case when
we have three source inputs and a non-matching output. The author did
put in "?" to slightly disparage these alternatives, but a "!" would
have been better. The best solution is to just remove those
alternatives and let the allocator manage the matching operand issue.
That's precisely what this patch does. For the various integer
multiply-add/multiply-accumulate patterns we drop the alternatives which
don't require a match between the output and one of the inputs.
That fixes the correctness issue and should shave a cycle or two off our
sub_dct code. Essentially the move bubbles up into an empty slot and we
can schedule around the vmacc sensibly.
Interestingly enough this fixes a scan-assembler test in my tester for
both rv32 and rv64.
> Tests that now work, but didn't before (10 tests):
>
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
> unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8
My BPI is already in a bootstrap test, so this patch won't hit the BPI
for bootstrapping until Wednesday, meaning no data until Thursday. Will
wait for the pre-commit tester though.
gcc/
* config/riscv/vector.md (pred_mul_plus<mode>_undef): Drop alternatives
where output doesn't have to match input.
(pred_madd<mode>, pred_macc<mode>): Likewise.
(pred_madd<mode>_scalar, pred_macc<mode>_scalar): Likewise.
(pred_madd<mode>_exended_scalar): Likewise.
(pred_macc<mode>_exended_scalar): Likewise.
(pred_minus_mul<mode>_undef): Likewise.
(pred_nmsub<mode>, pred_nmsac<mode>): Likewise.
(pred_nmsub<mode>_scalar, pred_nmsac<mode>_scalar): Likewise.
(pred_nmsub<mode>_exended_scalar): Likewise.
(pred_nmsac<mode>_exended_scalar): Likewise.
|
|
Just noticed that the indentation of the ustrunc pattern in the md
file is not right, so correct it. The change is fairly obvious and
I will commit it after 48 hours if there are no more comments.
gcc/ChangeLog:
* config/riscv/autovec.md: Fix indent format issue.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
For fast unaligned access targets, by pieces uses up to UNITS_PER_WORD
size pieces resulting in more store instructions than needed. For
example gcc.target/riscv/rvv/base/setmem-2.c:f1 built with
`-O3 -march=rv64gcv -mtune=thead-c906`:
```
f1:
vsetivli zero,8,e8,mf2,ta,ma
vmv.v.x v1,a1
vsetivli zero,0,e32,mf2,ta,ma
sb a1,14(a0)
vmv.x.s a4,v1
vsetivli zero,8,e16,m1,ta,ma
vmv.x.s a5,v1
vse8.v v1,0(a0)
sw a4,8(a0)
sh a5,12(a0)
ret
```
The slow unaligned access version built with `-O3 -march=rv64gcv` used
15 sb instructions:
```
f1:
sb a1,0(a0)
sb a1,1(a0)
sb a1,2(a0)
sb a1,3(a0)
sb a1,4(a0)
sb a1,5(a0)
sb a1,6(a0)
sb a1,7(a0)
sb a1,8(a0)
sb a1,9(a0)
sb a1,10(a0)
sb a1,11(a0)
sb a1,12(a0)
sb a1,13(a0)
sb a1,14(a0)
ret
```
After this patch, the following is generated in both cases:
```
f1:
vsetivli zero,15,e8,m1,ta,ma
vmv.v.x v1,a1
vse8.v v1,0(a0)
ret
```
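The store counts in the listings above follow from a simple greedy split of the length into power-of-two pieces. The helper below is a counting model only, assuming max_piece is a power of two; it is not how the by-pieces infrastructure is implemented.

```c
/* Count the stores needed to cover LEN bytes using pieces of at most
   MAX_PIECE bytes, halving the piece size as the remainder shrinks.
   MAX_PIECE is assumed to be a power of two.  */
static unsigned
by_pieces_store_count (unsigned len, unsigned max_piece)
{
  unsigned stores = 0;
  for (unsigned piece = max_piece; piece > 0; piece /= 2)
    while (len >= piece)
      {
        len -= piece;
        stores++;
      }
  return stores;
}
```

For 15 bytes this gives 4 stores with 8-byte pieces (8 + 4 + 2 + 1) and 15 stores with 1-byte pieces, matching the two "before" listings, whereas a single 15-element vector store covers the whole region.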
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_use_by_pieces_infrastructure_p):
New function.
(TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Define.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr113469.c: Expect mf2 setmem.
* gcc.target/riscv/rvv/base/setmem-2.c: Update f1 to expect
straight-line vector memset.
* gcc.target/riscv/rvv/base/setmem-3.c: Likewise.
|
|
`expand_vec_setmem` only generated vectorized memset if it fitted into a
single vector store of at least (TARGET_MIN_VLEN / 8) bytes. Also,
without dynamic LMUL the operation was always TARGET_MAX_LMUL even if it
would have fitted a smaller LMUL.
Allow vectorized memset to be generated for smaller lengths and smaller
LMUL by switching to using use_vector_stringop_p. Smaller LMUL can be
seen in setmem-3.c:f3. Smaller lengths will be seen after the second
patch in this series which selectively disables by pieces.
gcc/ChangeLog:
* config/riscv/riscv-string.cc
(use_vector_stringop_p): Add comment.
(expand_vec_setmem): Use use_vector_stringop_p instead of
check_vectorise_memory_operation.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/setmem-3.c: Expect smaller lmul.
|
|
When bisecting for ICE in PR/117353, commit 771256bcb9dd ("RISC-V: Emit costs for
bool and stepped const vectors") uncovered yet another latent issue (first noted [1])
[1] https://github.com/patrick-rivos/gcc-postcommit-ci/issues/1625
This patch fixes some of the fortran regressions from that report.
Fixes 71a5ac6703d1 ("RISC-V: Support interleave vector with different step sequence")
rv64imafdcv_zvl256b_zba_zbb_zbs_zicond/lp64d/medlow
| # of unexpected case / # of unique unexpected case
| gcc | g++ | gfortran |
| 392 / 108 | 7 / 3 | 91 / 24 |
| 392 / 108 | 7 / 3 | 67 / 12 |
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_const_vector): Use IOR op.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/slp-interleave-5.c: New test.
Tested-by: Edwin Lu <ewlu@rivosinc.com> # Pre-commit CI #2503
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
When the callee is versioned but the caller is not, we should not inline
the callee into the caller, to prevent the default version of the callee
from being inlined into a non-versioned caller.
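Expressed as a predicate, the added rule looks roughly like this. The function name and boolean parameters are hypothetical; the real hook inspects the function declarations.

```c
#include <stdbool.h>

/* Return false only for the case the patch rejects: a versioned
   callee being inlined into a caller that is not versioned.  */
static bool
versioning_allows_inline (bool caller_versioned, bool callee_versioned)
{
  return !(callee_versioned && !caller_versioned);
}
```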
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_can_inline_p): Refuse to inline
when callee is versioned but caller is not.
|
|
This patch splits static bool riscv_process_target_attr
(tree args, location_t loc) into two functions:
- bool riscv_process_target_attr (const char *args, location_t loc)
- static bool riscv_process_target_attr (tree args, location_t loc)
Thus, we can call `riscv_process_target_attr` with a `const char *`
argument. This is useful for implementation of `target_version`
attribute.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (riscv_process_target_attr): New.
* config/riscv/riscv-target-attr.cc (riscv_process_target_attr):
	Split into two functions with const char *args argument.
|
|
Currently, the RISC-V target uses the target specific mplt option to
control PLT generation. This patch deprecates the target specific mplt
option and uses the common fplt option instead. This allows users to
use the same option for most targets.
Co-Developed-by: Liao Shihua <shihua@iscas.ac.cn>
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* config/riscv/predicates.md: Use flag_plt instead of TARGET_PLT.
* config/riscv/riscv.opt: alias common option fplt to mplt.
|
|
So a while back I was looking at pixel_avg for RISC-V where we try to
use vaaddu for the halfword-ceiling-average step. The problem with
vaaddu is that you must set VXRM to a suitable rounding mode as it has
an undetermined state at function entry or after a function call.
It turns out some designs will fully flush their pipelines on a write to
VXRM which you can imagine is incredibly expensive.
VXRM assignments are handled by an LCM based algorithm to find "optimal"
placement points based on what insns in the stream need VXRM assignments
and the particular mode they need.
Unfortunately in pixel_avg an LCM algorithm only allows hoisting out of
the innermost loop, but not the outer loop. The core issue is that LCM
does not allow any speculation and there are paths which would bypass
the inner loop (which don't actually trigger at runtime IIRC).
The expectation is that VXRM assignments should be exceedingly rare and
needing more than one mode even rarer. So hoisting more aggressively
seems like a reasonable thing to do, but we don't want to burn too much
time trying to do something fancy.
So what this patch does is scan the IL once collecting any VXRM needs.
If the current function has precisely one VXRM mode needed, then we
pretend (for the sake of LCM) that the first instruction in the function
also has that need.
By doing so the VXRM assignment is essentially anticipated everywhere in
the function. The standard LCM algorithm is run and has enough
information to hoist the VXRM assignment more aggressively, most often
to the prologue.
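The scan described above can be modelled over a flat array of per-insn needs. This is a sketch with hypothetical names and encoding (VXRM_NONE for "no requirement"); the real code walks the insn chain.

```c
#include <stddef.h>

#define VXRM_NONE (-1)  /* Hypothetical "no VXRM requirement" marker.  */

/* Return the single VXRM mode needed by the function, or VXRM_NONE if
   no insn needs one or if more than one distinct mode is needed.  */
static int
singleton_need (const int *needs, size_t n)
{
  int mode = VXRM_NONE;
  for (size_t i = 0; i < n; i++)
    {
      if (needs[i] == VXRM_NONE)
        continue;
      if (mode != VXRM_NONE && needs[i] != mode)
        return VXRM_NONE;
      mode = needs[i];
    }
  return mode;
}

/* The need reported for the first insn: the singleton mode if there is
   one, otherwise whatever the first insn needs itself.  N must be > 0.  */
static int
first_insn_need (const int *needs, size_t n)
{
  int single = singleton_need (needs, n);
  return single != VXRM_NONE ? single : needs[0];
}
```

Claiming the need on the first insn is what makes the assignment look anticipated everywhere, so LCM is free to place it early.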
This helps the BPI in a measurable way (IIRC it was 2-3%). It probably
helps some of the SiFive designs, but I've been told they still benefit
from the longer sequence of shifts & adds, hoisting just isn't enough
for those designs. The Ventana design basically doesn't care where the
VXRM assignment is. Point is we may want to have a tuning knob for the
patterns which need VXRM (vaadd[u], vasub[u]) at some point in the near
future.
Bootstrapped and regression tested on riscv64 and regression tested on
riscv32-elf and riscv64-elf. We've been using this internally for a
while on spec as well. Obviously I'll wait for the pre-commit
tester to do its thing.
gcc/
* config/riscv/riscv.cc (singleton_vxrm_need): New function.
(riscv_mode_needed): See if there is a singleton need and if so,
claim it happens on the first insn in the chain.
|
|
gcc/ChangeLog:
* config.gcc: Add riscv_cmo.h.
* config/riscv/riscv_cmo.h: New file.
|
|
This patch implements MASK_LEN_STRIDED_LOAD{STORE} in
the RISC-V backend by leveraging the vector strided load/store insn.
For example:
void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
for (int i = 0; i < n; i++)
a[i*stride] = b[i*stride] + 100;
}
Before this patch:
vsetvli a5,a3,e32,m1,ta,ma
vluxei64.v v1,(a1),v4
mul a4,a2,a5
sub a3,a3,a5
vadd.vv v1,v1,v2
vsuxei64.v v1,(a0),v4
add a1,a1,a4
add a0,a0,a4
After this patch:
vsetvli a5,a3,e32,m1,ta,ma
vlse32.v v1,0(a1),a2
mul a4,a2,a5
sub a3,a3,a5
vadd.vv v1,v1,v2
vsse32.v v1,0(a0),a2
add a1,a1,a4
add a0,a0,a4
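For readers less familiar with the strided instructions, a scalar model of what a 32-bit strided load and store do is sketched below. The function names are made up and masking is ignored; only the addressing (element i at base + i * stride in bytes) is shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a strided 32-bit load: element i is read from
   BASE + i * STRIDE_BYTES.  */
static void
strided_load32 (int32_t *dst, const void *base, ptrdiff_t stride_bytes,
                size_t vl)
{
  for (size_t i = 0; i < vl; i++)
    memcpy (&dst[i], (const char *) base + (ptrdiff_t) i * stride_bytes,
            sizeof (int32_t));
}

/* Scalar model of a strided 32-bit store: element i is written to
   BASE + i * STRIDE_BYTES.  */
static void
strided_store32 (void *base, const int32_t *src, ptrdiff_t stride_bytes,
                 size_t vl)
{
  for (size_t i = 0; i < vl; i++)
    memcpy ((char *) base + (ptrdiff_t) i * stride_bytes, &src[i],
            sizeof (int32_t));
}
```

Because the stride is a single scalar, no index vector has to be built or kept live, which is the difference from the indexed forms in the "before" listing.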
The below test suites are passed for this patch:
* The riscv full regression test.
gcc/ChangeLog:
* config/riscv/autovec.md (mask_len_strided_load_<mode>): Add
new pattern for MASK_LEN_STRIDED_LOAD.
(mask_len_strided_store_<mode>): Ditto but for store.
* config/riscv/riscv-protos.h (expand_strided_load): Add new
func decl to expand strided load.
(expand_strided_store): Ditto but for store.
* config/riscv/riscv-v.cc (expand_strided_load): Add new
func impl to expand strided load.
(expand_strided_store): Ditto but for store.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
|
|
The construct used for initializing the code alignments in a recent change is
causing bootstrap problems on riscv64 as seen in the referenced bugzilla.
This patch adjusts the initializer by pushing the NULL down into each uarch
clause. Bootstrapped on riscv64, regression test in flight, but given
bootstrap is broken it seemed advisable to move this forward now.
I'm so much looking forward to the day when we have performant hardware for
bootstrap testing... Sigh.
Anyway, bootstrapped and installing on the trunk.
PR target/117316
gcc/
* config/riscv/riscv.cc (riscv_tune_param): Drop initializer.
(*_tune_info): Add initializers for code alignments.
|
|
This patch fixes following ICE:
test.c: In function 'func':
test.c:37:24: internal compiler error: Segmentation fault
37 | vfloat16mf2_t vc = __riscv_vlmul_trunc_v_f16m1_f16mf2(vb);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The root cause is that vlmul_trunc has a null return value.
gimple_call <__riscv_vlmul_trunc_v_f16m1_f16mf2, NULL, vb_13>
^^^
Passed the rv64gcv_zvfh regression test.
Signed-off-by: Li Xu <xuli1@eswincomputing.com>
PR target/117286
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc: Do not expand NULL return.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr117286.c: New test.
|
|
I've tried to build stage3 with
-Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank
added to STRICT_WARN and that expectably resulted in about
2744 unique trailing whitespace warnings and 124837 leading whitespace
warnings when excluding *.md files (which obviously is in big part a
generator issue). Others from that are generator related, I think those
need to be solved later.
The following patch just fixes up the easy case (trailing whitespace),
which could be easily automated:
for i in `find . -name \*.h -o -name \*.cc -o -name \*.c | xargs grep -l '[ ]$' | grep -v testsuite/`; do sed -i -e 's/[ ]*$//' $i; done
I've excluded files which I knew are obviously generated or go FE.
Is there anything else we'd want to avoid the changes?
Due to patch size, I've split it between gcc/ part (this patch)
and rest (include/, libiberty/, libgcc/, libcpp/, libstdc++-v3/).
2024-10-24 Jakub Jelinek <jakub@redhat.com>
gcc/
* lra-assigns.cc: Remove trailing whitespace.
* symtab.cc: Likewise.
* stmt.cc: Likewise.
* cgraphbuild.cc: Likewise.
* cfgcleanup.cc: Likewise.
* loop-init.cc: Likewise.
* df-problems.cc: Likewise.
* diagnostic-macro-unwinding.cc: Likewise.
* langhooks.h: Likewise.
* except.cc: Likewise.
* tree-vect-loop.cc: Likewise.
* coverage.cc: Likewise.
* hash-table.cc: Likewise.
* ggc-page.cc: Likewise.
* gimple-ssa-strength-reduction.cc: Likewise.
* tree-parloops.cc: Likewise.
* internal-fn.cc: Likewise.
* ipa-split.cc: Likewise.
* calls.cc: Likewise.
* reorg.cc: Likewise.
* sbitmap.h: Likewise.
* omp-offload.cc: Likewise.
* cfgrtl.cc: Likewise.
* reginfo.cc: Likewise.
* gengtype.h: Likewise.
* omp-general.h: Likewise.
* ipa-comdats.cc: Likewise.
* gimple-range-edge.h: Likewise.
* tree-ssa-structalias.cc: Likewise.
* target.def: Likewise.
* basic-block.h: Likewise.
* graphite-isl-ast-to-gimple.cc: Likewise.
* auto-profile.cc: Likewise.
* optabs.cc: Likewise.
* gengtype-lex.l: Likewise.
* optabs.def: Likewise.
* ira-build.cc: Likewise.
* ira.cc: Likewise.
* function.h: Likewise.
* tree-ssa-propagate.cc: Likewise.
* gcov-io.cc: Likewise.
* builtin-types.def: Likewise.
* ddg.cc: Likewise.
* lra-spills.cc: Likewise.
* cfg.cc: Likewise.
* bitmap.cc: Likewise.
* gimple-range-gori.h: Likewise.
* tree-ssa-loop-im.cc: Likewise.
* cfghooks.h: Likewise.
* genmatch.cc: Likewise.
* explow.cc: Likewise.
* lto-streamer-in.cc: Likewise.
* graphite-scop-detection.cc: Likewise.
* ipa-prop.cc: Likewise.
* gcc.cc: Likewise.
* vec.h: Likewise.
* cfgexpand.cc: Likewise.
* config/alpha/vms.h: Likewise.
* config/alpha/alpha.cc: Likewise.
* config/alpha/driver-alpha.cc: Likewise.
* config/alpha/elf.h: Likewise.
* config/iq2000/iq2000.h: Likewise.
* config/iq2000/iq2000.cc: Likewise.
* config/pa/pa-64.h: Likewise.
* config/pa/som.h: Likewise.
* config/pa/pa.cc: Likewise.
* config/pa/pa.h: Likewise.
* config/pa/pa32-regs.h: Likewise.
* config/c6x/c6x.cc: Likewise.
* config/openbsd-stdint.h: Likewise.
* config/elfos.h: Likewise.
* config/lm32/lm32.cc: Likewise.
* config/lm32/lm32.h: Likewise.
* config/lm32/lm32-protos.h: Likewise.
* config/darwin-c.cc: Likewise.
* config/rx/rx.cc: Likewise.
* config/host-darwin.h: Likewise.
* config/netbsd.h: Likewise.
* config/ia64/ia64.cc: Likewise.
* config/ia64/freebsd.h: Likewise.
* config/avr/avr-c.cc: Likewise.
* config/avr/avr.cc: Likewise.
* config/avr/avr-arch.h: Likewise.
* config/avr/avr.h: Likewise.
* config/avr/stdfix.h: Likewise.
* config/avr/gen-avr-mmcu-specs.cc: Likewise.
* config/avr/avr-log.cc: Likewise.
* config/avr/elf.h: Likewise.
* config/avr/gen-avr-mmcu-texi.cc: Likewise.
* config/avr/avr-devices.cc: Likewise.
* config/nvptx/nvptx.cc: Likewise.
* config/vx-common.h: Likewise.
* config/sol2.cc: Likewise.
* config/rl78/rl78.cc: Likewise.
* config/cris/cris.cc: Likewise.
* config/arm/symbian.h: Likewise.
* config/arm/unknown-elf.h: Likewise.
* config/arm/linux-eabi.h: Likewise.
* config/arm/arm.cc: Likewise.
* config/arm/arm-mve-builtins.h: Likewise.
* config/arm/bpabi.h: Likewise.
* config/arm/vxworks.h: Likewise.
* config/arm/arm.h: Likewise.
* config/arm/aout.h: Likewise.
* config/arm/elf.h: Likewise.
* config/host-linux.cc: Likewise.
* config/sh/sh_treg_combine.cc: Likewise.
* config/sh/vxworks.h: Likewise.
* config/sh/elf.h: Likewise.
* config/sh/netbsd-elf.h: Likewise.
* config/sh/sh.cc: Likewise.
* config/sh/embed-elf.h: Likewise.
* config/sh/sh.h: Likewise.
* config/darwin-driver.cc: Likewise.
* config/m32c/m32c.cc: Likewise.
* config/frv/frv.cc: Likewise.
* config/openbsd.h: Likewise.
* config/aarch64/aarch64-protos.h: Likewise.
* config/aarch64/aarch64-builtins.cc: Likewise.
* config/aarch64/aarch64-cost-tables.h: Likewise.
* config/aarch64/aarch64.cc: Likewise.
* config/bfin/bfin.cc: Likewise.
* config/bfin/bfin.h: Likewise.
* config/bfin/bfin-protos.h: Likewise.
* config/i386/gmm_malloc.h: Likewise.
* config/i386/djgpp.h: Likewise.
* config/i386/sol2.h: Likewise.
* config/i386/stringop.def: Likewise.
* config/i386/i386-features.cc: Likewise.
* config/i386/openbsdelf.h: Likewise.
* config/i386/cpuid.h: Likewise.
* config/i386/i386.h: Likewise.
* config/i386/smmintrin.h: Likewise.
* config/i386/avx10_2-512convertintrin.h: Likewise.
* config/i386/i386-options.cc: Likewise.
* config/i386/i386-opts.h: Likewise.
* config/i386/i386-expand.cc: Likewise.
* config/i386/avx512dqintrin.h: Likewise.
* config/i386/wmmintrin.h: Likewise.
* config/i386/gnu-user.h: Likewise.
* config/i386/host-mingw32.cc: Likewise.
* config/i386/avx10_2bf16intrin.h: Likewise.
* config/i386/cygwin.h: Likewise.
* config/i386/driver-i386.cc: Likewise.
* config/i386/biarch64.h: Likewise.
* config/i386/host-cygwin.cc: Likewise.
* config/i386/cygming.h: Likewise.
* config/i386/i386-builtins.cc: Likewise.
* config/i386/avx10_2convertintrin.h: Likewise.
* config/i386/i386.cc: Likewise.
* config/i386/gas.h: Likewise.
* config/i386/freebsd.h: Likewise.
* config/mingw/winnt-cxx.cc: Likewise.
* config/mingw/winnt.cc: Likewise.
* config/h8300/h8300.cc: Likewise.
* config/host-solaris.cc: Likewise.
* config/m32r/m32r.h: Likewise.
* config/m32r/m32r.cc: Likewise.
* config/darwin.h: Likewise.
* config/sparc/linux64.h: Likewise.
* config/sparc/sparc-protos.h: Likewise.
* config/sparc/sysv4.h: Likewise.
* config/sparc/sparc.h: Likewise.
* config/sparc/linux.h: Likewise.
* config/sparc/freebsd.h: Likewise.
* config/sparc/sparc.cc: Likewise.
* config/gcn/gcn-run.cc: Likewise.
* config/gcn/gcn.cc: Likewise.
* config/gcn/gcn-tree.cc: Likewise.
* config/kopensolaris-gnu.h: Likewise.
* config/nios2/nios2.h: Likewise.
* config/nios2/elf.h: Likewise.
* config/nios2/nios2.cc: Likewise.
* config/host-netbsd.cc: Likewise.
* config/rtems.h: Likewise.
* config/pdp11/pdp11.cc: Likewise.
* config/pdp11/pdp11.h: Likewise.
* config/mn10300/mn10300.cc: Likewise.
* config/mn10300/linux.h: Likewise.
* config/moxie/moxie.h: Likewise.
* config/moxie/moxie.cc: Likewise.
* config/rs6000/aix71.h: Likewise.
* config/rs6000/vec_types.h: Likewise.
* config/rs6000/xcoff.h: Likewise.
* config/rs6000/rs6000.cc: Likewise.
* config/rs6000/rs6000-internal.h: Likewise.
* config/rs6000/rs6000-p8swap.cc: Likewise.
* config/rs6000/rs6000-c.cc: Likewise.
* config/rs6000/aix.h: Likewise.
* config/rs6000/rs6000-logue.cc: Likewise.
* config/rs6000/rs6000-string.cc: Likewise.
* config/rs6000/rs6000-call.cc: Likewise.
* config/rs6000/ppu_intrinsics.h: Likewise.
* config/rs6000/altivec.h: Likewise.
* config/rs6000/darwin.h: Likewise.
* config/rs6000/host-darwin.cc: Likewise.
* config/rs6000/freebsd64.h: Likewise.
* config/rs6000/spu2vmx.h: Likewise.
* config/rs6000/linux.h: Likewise.
* config/rs6000/si2vmx.h: Likewise.
* config/rs6000/driver-rs6000.cc: Likewise.
* config/rs6000/freebsd.h: Likewise.
* config/vxworksae.h: Likewise.
* config/mips/frame-header-opt.cc: Likewise.
* config/mips/mips.h: Likewise.
* config/mips/mips.cc: Likewise.
* config/mips/sde.h: Likewise.
* config/darwin-protos.h: Likewise.
* config/mcore/mcore-elf.h: Likewise.
* config/mcore/mcore.h: Likewise.
* config/mcore/mcore.cc: Likewise.
* config/epiphany/epiphany.cc: Likewise.
* config/fr30/fr30.h: Likewise.
* config/fr30/fr30.cc: Likewise.
* config/riscv/riscv-vector-builtins-shapes.cc: Likewise.
* config/riscv/riscv-vector-builtins-bases.cc: Likewise.
* config/visium/visium.h: Likewise.
* config/mmix/mmix.cc: Likewise.
* config/v850/v850.cc: Likewise.
* config/v850/v850-c.cc: Likewise.
* config/v850/v850.h: Likewise.
* config/stormy16/stormy16.cc: Likewise.
* config/stormy16/stormy16-protos.h: Likewise.
* config/stormy16/stormy16.h: Likewise.
* config/arc/arc.cc: Likewise.
* config/vxworks.cc: Likewise.
* config/microblaze/microblaze-c.cc: Likewise.
* config/microblaze/microblaze-protos.h: Likewise.
* config/microblaze/microblaze.h: Likewise.
* config/microblaze/microblaze.cc: Likewise.
* config/freebsd-spec.h: Likewise.
* config/m68k/m68kelf.h: Likewise.
* config/m68k/m68k.cc: Likewise.
* config/m68k/netbsd-elf.h: Likewise.
* config/m68k/linux.h: Likewise.
* config/freebsd.h: Likewise.
* config/host-openbsd.cc: Likewise.
* regcprop.cc: Likewise.
* dumpfile.cc: Likewise.
* combine.cc: Likewise.
* tree-ssa-forwprop.cc: Likewise.
* ipa-profile.cc: Likewise.
* hw-doloop.cc: Likewise.
* opts.cc: Likewise.
* gcc-ar.cc: Likewise.
* tree-cfg.cc: Likewise.
* incpath.cc: Likewise.
* tree-ssa-sccvn.cc: Likewise.
* function.cc: Likewise.
* genattrtab.cc: Likewise.
* rtl.def: Likewise.
* genchecksum.cc: Likewise.
* profile.cc: Likewise.
* df-core.cc: Likewise.
* tree-pretty-print.cc: Likewise.
* tree.h: Likewise.
* plugin.cc: Likewise.
* tree-ssa-loop-ch.cc: Likewise.
* emit-rtl.cc: Likewise.
* haifa-sched.cc: Likewise.
* gimple-range-edge.cc: Likewise.
* range-op.cc: Likewise.
* tree-ssa-ccp.cc: Likewise.
* dwarf2cfi.cc: Likewise.
* recog.cc: Likewise.
* vtable-verify.cc: Likewise.
* system.h: Likewise.
* regrename.cc: Likewise.
* tree-ssa-dom.cc: Likewise.
* loop-unroll.cc: Likewise.
* lra-constraints.cc: Likewise.
* pretty-print.cc: Likewise.
* ifcvt.cc: Likewise.
* ipa.cc: Likewise.
* alloc-pool.h: Likewise.
* collect2.cc: Likewise.
* pointer-query.cc: Likewise.
* cfgloop.cc: Likewise.
* toplev.cc: Likewise.
* sese.cc: Likewise.
* gengtype.cc: Likewise.
* gimplify-me.cc: Likewise.
* double-int.cc: Likewise.
* bb-reorder.cc: Likewise.
* dwarf2out.cc: Likewise.
* tree-ssa-loop-ivcanon.cc: Likewise.
* tree-ssa-reassoc.cc: Likewise.
* cgraph.cc: Likewise.
* sel-sched.cc: Likewise.
* attribs.cc: Likewise.
* expr.cc: Likewise.
* tree-ssa-scopedtables.h: Likewise.
* gimple-range-cache.cc: Likewise.
* ipa-pure-const.cc: Likewise.
* tree-inline.cc: Likewise.
* genhooks.cc: Likewise.
* gimple-range-phi.h: Likewise.
* shrink-wrap.cc: Likewise.
* tree.cc: Likewise.
* gimple.cc: Likewise.
* backend.h: Likewise.
* opts-common.cc: Likewise.
* cfg-flags.def: Likewise.
* gcse-common.cc: Likewise.
* tree-ssa-scopedtables.cc: Likewise.
* ccmp.cc: Likewise.
* builtins.def: Likewise.
* builtin-attrs.def: Likewise.
* postreload.cc: Likewise.
* sched-deps.cc: Likewise.
* ipa-inline-transform.cc: Likewise.
* tree-vect-generic.cc: Likewise.
* ipa-polymorphic-call.cc: Likewise.
* builtins.cc: Likewise.
* sel-sched-ir.cc: Likewise.
* trans-mem.cc: Likewise.
* ipa-visibility.cc: Likewise.
* cgraph.h: Likewise.
* tree-ssa-phiopt.cc: Likewise.
* genopinit.cc: Likewise.
* ipa-inline.cc: Likewise.
* omp-low.cc: Likewise.
* ipa-utils.cc: Likewise.
* tree-ssa-math-opts.cc: Likewise.
* tree-ssa-ifcombine.cc: Likewise.
* gimple-range.cc: Likewise.
* ipa-fnsummary.cc: Likewise.
* ira-color.cc: Likewise.
* value-prof.cc: Likewise.
* varasm.cc: Likewise.
* ipa-icf.cc: Likewise.
* ira-emit.cc: Likewise.
* lto-streamer.h: Likewise.
* lto-wrapper.cc: Likewise.
* regs.h: Likewise.
* gengtype-parse.cc: Likewise.
* alias.cc: Likewise.
* lto-streamer.cc: Likewise.
* real.h: Likewise.
* wide-int.h: Likewise.
* targhooks.cc: Likewise.
* gimple-ssa-warn-access.cc: Likewise.
* real.cc: Likewise.
* ipa-reference.cc: Likewise.
* bitmap.h: Likewise.
* ginclude/float.h: Likewise.
* ginclude/stddef.h: Likewise.
* ginclude/stdarg.h: Likewise.
* ginclude/stdatomic.h: Likewise.
* optabs.h: Likewise.
* sel-sched-ir.h: Likewise.
* convert.cc: Likewise.
* cgraphunit.cc: Likewise.
* lra-remat.cc: Likewise.
* tree-if-conv.cc: Likewise.
* gcov-dump.cc: Likewise.
* tree-predcom.cc: Likewise.
* dominance.cc: Likewise.
* gimple-range-cache.h: Likewise.
* ipa-devirt.cc: Likewise.
* rtl.h: Likewise.
* ubsan.cc: Likewise.
* tree-ssa.cc: Likewise.
* ssa.h: Likewise.
* cse.cc: Likewise.
* jump.cc: Likewise.
* hwint.h: Likewise.
* caller-save.cc: Likewise.
* coretypes.h: Likewise.
* ipa-fnsummary.h: Likewise.
* tree-ssa-strlen.cc: Likewise.
* modulo-sched.cc: Likewise.
* cgraphclones.cc: Likewise.
* lto-cgraph.cc: Likewise.
* hw-doloop.h: Likewise.
* data-streamer.h: Likewise.
* compare-elim.cc: Likewise.
* profile-count.h: Likewise.
* tree-vect-loop-manip.cc: Likewise.
* ree.cc: Likewise.
* reload.cc: Likewise.
* tree-ssa-loop-split.cc: Likewise.
* tree-into-ssa.cc: Likewise.
* gcse.cc: Likewise.
* cfgloopmanip.cc: Likewise.
* df.h: Likewise.
* fold-const.cc: Likewise.
* wide-int.cc: Likewise.
* gengtype-state.cc: Likewise.
* sanitizer.def: Likewise.
* tree-ssa-sink.cc: Likewise.
* target-hooks-macros.h: Likewise.
* tree-ssa-pre.cc: Likewise.
* gimple-pretty-print.cc: Likewise.
* ipa-utils.h: Likewise.
* tree-outof-ssa.cc: Likewise.
* tree-ssa-coalesce.cc: Likewise.
* gimple-match.h: Likewise.
* tree-ssa-loop-niter.cc: Likewise.
* tree-loop-distribution.cc: Likewise.
* tree-emutls.cc: Likewise.
* tree-eh.cc: Likewise.
* varpool.cc: Likewise.
* ssa-iterators.h: Likewise.
* asan.cc: Likewise.
* reload1.cc: Likewise.
* cfgloopanal.cc: Likewise.
* tree-vectorizer.cc: Likewise.
* simplify-rtx.cc: Likewise.
* opts-global.cc: Likewise.
* gimple-ssa-store-merging.cc: Likewise.
* expmed.cc: Likewise.
* tree-ssa-loop-prefetch.cc: Likewise.
* tree-ssa-dse.h: Likewise.
* tree-vect-stmts.cc: Likewise.
* gimple-fold.cc: Likewise.
* lra-coalesce.cc: Likewise.
* data-streamer-out.cc: Likewise.
* diagnostic.cc: Likewise.
* tree-ssa-alias.cc: Likewise.
* tree-vect-patterns.cc: Likewise.
* common/common-target.def: Likewise.
* common/config/rx/rx-common.cc: Likewise.
* common/config/msp430/msp430-common.cc: Likewise.
* common/config/avr/avr-common.cc: Likewise.
* common/config/i386/i386-common.cc: Likewise.
* common/config/pdp11/pdp11-common.cc: Likewise.
* common/config/rs6000/rs6000-common.cc: Likewise.
* common/config/mcore/mcore-common.cc: Likewise.
* graphite.cc: Likewise.
* gimple-low.cc: Likewise.
* genmodes.cc: Likewise.
* gimple-loop-jam.cc: Likewise.
* lto-streamer-out.cc: Likewise.
* predict.cc: Likewise.
* omp-expand.cc: Likewise.
* gimple-array-bounds.cc: Likewise.
* predict.def: Likewise.
* opts.h: Likewise.
* tree-stdarg.cc: Likewise.
* gimplify.cc: Likewise.
* ira-lives.cc: Likewise.
* loop-doloop.cc: Likewise.
* lra.cc: Likewise.
* gimple-iterator.h: Likewise.
* tree-sra.cc: Likewise.
gcc/fortran/
* trans-openmp.cc: Remove trailing whitespace.
* trans-common.cc: Likewise.
* match.h: Likewise.
* scanner.cc: Likewise.
* gfortranspec.cc: Likewise.
* io.cc: Likewise.
* iso-c-binding.def: Likewise.
* iso-fortran-env.def: Likewise.
* types.def: Likewise.
* openmp.cc: Likewise.
* f95-lang.cc: Likewise.
gcc/analyzer/
* state-purge.cc: Remove trailing whitespace.
* region-model.h: Likewise.
* region-model.cc: Likewise.
* program-point.cc: Likewise.
* exploded-graph.h: Likewise.
* program-state.cc: Likewise.
* supergraph.cc: Likewise.
gcc/c-family/
* c-ubsan.cc: Remove trailing whitespace.
* stub-objc.cc: Likewise.
* c-pragma.cc: Likewise.
* c-ppoutput.cc: Likewise.
* c-indentation.cc: Likewise.
* c-ada-spec.cc: Likewise.
* c-opts.cc: Likewise.
* c-common.cc: Likewise.
* c-format.cc: Likewise.
* c-omp.cc: Likewise.
* c-objc.h: Likewise.
* c-cppbuiltin.cc: Likewise.
* c-attribs.cc: Likewise.
* c-target.def: Likewise.
* c-common.h: Likewise.
gcc/c/
* c-typeck.cc: Remove trailing whitespace.
* gimple-parser.cc: Likewise.
* c-parser.cc: Likewise.
* c-decl.cc: Likewise.
gcc/cp/
* vtable-class-hierarchy.cc: Remove trailing whitespace.
* typeck2.cc: Likewise.
* decl.cc: Likewise.
* init.cc: Likewise.
* semantics.cc: Likewise.
* module.cc: Likewise.
* rtti.cc: Likewise.
* cxx-pretty-print.cc: Likewise.
* cvt.cc: Likewise.
* mangle.cc: Likewise.
* name-lookup.h: Likewise.
* coroutines.cc: Likewise.
* error.cc: Likewise.
* lambda.cc: Likewise.
* tree.cc: Likewise.
* g++spec.cc: Likewise.
* decl2.cc: Likewise.
* cp-tree.h: Likewise.
* parser.cc: Likewise.
* pt.cc: Likewise.
* call.cc: Likewise.
* lex.cc: Likewise.
* cp-lang.cc: Likewise.
* cp-tree.def: Likewise.
* constexpr.cc: Likewise.
* typeck.cc: Likewise.
* name-lookup.cc: Likewise.
* optimize.cc: Likewise.
* search.cc: Likewise.
* mapper-client.cc: Likewise.
* ptree.cc: Likewise.
* class.cc: Likewise.
gcc/jit/
* docs/examples/tut04-toyvm/toyvm.cc: Remove trailing whitespace.
gcc/lto/
* lto-object.cc: Remove trailing whitespace.
* lto-symtab.cc: Likewise.
* lto-partition.cc: Likewise.
* lang-specs.h: Likewise.
* lto-lang.cc: Likewise.
gcc/objc/
* objc-encoding.cc: Remove trailing whitespace.
* objc-map.h: Likewise.
* objc-next-runtime-abi-01.cc: Likewise.
* objc-act.cc: Likewise.
* objc-map.cc: Likewise.
gcc/objcp/
* objcp-decl.cc: Remove trailing whitespace.
* objcp-lang.cc: Likewise.
* objcp-decl.h: Likewise.
gcc/rust/
* util/optional.h: Remove trailing whitespace.
* util/expected.h: Likewise.
* util/rust-unicode-data.h: Likewise.
gcc/m2/
* mc-boot/GFpuIO.cc: Remove trailing whitespace.
* mc-boot/GFIO.cc: Likewise.
* mc-boot/GFormatStrings.cc: Likewise.
* mc-boot/GCmdArgs.cc: Likewise.
* mc-boot/GDebug.h: Likewise.
* mc-boot/GM2Dependent.cc: Likewise.
* mc-boot/GRTint.cc: Likewise.
* mc-boot/GDebug.cc: Likewise.
* mc-boot/GmcError.cc: Likewise.
* mc-boot/Gmcp4.cc: Likewise.
* mc-boot/GM2RTS.cc: Likewise.
* mc-boot/GIO.cc: Likewise.
* mc-boot/Gmcp5.cc: Likewise.
* mc-boot/GDynamicStrings.cc: Likewise.
* mc-boot/Gmcp1.cc: Likewise.
* mc-boot/GFormatStrings.h: Likewise.
* mc-boot/Gmcp2.cc: Likewise.
* mc-boot/Gmcp3.cc: Likewise.
* pge-boot/GFIO.cc: Likewise.
* pge-boot/GDebug.h: Likewise.
* pge-boot/GM2Dependent.cc: Likewise.
* pge-boot/GDebug.cc: Likewise.
* pge-boot/GM2RTS.cc: Likewise.
* pge-boot/GSymbolKey.cc: Likewise.
* pge-boot/GIO.cc: Likewise.
* pge-boot/GIndexing.cc: Likewise.
* pge-boot/GDynamicStrings.cc: Likewise.
* pge-boot/GFormatStrings.h: Likewise.
gcc/go/
* go-gcc.cc: Remove trailing whitespace.
* gospec.cc: Likewise.
|
|
My forthcoming patches for PR other/116613 make much more use of
cloning of pretty_printers than before, so it makes sense as a
preliminary patch for the result of pretty_printer::clone to be a
std::unique_ptr, rather than add more manual uses of "delete".
On doing so, I noticed various other places where naked new/delete is
used for run-time configuration of diagnostics:
* the output format (text vs SARIF)
* client data hooks
* the option manager
* the URLifier
Hence this patch also makes use of std::unique_ptr and ::make_unique for
managing such client policy classes, and also for diagnostic_buffer's
per-format implementations.
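The ownership change described above can be sketched roughly like this. The class names (printer_sketch, c_printer_sketch) are hypothetical and much simpler than GCC's pretty_printer hierarchy, and the sketch uses std::make_unique from C++14 where the patch uses GCC's own ::make_unique helper.

```cpp
#include <memory>
#include <string>

// Hypothetical printer hierarchy; the names are illustrative, not GCC's.
class printer_sketch {
public:
  virtual ~printer_sketch() = default;
  // The clone result is owned by the caller, so no manual delete is needed.
  virtual std::unique_ptr<printer_sketch> clone() const {
    return std::make_unique<printer_sketch>(*this);
  }
  virtual std::string name() const { return "base"; }
};

class c_printer_sketch : public printer_sketch {
public:
  std::unique_ptr<printer_sketch> clone() const override {
    return std::make_unique<c_printer_sketch>(*this);
  }
  std::string name() const override { return "c"; }
};

// The clone is destroyed automatically when COPY goes out of scope.
std::string name_of_clone(const printer_sketch &pp) {
  std::unique_ptr<printer_sketch> copy = pp.clone();
  return copy->name();
}
```

The same idea applies to setters that take ownership of a policy object: accepting a std::unique_ptr parameter makes the transfer explicit at the call site.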
Unfortunately we can't directly include <memory> in our internal headers;
instead, any of our TUs that make use of std::unique_ptr must #define
INCLUDE_MEMORY before including system.h.
Hence the bulk of this patch is taken up with adding a define of
INCLUDE_MEMORY to hundreds of source files: everything that includes
diagnostic.h or pretty-print.h (and thus, transitively, includers of
headers such as lto-wrapper.h, c-tree.h, cp-tree.h and rtl-ssa.h).
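The opt-in pattern can be illustrated in a single file. This is a sketch only: the three sections stand in for a translation unit, system.h and diagnostic.h respectively, and the #error text and the urlifier_sketch type are made up for illustration rather than copied from GCC.

```cpp
// Stand-in for the top of a translation unit: opt in first.
#define INCLUDE_MEMORY

// Stand-in for system.h: pull in <memory> only when requested.
#ifdef INCLUDE_MEMORY
# include <memory>
#endif

// Stand-in for a header that needs std::unique_ptr: refuse to compile
// if the includer forgot to opt in.  The message text is illustrative.
#ifndef INCLUDE_MEMORY
# error "INCLUDE_MEMORY must be defined before including system.h"
#endif

// Hypothetical policy class managed through std::unique_ptr.
struct urlifier_sketch {
  int id;
};

inline std::unique_ptr<urlifier_sketch> make_urlifier_sketch(int id) {
  return std::unique_ptr<urlifier_sketch>(new urlifier_sketch{id});
}
```

Removing the first #define turns a confusing "unique_ptr is not a member of std" style failure into one clear diagnostic, which is the motivation for having the header complain.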
Thanks to Gaius Mulley for the parts of the patch that regenerated the
m2 files.
gcc/ada/ChangeLog:
PR other/116613
* gcc-interface/misc.cc: Add #define INCLUDE_MEMORY.
* gcc-interface/trans.cc: Likewise.
* gcc-interface/utils.cc: Likewise.
gcc/analyzer/ChangeLog:
PR other/116613
* analyzer-logging.cc: Add #define INCLUDE_MEMORY.
(logger::logger): Update for m_pp becoming a unique_ptr.
(logger::~logger): Likewise.
(logger::log_va_partial): Likewise.
(logger::end_log_line): Likewise.
* analyzer-logging.h (logger::get_printer): Likewise.
(logger::m_pp): Convert to a unique_ptr.
* analyzer.cc (make_label_text): Use
diagnostic_context::clone_printer and use unique_ptr.
(make_label_text_n): Likewise.
* bar-chart.cc: Add #define INCLUDE_MEMORY.
* pending-diagnostic.cc (evdesc::event_desc::formatted_print):
Use diagnostic_context::clone_printer and use unique_ptr.
* sm-malloc.cc (sufficiently_similar_p): Likewise.
* supergraph.cc (supergraph::dump_dot_to_file): Likewise.
gcc/c-family/ChangeLog:
PR other/116613
* c-ada-spec.cc: Add #define INCLUDE_MEMORY.
* c-attribs.cc: Likewise.
* c-common.cc: Likewise.
* c-format.cc: Likewise.
* c-gimplify.cc: Likewise.
* c-indentation.cc: Likewise.
* c-opts.cc: Likewise.
* c-pch.cc: Likewise.
* c-pragma.cc: Likewise.
* c-pretty-print.cc: Likewise. Add #include "make-unique.h".
(c_pretty_printer::clone): Use std::unique_ptr and ::make_unique.
* c-pretty-print.h (c_pretty_printer::clone): Use std::unique_ptr.
* c-type-mismatch.cc: Add #define INCLUDE_MEMORY.
* c-warn.cc: Likewise.
gcc/c/ChangeLog:
PR other/116613
* c-aux-info.cc: Add #define INCLUDE_MEMORY.
* c-convert.cc: Likewise.
* c-errors.cc: Likewise.
* c-fold.cc: Likewise.
* c-lang.cc: Likewise.
* c-objc-common.cc: Likewise.
(pp_markup::element_quoted_type::print_type): Use unique_ptr.
* c-typeck.cc: Add #define INCLUDE_MEMORY.
* gimple-parser.cc: Likewise.
gcc/cp/ChangeLog:
PR other/116613
* call.cc: Add #define INCLUDE_MEMORY.
* class.cc: Likewise.
* constexpr.cc: Likewise.
* constraint.cc: Likewise.
* contracts.cc: Likewise.
* coroutines.cc: Likewise.
* cp-gimplify.cc: Likewise.
* cp-lang.cc: Likewise.
* cp-objcp-common.cc: Likewise.
* cp-ubsan.cc: Likewise.
* cvt.cc: Likewise.
* cxx-pretty-print.cc: Likewise. Add #include "cp-tree.h".
(cxx_pretty_printer::clone): Use std::unique_ptr and
::make_unique.
* cxx-pretty-print.h (cxx_pretty_printer::clone): Use
std::unique_ptr.
* decl2.cc: Add #define INCLUDE_MEMORY.
* dump.cc: Likewise.
* except.cc: Likewise.
* expr.cc: Likewise.
* friend.cc: Likewise.
* init.cc: Likewise.
* lambda.cc: Likewise.
* logic.cc: Likewise.
* mangle.cc: Likewise.
* method.cc: Likewise.
* optimize.cc: Likewise.
* pt.cc: Likewise.
* ptree.cc: Likewise.
* rtti.cc: Likewise.
* search.cc: Likewise.
* semantics.cc: Likewise.
* tree.cc: Likewise.
* typeck.cc: Likewise.
* typeck2.cc: Likewise.
* vtable-class-hierarchy.cc: Likewise.
gcc/d/ChangeLog:
PR other/116613
* d-attribs.cc: Add #define INCLUDE_MEMORY.
* d-builtins.cc: Likewise.
* d-codegen.cc: Likewise.
* d-convert.cc: Likewise.
* d-diagnostic.cc: Likewise.
* d-frontend.cc: Likewise.
* d-lang.cc: Likewise.
* d-longdouble.cc: Likewise.
* d-target.cc: Likewise.
* decl.cc: Likewise.
* expr.cc: Likewise.
* intrinsics.cc: Likewise.
* modules.cc: Likewise.
* toir.cc: Likewise.
* typeinfo.cc: Likewise.
* types.cc: Likewise.
gcc/fortran/ChangeLog:
PR other/116613
* arith.cc: Add #define INCLUDE_MEMORY.
* array.cc: Likewise.
* bbt.cc: Likewise.
* check.cc: Likewise.
* class.cc: Likewise.
* constructor.cc: Likewise.
* convert.cc: Likewise.
* cpp.cc: Likewise.
* data.cc: Likewise.
* decl.cc: Likewise.
* dependency.cc: Likewise.
* dump-parse-tree.cc: Likewise.
* error.cc: Likewise.
* expr.cc: Likewise.
* f95-lang.cc: Likewise.
* frontend-passes.cc: Likewise.
* interface.cc: Likewise.
* intrinsic.cc: Likewise.
* io.cc: Likewise.
* iresolve.cc: Likewise.
* match.cc: Likewise.
* matchexp.cc: Likewise.
* misc.cc: Likewise.
* module.cc: Likewise.
* openmp.cc: Likewise.
* options.cc: Likewise.
* parse.cc: Likewise.
* primary.cc: Likewise.
* resolve.cc: Likewise.
* scanner.cc: Likewise.
* simplify.cc: Likewise.
* st.cc: Likewise.
* symbol.cc: Likewise.
* target-memory.cc: Likewise.
* trans-array.cc: Likewise.
* trans-common.cc: Likewise.
* trans-const.cc: Likewise.
* trans-decl.cc: Likewise.
* trans-expr.cc: Likewise.
* trans-intrinsic.cc: Likewise.
* trans-io.cc: Likewise.
* trans-openmp.cc: Likewise.
* trans-stmt.cc: Likewise.
* trans-types.cc: Likewise.
* trans.cc: Likewise.
gcc/go/ChangeLog:
PR other/116613
* go-backend.cc: Add #define INCLUDE_MEMORY.
* go-lang.cc: Likewise.
gcc/jit/ChangeLog:
PR other/116613
* dummy-frontend.cc: Add #define INCLUDE_MEMORY.
* jit-playback.cc: Likewise.
* jit-recording.cc: Likewise.
gcc/lto/ChangeLog:
PR other/116613
* lto-common.cc: Add #define INCLUDE_MEMORY.
* lto-dump.cc: Likewise.
* lto-partition.cc: Likewise.
* lto-symtab.cc: Likewise.
* lto.cc: Likewise.
gcc/m2/ChangeLog:
PR other/116613
* gm2-gcc/gcc-consolidation.h: Add #define INCLUDE_MEMORY.
* gm2-gcc/m2configure.cc: Likewise.
* mc-boot/GASCII.cc: Regenerate.
* mc-boot/GASCII.h: Ditto.
* mc-boot/GArgs.cc: Ditto.
* mc-boot/GArgs.h: Ditto.
* mc-boot/GAssertion.cc: Ditto.
* mc-boot/GAssertion.h: Ditto.
* mc-boot/GBreak.cc: Ditto.
* mc-boot/GBreak.h: Ditto.
* mc-boot/GCOROUTINES.h: Ditto.
* mc-boot/GCmdArgs.cc: Ditto.
* mc-boot/GCmdArgs.h: Ditto.
* mc-boot/GDebug.cc: Ditto.
* mc-boot/GDebug.h: Ditto.
* mc-boot/GDynamicStrings.cc: Ditto.
* mc-boot/GDynamicStrings.h: Ditto.
* mc-boot/GEnvironment.cc: Ditto.
* mc-boot/GEnvironment.h: Ditto.
* mc-boot/GFIO.cc: Ditto.
* mc-boot/GFIO.h: Ditto.
* mc-boot/GFormatStrings.cc: Ditto.
* mc-boot/GFormatStrings.h: Ditto.
* mc-boot/GFpuIO.cc: Ditto.
* mc-boot/GFpuIO.h: Ditto.
* mc-boot/GIO.cc: Ditto.
* mc-boot/GIO.h: Ditto.
* mc-boot/GIndexing.cc: Ditto.
* mc-boot/GIndexing.h: Ditto.
* mc-boot/GM2Dependent.cc: Ditto.
* mc-boot/GM2Dependent.h: Ditto.
* mc-boot/GM2EXCEPTION.cc: Ditto.
* mc-boot/GM2EXCEPTION.h: Ditto.
* mc-boot/GM2RTS.cc: Ditto.
* mc-boot/GM2RTS.h: Ditto.
* mc-boot/GMemUtils.cc: Ditto.
* mc-boot/GMemUtils.h: Ditto.
* mc-boot/GNumberIO.cc: Ditto.
* mc-boot/GNumberIO.h: Ditto.
* mc-boot/GPushBackInput.cc: Ditto.
* mc-boot/GPushBackInput.h: Ditto.
* mc-boot/GRTExceptions.cc: Ditto.
* mc-boot/GRTExceptions.h: Ditto.
* mc-boot/GRTco.h: Ditto.
* mc-boot/GRTentity.h: Ditto.
* mc-boot/GRTint.cc: Ditto.
* mc-boot/GRTint.h: Ditto.
* mc-boot/GSArgs.cc: Ditto.
* mc-boot/GSArgs.h: Ditto.
* mc-boot/GSFIO.cc: Ditto.
* mc-boot/GSFIO.h: Ditto.
* mc-boot/GSYSTEM.h: Ditto.
* mc-boot/GSelective.h: Ditto.
* mc-boot/GStdIO.cc: Ditto.
* mc-boot/GStdIO.h: Ditto.
* mc-boot/GStorage.cc: Ditto.
* mc-boot/GStorage.h: Ditto.
* mc-boot/GStrCase.cc: Ditto.
* mc-boot/GStrCase.h: Ditto.
* mc-boot/GStrIO.cc: Ditto.
* mc-boot/GStrIO.h: Ditto.
* mc-boot/GStrLib.cc: Ditto.
* mc-boot/GStrLib.h: Ditto.
* mc-boot/GStringConvert.cc: Ditto.
* mc-boot/GStringConvert.h: Ditto.
* mc-boot/GSysExceptions.h: Ditto.
* mc-boot/GSysStorage.cc: Ditto.
* mc-boot/GSysStorage.h: Ditto.
* mc-boot/GTimeString.cc: Ditto.
* mc-boot/GTimeString.h: Ditto.
* mc-boot/GUnixArgs.h: Ditto.
* mc-boot/Galists.cc: Ditto.
* mc-boot/Galists.h: Ditto.
* mc-boot/Gdecl.cc: Ditto.
* mc-boot/Gdecl.h: Ditto.
* mc-boot/Gdtoa.h: Ditto.
* mc-boot/Gerrno.h: Ditto.
* mc-boot/Gkeyc.cc: Ditto.
* mc-boot/Gkeyc.h: Ditto.
* mc-boot/Gldtoa.h: Ditto.
* mc-boot/Glibc.h: Ditto.
* mc-boot/Glibm.h: Ditto.
* mc-boot/Glists.cc: Ditto.
* mc-boot/Glists.h: Ditto.
* mc-boot/GmcComment.cc: Ditto.
* mc-boot/GmcComment.h: Ditto.
* mc-boot/GmcComp.cc: Ditto.
* mc-boot/GmcComp.h: Ditto.
* mc-boot/GmcDebug.cc: Ditto.
* mc-boot/GmcDebug.h: Ditto.
* mc-boot/GmcError.cc: Ditto.
* mc-boot/GmcError.h: Ditto.
* mc-boot/GmcFileName.cc: Ditto.
* mc-boot/GmcFileName.h: Ditto.
* mc-boot/GmcLexBuf.cc: Ditto.
* mc-boot/GmcLexBuf.h: Ditto.
* mc-boot/GmcMetaError.cc: Ditto.
* mc-boot/GmcMetaError.h: Ditto.
* mc-boot/GmcOptions.cc: Ditto.
* mc-boot/GmcOptions.h: Ditto.
* mc-boot/GmcPreprocess.cc: Ditto.
* mc-boot/GmcPreprocess.h: Ditto.
* mc-boot/GmcPretty.cc: Ditto.
* mc-boot/GmcPretty.h: Ditto.
* mc-boot/GmcPrintf.cc: Ditto.
* mc-boot/GmcPrintf.h: Ditto.
* mc-boot/GmcQuiet.cc: Ditto.
* mc-boot/GmcQuiet.h: Ditto.
* mc-boot/GmcReserved.cc: Ditto.
* mc-boot/GmcReserved.h: Ditto.
* mc-boot/GmcSearch.cc: Ditto.
* mc-boot/GmcSearch.h: Ditto.
* mc-boot/GmcStack.cc: Ditto.
* mc-boot/GmcStack.h: Ditto.
* mc-boot/GmcStream.cc: Ditto.
* mc-boot/GmcStream.h: Ditto.
* mc-boot/Gmcflex.h: Ditto.
* mc-boot/Gmcp1.cc: Ditto.
* mc-boot/Gmcp1.h: Ditto.
* mc-boot/Gmcp2.cc: Ditto.
* mc-boot/Gmcp2.h: Ditto.
* mc-boot/Gmcp3.cc: Ditto.
* mc-boot/Gmcp3.h: Ditto.
* mc-boot/Gmcp4.cc: Ditto.
* mc-boot/Gmcp4.h: Ditto.
* mc-boot/Gmcp5.cc: Ditto.
* mc-boot/Gmcp5.h: Ditto.
* mc-boot/GnameKey.cc: Ditto.
* mc-boot/GnameKey.h: Ditto.
* mc-boot/GsymbolKey.cc: Ditto.
* mc-boot/GsymbolKey.h: Ditto.
* mc-boot/Gtermios.h: Ditto.
* mc-boot/Gtop.cc: Ditto.
* mc-boot/Gvarargs.cc: Ditto.
* mc-boot/Gvarargs.h: Ditto.
* mc-boot/Gwlists.cc: Ditto.
* mc-boot/Gwlists.h: Ditto.
* mc-boot/Gwrapc.h: Ditto.
* mc/keyc.mod (checkGccConfigSystem): Add
#define INCLUDE_MEMORY.
* pge-boot/GASCII.cc: Regenerate.
* pge-boot/GASCII.h: Ditto.
* pge-boot/GArgs.cc: Ditto.
* pge-boot/GArgs.h: Ditto.
* pge-boot/GAssertion.cc: Ditto.
* pge-boot/GAssertion.h: Ditto.
* pge-boot/GBreak.h: Ditto.
* pge-boot/GCmdArgs.h: Ditto.
* pge-boot/GDebug.cc: Ditto.
* pge-boot/GDebug.h: Ditto.
* pge-boot/GDynamicStrings.cc: Ditto.
* pge-boot/GDynamicStrings.h: Ditto.
* pge-boot/GEnvironment.h: Ditto.
* pge-boot/GFIO.cc: Ditto.
* pge-boot/GFIO.h: Ditto.
* pge-boot/GFormatStrings.h: Ditto.
* pge-boot/GFpuIO.h: Ditto.
* pge-boot/GIO.cc: Ditto.
* pge-boot/GIO.h: Ditto.
* pge-boot/GIndexing.cc: Ditto.
* pge-boot/GIndexing.h: Ditto.
* pge-boot/GLists.cc: Ditto.
* pge-boot/GLists.h: Ditto.
* pge-boot/GM2Dependent.cc: Ditto.
* pge-boot/GM2Dependent.h: Ditto.
* pge-boot/GM2EXCEPTION.cc: Ditto.
* pge-boot/GM2EXCEPTION.h: Ditto.
* pge-boot/GM2RTS.cc: Ditto.
* pge-boot/GM2RTS.h: Ditto.
* pge-boot/GNameKey.cc: Ditto.
* pge-boot/GNameKey.h: Ditto.
* pge-boot/GNumberIO.cc: Ditto.
* pge-boot/GNumberIO.h: Ditto.
* pge-boot/GOutput.cc: Ditto.
* pge-boot/GOutput.h: Ditto.
* pge-boot/GPushBackInput.cc: Ditto.
* pge-boot/GPushBackInput.h: Ditto.
* pge-boot/GRTExceptions.cc: Ditto.
* pge-boot/GRTExceptions.h: Ditto.
* pge-boot/GSArgs.h: Ditto.
* pge-boot/GSEnvironment.h: Ditto.
* pge-boot/GSFIO.cc: Ditto.
* pge-boot/GSFIO.h: Ditto.
* pge-boot/GSYSTEM.h: Ditto.
* pge-boot/GScan.h: Ditto.
* pge-boot/GStdIO.cc: Ditto.
* pge-boot/GStdIO.h: Ditto.
* pge-boot/GStorage.cc: Ditto.
* pge-boot/GStorage.h: Ditto.
* pge-boot/GStrCase.cc: Ditto.
* pge-boot/GStrCase.h: Ditto.
* pge-boot/GStrIO.cc: Ditto.
* pge-boot/GStrIO.h: Ditto.
* pge-boot/GStrLib.cc: Ditto.
* pge-boot/GStrLib.h: Ditto.
* pge-boot/GStringConvert.h: Ditto.
* pge-boot/GSymbolKey.cc: Ditto.
* pge-boot/GSymbolKey.h: Ditto.
* pge-boot/GSysExceptions.h: Ditto.
* pge-boot/GSysStorage.cc: Ditto.
* pge-boot/GSysStorage.h: Ditto.
* pge-boot/GTimeString.h: Ditto.
* pge-boot/GUnixArgs.h: Ditto.
* pge-boot/Gbnflex.cc: Ditto.
* pge-boot/Gbnflex.h: Ditto.
* pge-boot/Gdtoa.h: Ditto.
* pge-boot/Gerrno.h: Ditto.
* pge-boot/Gldtoa.h: Ditto.
* pge-boot/Glibc.h: Ditto.
* pge-boot/Glibm.h: Ditto.
* pge-boot/Gpge.cc: Ditto.
* pge-boot/Gtermios.h: Ditto.
* pge-boot/Gwrapc.h: Ditto.
gcc/objc/ChangeLog:
PR other/116613
* objc-act.cc: Add #define INCLUDE_MEMORY.
* objc-encoding.cc: Likewise.
* objc-gnu-runtime-abi-01.cc: Likewise.
* objc-lang.cc: Likewise.
* objc-next-runtime-abi-01.cc: Likewise.
* objc-next-runtime-abi-02.cc: Likewise.
* objc-runtime-shared-support.cc: Likewise.
gcc/objcp/ChangeLog:
PR other/116613
* objcp-decl.cc: Add #define INCLUDE_MEMORY.
* objcp-lang.cc: Likewise.
gcc/rust/ChangeLog:
PR other/116613
* resolve/rust-ast-resolve-expr.cc: Add #define INCLUDE_MEMORY.
* rust-attribs.cc: Likewise.
* rust-system.h: Likewise.
gcc/ChangeLog:
PR other/116613
* asan.cc: Add #define INCLUDE_MEMORY.
* attribs.cc: Likewise.
(attr_access::array_as_string): Use
diagnostic_context::clone_printer and use unique_ptr.
* auto-profile.cc: Add #define INCLUDE_MEMORY.
* calls.cc: Likewise.
* cfganal.cc: Likewise.
* cfgexpand.cc: Likewise.
* cfghooks.cc: Likewise.
* cfgloop.cc: Likewise.
* cgraph.cc: Likewise.
* cgraphclones.cc: Likewise.
* cgraphunit.cc: Likewise.
* collect-utils.cc: Likewise.
* collect2.cc: Likewise.
* common/config/aarch64/aarch64-common.cc: Likewise.
* common/config/arm/arm-common.cc: Likewise.
* common/config/avr/avr-common.cc: Likewise.
* config/aarch64/aarch64-cc-fusion.cc: Likewise.
* config/aarch64/aarch64-early-ra.cc: Likewise.
* config/aarch64/aarch64-sve-builtins.cc: Likewise.
* config/arc/arc.cc: Likewise.
* config/arm/aarch-common.cc: Likewise.
* config/arm/arm-mve-builtins.cc: Likewise.
* config/avr/avr-devices.cc: Likewise.
* config/avr/driver-avr.cc: Likewise.
* config/bpf/bpf.cc: Likewise.
* config/bpf/btfext-out.cc: Likewise.
* config/bpf/core-builtins.cc: Likewise.
* config/darwin.cc: Likewise.
* config/i386/driver-i386.cc: Likewise.
* config/i386/i386-builtins.cc: Likewise.
* config/i386/i386-expand.cc: Likewise.
* config/i386/i386-features.cc: Likewise.
* config/i386/i386-options.cc: Likewise.
* config/loongarch/loongarch-builtins.cc: Likewise.
* config/mingw/winnt-cxx.cc: Likewise.
* config/mingw/winnt.cc: Likewise.
* config/mips/mips.cc: Likewise.
* config/msp430/driver-msp430.cc: Likewise.
* config/nvptx/mkoffload.cc: Likewise.
* config/nvptx/nvptx.cc: Likewise.
* config/riscv/riscv-avlprop.cc: Likewise.
* config/riscv/riscv-vector-builtins.cc: Likewise.
* config/riscv/riscv-vsetvl.cc: Likewise.
* config/rs6000/driver-rs6000.cc: Likewise.
* config/rs6000/host-darwin.cc: Likewise.
* config/rs6000/rs6000-c.cc: Likewise.
* config/s390/s390-c.cc: Likewise.
* config/s390/s390.cc: Likewise.
* config/sol2-cxx.cc: Likewise.
* config/vms/vms-c.cc: Likewise.
* config/xtensa/xtensa-dynconfig.cc: Likewise.
* coroutine-passes.cc: Likewise.
* coverage.cc: Likewise.
* data-streamer-in.cc: Likewise.
* data-streamer-out.cc: Likewise.
* data-streamer.cc: Likewise.
* diagnostic-buffer.h (diagnostic_buffer::~diagnostic_buffer):
Delete.
(diagnostic_buffer::m_per_format_buffer): Use std::unique_ptr.
* diagnostic-client-data-hooks.h (make_compiler_data_hooks): Use
std::unique_ptr for return type.
* diagnostic-format-json.cc
(json_output_format::make_per_format_buffer): Likewise.
(diagnostic_output_format_init_json): Update for usage of
std::unique_ptr in set_output_format.
* diagnostic-format-sarif.cc
(sarif_output_format::make_per_format_buffer): Use std::unique_ptr
for return type.
(diagnostic_output_format_init_sarif): Update for usage of
std::unique_ptr.
(test_message_with_embedded_link): Likewise for set_urlifier.
* diagnostic-format-text.cc: Add #define INCLUDE_MEMORY. Include
"make-unique.h".
(diagnostic_text_output_format::set_buffer): Use std::unique_ptr.
* diagnostic-format-text.h
(diagnostic_text_output_format::set_buffer): Likewise.
* diagnostic-format.h
(diagnostic_output_format::make_per_format_buffer): Likewise.
* diagnostic-global-context.cc: Add #define INCLUDE_MEMORY.
* diagnostic-macro-unwinding.cc: Likewise.
* diagnostic-show-locus.cc: Likewise.
* diagnostic-spec.cc: Likewise.
* diagnostic.cc (diagnostic_context::set_output_format): Use
std::unique_ptr for input.
(diagnostic_context::set_client_data_hooks): Likewise.
(diagnostic_context::set_option_manager): Likewise.
(diagnostic_context::set_urlifier): Likewise.
(diagnostic_context::set_diagnostic_buffer): Update for use of
std::unique_ptr.
(diagnostic_buffer::diagnostic_buffer): Likewise.
(diagnostic_buffer::~diagnostic_buffer): Delete.
* diagnostic.h: Complain if INCLUDE_MEMORY was not defined.
(diagnostic_context::set_output_format): Use std::unique_ptr for
input.
(diagnostic_context::set_client_data_hooks): Likewise.
(diagnostic_context::set_option_manager): Likewise.
(diagnostic_context::set_urlifier): Likewise.
(diagnostic_context::clone_printer): New.
(diagnostic_context::m_printer): Update comment.
(diagnostic_context::m_option_mgr): Likewise.
(diagnostic_context::m_urlifier): Likewise.
(diagnostic_context::m_edit_context_ptr): Likewise.
(diagnostic_context::m_output_format): Likewise.
(diagnostic_context::m_client_data_hooks): Likewise.
(diagnostic_context::m_theme): Likewise.
* digraph.cc: Add #define INCLUDE_MEMORY.
* dwarf2out.cc: Likewise.
* edit-context.cc: Likewise.
* except.cc: Likewise.
* expr.cc: Likewise.
* file-prefix-map.cc: Likewise.
* final.cc: Likewise.
* fwprop.cc: Likewise.
* gcc-plugin.h: Likewise.
* gcc-rich-location.cc: Likewise.
* gcc-urlifier.cc: Likewise. Add #include "make-unique.h".
(make_gcc_urlifier): Use std::unique_ptr and ::make_unique.
* gcc-urlifier.h (make_gcc_urlifier): Use std::unique_ptr.
* gcc.cc: Add #define INCLUDE_MEMORY. Include
"pretty-print-urlifier.h".
* gcov-dump.cc: Add #define INCLUDE_MEMORY.
* gcov-tool.cc: Likewise.
* gengtype.cc (open_base_files): Likewise to output.
* genmatch.cc: Likewise.
* gimple-fold.cc: Likewise.
* gimple-harden-conditionals.cc: Likewise.
* gimple-harden-control-flow.cc: Likewise.
* gimple-if-to-switch.cc: Likewise.
* gimple-lower-bitint.cc: Likewise.
* gimple-predicate-analysis.cc: Likewise.
* gimple-pretty-print.cc: Likewise.
* gimple-range-cache.cc: Likewise.
* gimple-range-edge.cc: Likewise.
* gimple-range-fold.cc: Likewise.
* gimple-range-gori.cc: Likewise.
* gimple-range-infer.cc: Likewise.
* gimple-range-op.cc: Likewise.
* gimple-range-path.cc: Likewise.
* gimple-range-phi.cc: Likewise.
* gimple-range-trace.cc: Likewise.
* gimple-range.cc: Likewise.
* gimple-ssa-backprop.cc: Likewise.
* gimple-ssa-sprintf.cc: Likewise.
* gimple-ssa-store-merging.cc: Likewise.
* gimple-ssa-strength-reduction.cc: Likewise.
* gimple-ssa-warn-access.cc: Likewise.
* gimple-ssa-warn-alloca.cc: Likewise.
* gimple-ssa-warn-restrict.cc: Likewise.
* gimple-streamer-in.cc: Likewise.
* gimple-streamer-out.cc: Likewise.
* gimple.cc: Likewise.
* gimplify.cc: Likewise.
* graph.cc: Likewise.
* graphviz.cc: Likewise.
* input.cc: Likewise.
* ipa-cp.cc: Likewise.
* ipa-devirt.cc: Likewise.
* ipa-fnsummary.cc: Likewise.
* ipa-free-lang-data.cc: Likewise.
* ipa-icf-gimple.cc: Likewise.
* ipa-icf.cc: Likewise.
* ipa-inline-analysis.cc: Likewise.
* ipa-inline.cc: Likewise.
* ipa-modref-tree.cc: Likewise.
* ipa-modref.cc: Likewise.
* ipa-param-manipulation.cc: Likewise.
* ipa-polymorphic-call.cc: Likewise.
* ipa-predicate.cc: Likewise.
* ipa-profile.cc: Likewise.
* ipa-prop.cc: Likewise.
* ipa-pure-const.cc: Likewise.
* ipa-reference.cc: Likewise.
* ipa-split.cc: Likewise.
* ipa-sra.cc: Likewise.
* ipa-strub.cc: Likewise.
* ipa-utils.cc: Likewise.
* langhooks.cc: Likewise.
* late-combine.cc: Likewise.
* lto-cgraph.cc: Likewise.
* lto-compress.cc: Likewise.
* lto-opts.cc: Likewise.
* lto-section-in.cc: Likewise.
* lto-section-out.cc: Likewise.
* lto-streamer-in.cc: Likewise.
* lto-streamer-out.cc: Likewise.
* lto-streamer.cc: Likewise.
* lto-wrapper.cc: Likewise. Include "make-unique.h".
(main): Use ::make_unique when creating option manager.
* multiple_target.cc: Likewise.
* omp-expand.cc: Likewise.
* omp-general.cc: Likewise.
* omp-low.cc: Likewise.
* omp-oacc-neuter-broadcast.cc: Likewise.
* omp-offload.cc: Likewise.
* omp-simd-clone.cc: Likewise.
* optc-gen.awk: Likewise in output.
* optc-save-gen.awk: Likewise in output.
* options-urls-cc-gen.awk: Likewise in output.
* opts-common.cc: Likewise.
* opts-global.cc: Likewise.
* opts.cc: Likewise.
* pair-fusion.cc: Likewise.
* passes.cc: Likewise.
* pointer-query.cc: Likewise.
* predict.cc: Likewise.
* pretty-print.cc (pretty_printer::clone): Use std::unique_ptr and
::make_unique.
* pretty-print.h: Complain if INCLUDE_MEMORY is not defined.
(pretty_printer::clone): Use std::unique_ptr.
* print-rtl.cc: Add #define INCLUDE_MEMORY.
* print-tree.cc: Likewise.
* profile-count.cc: Likewise.
* range-op-float.cc: Likewise.
* range-op-ptr.cc: Likewise.
* range-op.cc: Likewise.
* range.cc: Likewise.
* read-rtl-function.cc: Likewise.
* rtl-error.cc: Likewise.
* rtl-ssa/accesses.cc: Likewise.
* rtl-ssa/blocks.cc: Likewise.
* rtl-ssa/changes.cc: Likewise.
* rtl-ssa/functions.cc: Likewise.
* rtl-ssa/insns.cc: Likewise.
* rtl-ssa/movement.cc: Likewise.
* rtl-tests.cc: Likewise.
* sanopt.cc: Likewise.
* sched-rgn.cc: Likewise.
* selftest-diagnostic-path.cc: Likewise.
* selftest-diagnostic.cc: Likewise.
* splay-tree-utils.cc: Likewise.
* sreal.cc: Likewise.
* stmt.cc: Likewise.
* substring-locations.cc: Likewise.
* symtab-clones.cc: Likewise.
* symtab-thunks.cc: Likewise.
* symtab.cc: Likewise.
* text-art/box-drawing.cc: Likewise.
* text-art/canvas.cc: Likewise.
* text-art/ruler.cc: Likewise.
* text-art/selftests.cc: Likewise.
* text-art/theme.cc: Likewise.
* toplev.cc: Likewise. Include "make-unique.h".
(general_init): Use ::make_unique when setting option_manager.
* trans-mem.cc: Add #define INCLUDE_MEMORY.
* tree-affine.cc: Likewise.
* tree-call-cdce.cc: Likewise.
* tree-cfg.cc: Likewise.
* tree-chrec.cc: Likewise.
* tree-dfa.cc: Likewise.
* tree-diagnostic-client-data-hooks.cc: Include "make-unique.h".
(make_compiler_data_hooks): Use std::unique_ptr and ::make_unique.
* tree-diagnostic.cc: Add #define INCLUDE_MEMORY.
* tree-dump.cc: Likewise.
* tree-inline.cc: Likewise.
* tree-into-ssa.cc: Likewise.
* tree-logical-location.cc: Likewise.
* tree-nested.cc: Likewise.
* tree-nrv.cc: Likewise.
* tree-object-size.cc: Likewise.
* tree-outof-ssa.cc: Likewise.
* tree-pretty-print.cc: Likewise.
* tree-profile.cc: Likewise.
* tree-scalar-evolution.cc: Likewise.
* tree-sra.cc: Likewise.
* tree-ssa-address.cc: Likewise.
* tree-ssa-alias.cc: Likewise.
* tree-ssa-ccp.cc: Likewise.
* tree-ssa-coalesce.cc: Likewise.
* tree-ssa-copy.cc: Likewise.
* tree-ssa-dce.cc: Likewise.
* tree-ssa-dom.cc: Likewise.
* tree-ssa-forwprop.cc: Likewise.
* tree-ssa-ifcombine.cc: Likewise.
* tree-ssa-loop-ch.cc: Likewise.
* tree-ssa-loop-im.cc: Likewise.
* tree-ssa-loop-manip.cc: Likewise.
* tree-ssa-loop-niter.cc: Likewise.
* tree-ssa-loop-split.cc: Likewise.
* tree-ssa-math-opts.cc: Likewise.
* tree-ssa-operands.cc: Likewise.
* tree-ssa-phiprop.cc: Likewise.
* tree-ssa-pre.cc: Likewise.
* tree-ssa-propagate.cc: Likewise.
* tree-ssa-reassoc.cc: Likewise.
* tree-ssa-sccvn.cc: Likewise.
* tree-ssa-scopedtables.cc: Likewise.
* tree-ssa-sink.cc: Likewise.
* tree-ssa-strlen.cc: Likewise.
* tree-ssa-structalias.cc: Likewise.
* tree-ssa-ter.cc: Likewise.
* tree-ssa-uninit.cc: Likewise.
* tree-ssa.cc: Likewise.
* tree-ssanames.cc: Likewise.
* tree-stdarg.cc: Likewise.
* tree-streamer-in.cc: Likewise.
* tree-streamer-out.cc: Likewise.
* tree-streamer.cc: Likewise.
* tree-switch-conversion.cc: Likewise.
* tree-tailcall.cc: Likewise.
* tree-vrp.cc: Likewise.
* tree.cc: Likewise.
* ubsan.cc: Likewise.
* value-pointer-equiv.cc: Likewise.
* value-prof.cc: Likewise.
* value-query.cc: Likewise.
* value-range-pretty-print.cc: Likewise.
* value-range-storage.cc: Likewise.
* value-range.cc: Likewise.
* value-relation.cc: Likewise.
* var-tracking.cc: Likewise.
* varpool.cc: Likewise.
* vr-values.cc: Likewise.
* wide-int-print.cc: Likewise.
gcc/testsuite/ChangeLog:
PR other/116613
* gcc.dg/plugin/diagnostic_group_plugin.c: Update for use of
std::unique_ptr.
* gcc.dg/plugin/diagnostic_plugin_xhtml_format.c: Likewise.
* gcc.dg/plugin/ggcplug.c: Likewise.
libgcc/ChangeLog:
PR other/116613
* libgcov-util.c: Add #define INCLUDE_MEMORY.
Co-authored-by: Gaius Mulley <gaiusmod2@gmail.com>
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Just like what AArch64 has done.
Signed-off-by: Wang Pengcheng <wangpengcheng.pp@bytedance.com>
gcc/ChangeLog:
* config/riscv/riscv.cc (struct riscv_tune_param): Add new
tune options.
(riscv_override_options_internal): Override the default alignment
when not optimizing for size.
|
|
This patch implements sstrunc for vector signed integers.
Form 1:
#define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \
void __attribute__((noinline)) \
vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
{ \
unsigned i; \
for (i = 0; i < limit; i++) \
{ \
WT x = in[i]; \
NT trunc = (NT)x; \
out[i] = (WT)NT_MIN <= x && x <= (WT)NT_MAX \
? trunc \
: x < 0 ? NT_MIN : NT_MAX; \
} \
}
DEF_VEC_SAT_S_TRUNC_FMT_1(int32_t, int64_t, INT32_MIN, INT32_MAX)
Before this patch:
vsetvli a5,a2,e64,m1,ta,ma
vle64.v v1,0(a1)
slli a3,a5,3
slli a4,a5,2
sub a2,a2,a5
add a1,a1,a3
vadd.vv v0,v1,v5
vsetvli zero,zero,e32,mf2,ta,ma
vnsrl.wx v2,v1,a6
vncvt.x.x.w v1,v1
vsetvli zero,zero,e64,m1,ta,ma
vmsgtu.vv v0,v0,v4
vsetvli zero,zero,e32,mf2,ta,mu
vneg.v v2,v2
vxor.vv v1,v2,v3,v0.t
vse32.v v1,0(a0)
add a0,a0,a4
bne a2,zero,.L3
After this patch:
vsetvli a5,a2,e32,mf2,ta,ma
vle64.v v1,0(a1)
slli a3,a5,3
slli a4,a5,2
sub a2,a2,a5
add a1,a1,a3
vnclip.wi v1,v1,0
vse32.v v1,0(a0)
add a0,a0,a4
bne a2,zero,.L3
The following test suite passes with this patch.
* The rv64gcv full regression test.
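For reference, the per-element result that the Form 1 expression yields can be written as a scalar helper. This is only an illustrative sketch: the name sat_s_trunc_i64_to_i32 is hypothetical and does not appear in the patch.

```c
#include <stdint.h>

/* Hypothetical scalar reference for signed saturating truncation:
   clamp a 64-bit value into the int32_t range, then narrow it.  */
static int32_t
sat_s_trunc_i64_to_i32 (int64_t x)
{
  if (x < (int64_t) INT32_MIN)
    return INT32_MIN;
  if (x > (int64_t) INT32_MAX)
    return INT32_MAX;
  return (int32_t) x;
}
```

Values already inside the int32_t range pass through unchanged; anything outside saturates to the nearest bound.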
gcc/ChangeLog:
* config/riscv/autovec.md (sstrunc<mode><v_double_trunc>2): Add
new pattern sstrunc for double trunc.
(sstrunc<mode><v_quad_trunc>2): Ditto but for quad trunc.
(sstrunc<mode><v_oct_trunc>2): Ditto but for oct trunc.
* config/riscv/riscv-protos.h (expand_vec_double_sstrunc): Add
new func decl to expand double trunc.
(expand_vec_quad_sstrunc): Ditto but for quad trunc.
(expand_vec_oct_sstrunc): Ditto but for oct trunc.
* config/riscv/riscv-v.cc (expand_vec_double_sstrunc): Add new
func to expand double trunc.
(expand_vec_quad_sstrunc): Ditto but for quad trunc.
(expand_vec_oct_sstrunc): Ditto but for oct trunc.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
UNITS_PER_WORD"
This reverts commit 72ceddbfb78dbb95f0808c3eca1765e8cd48b023.
|
|
Add option -m(no-)autovec-segment to allow or prevent the autovectorizer
emitting vector segment load/store instructions. This is useful for
performance experiments.
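As an illustration only, a loop of the following shape reads two interleaved fields per iteration, which is the kind of access the autovectorizer may implement with segment loads. Whether it actually does so depends on the target and the cost model. The struct and function names are hypothetical and are not taken from the new tests.

```c
/* Hypothetical example of an interleaved access pattern.  */
struct pair
{
  int a;
  int b;
};

/* Sum the two fields of each element of IN into OUT.  */
void
sum_pairs (int *out, const struct pair *in, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = in[i].a + in[i].b;
}
```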
gcc/ChangeLog:
* config/riscv/autovec.md (vec_mask_len_load_lanes, vec_mask_len_store_lanes):
Predicate with TARGET_VECTOR_AUTOVEC_SEGMENT.
* config/riscv/riscv-opts.h (TARGET_VECTOR_AUTOVEC_SEGMENT): New macro.
* config/riscv/riscv.opt (-m(no-)autovec-segment): New option.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-1.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-2.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-3.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-4.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-5.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-6.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg-7.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-1.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-2.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-3.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-4.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-5.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-6.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_load_noseg_run-7.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-1.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-2.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-3.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-4.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-5.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-6.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg-7.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-1.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-2.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-3.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-4.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-5.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-6.c:
New test.
* gcc.target/riscv/rvv/autovec/struct/mask_struct_store_noseg_run-7.c:
New test.
* gcc.target/riscv/rvv/autovec/no-segment.c: New test.
|
|
For fast unaligned access targets, by pieces uses up to UNITS_PER_WORD
size pieces resulting in more store instructions than needed. For
example gcc.target/riscv/rvv/base/setmem-1.c:f1 built with
`-O3 -march=rv64gcv -mtune=thead-c906`:
```
f1:
vsetivli zero,8,e8,mf2,ta,ma
vmv.v.x v1,a1
vsetivli zero,0,e32,mf2,ta,ma
sb a1,14(a0)
vmv.x.s a4,v1
vsetivli zero,8,e16,m1,ta,ma
vmv.x.s a5,v1
vse8.v v1,0(a0)
sw a4,8(a0)
sh a5,12(a0)
ret
```
The slow unaligned access version built with `-O3 -march=rv64gcv` used
15 sb instructions:
```
f1:
sb a1,0(a0)
sb a1,1(a0)
sb a1,2(a0)
sb a1,3(a0)
sb a1,4(a0)
sb a1,5(a0)
sb a1,6(a0)
sb a1,7(a0)
sb a1,8(a0)
sb a1,9(a0)
sb a1,10(a0)
sb a1,11(a0)
sb a1,12(a0)
sb a1,13(a0)
sb a1,14(a0)
ret
```
After this patch, the following is generated in both cases:
```
f1:
vsetivli zero,15,e8,m1,ta,ma
vmv.v.x v1,a1
vse8.v v1,0(a0)
ret
```
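A minimal sketch of the kind of source that exercises this path, assuming f1 is a fixed 15-byte memset. That length is inferred from the 15 sb stores shown above; the actual body of setmem-1.c is not reproduced here.

```c
#include <string.h>

/* Assumed shape of f1: fill a fixed 15 bytes with the byte value C.
   A 15-byte length matches the stores at offsets 0..14 shown above.  */
void *
f1 (void *a, int c)
{
  return memset (a, c, 15);
}
```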
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_use_by_pieces_infrastructure_p):
New function.
(TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Define.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr113469.c: Expect mf2 setmem.
* gcc.target/riscv/rvv/base/setmem-2.c: Update f1 to expect
straight-line vector memset.
* gcc.target/riscv/rvv/base/setmem-3.c: Likewise.
|
|
[NFC]
This moves the code for deciding whether to generate a vectorized
memcpy, what vector mode to use and whether a loop is needed out of
riscv_vector::expand_block_move and into a new function
riscv_vector::use_vector_stringop_p so that it can be reused for other string
operations.
gcc/ChangeLog:
* config/riscv/riscv-string.cc (struct stringop_info): New.
(expand_block_move): Move decision making code to...
(use_vector_stringop_p): ...here.
|
|
Unlike the other vector string ops, expand_block_move was using max LMUL
m8 regardless of TARGET_MAX_LMUL.
The check for whether to generate inline vector code for movmem has been
moved from movmem<mode> to riscv_vector::expand_block_move to avoid
maintaining multiple versions of similar logic. They already differed
on the minimum length for which they would generate vector code. Now
that the expand_block_move value is used, movmem will be generated for
smaller lengths.
Limiting memcpy to m1 caused some memcpy loops to be generated in
the calling convention tests which makes it awkward to add suitable scan
assembler tests checking the return value being set, so
-mrvv-max-lmul=m8 has been added to these tests. Other tests have been
adjusted to expect the new memcpy m1 generation where reasonably
straight-forward, otherwise -mrvv-max-lmul=m8 has been added.
pr111720-[0-9].c regressed because a memcpy loop is generated instead
of straight-line. This reveals an existing issue where a redundant
straight-line memcpy gets eliminated but a memcpy loop does not
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117205).
For example, on pr111720-0.c after this patch:
-mrvv-max-lmul=m8:
test:
lui a5,%hi(.LANCHOR0)
li a4,32
addi sp,sp,-32
addi a5,a5,%lo(.LANCHOR0)
vsetvli zero,a4,e8,m1,ta,ma
vle8.v v8,0(a5)
addi sp,sp,32
jr ra
-mrvv-max-lmul=m1:
test:
addi sp,sp,-32
lui a5,%hi(.LANCHOR0)
addi a5,a5,%lo(.LANCHOR0)
mv a2,sp
li a3,32
.L2:
vsetvli a4,a3,e8,m1,ta,ma
vle8.v v8,0(a5)
sub a3,a3,a4
add a5,a5,a4
vse8.v v8,0(a2)
add a2,a2,a4
bne a3,zero,.L2
li a5,32
vsetvli zero,a5,e8,m1,ta,ma
vle8.v v8,0(sp)
addi sp,sp,32
jr ra
I have added -mrvv-max-lmul=m8 to pr111720-[0-9].c so that we continue
to test the elimination of straight-line memcpy.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (get_lmul_mode): New prototype.
(expand_block_move): Add bool parameter for movmem_p.
* config/riscv/riscv-string.cc (riscv_expand_block_move_scalar):
Pass movmem_p as false to riscv_vector::expand_block_move.
(expand_block_move): Add movmem_p parameter. Return false if
loop needed and movmem_p is true. Respect TARGET_MAX_LMUL.
* config/riscv/riscv-v.cc (get_lmul_mode): New function.
* config/riscv/riscv.md (movmem<mode>): Move checking for
whether to generate inline vector code to
riscv_vector::expand_block_move by passing movmem_p as true.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr113206-1.c: Add
-mrvv-max-lmul=m8.
* gcc.target/riscv/rvv/autovec/pr113206-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: Add
-mrvv-max-lmul=m8 and adjust assembly scans.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c:
Likewise.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c:
Likewise.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c:
Likewise.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-5.c:
Likewise.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-6.c:
Likewise.
* gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c:
Likewise.
* gcc.target/riscv/rvv/autovec/vls/spill-4.c: Add
-mrvv-max-lmul=m8.
* gcc.target/riscv/rvv/autovec/vls/spill-7.c: Likewise.
* gcc.target/riscv/rvv/base/cpymem-1.c: Expect m1 in f1 and f2.
* gcc.target/riscv/rvv/base/cpymem-2.c: Add -mrvv-max-lmul=m8.
* gcc.target/riscv/rvv/base/movmem-1.c: Adjust f1 to a length
that will not get vectorized.
* gcc.target/riscv/rvv/base/pr111720-0.c: Add -mrvv-max-lmul=m8.
* gcc.target/riscv/rvv/base/pr111720-1.c: Likewise.
* gcc.target/riscv/rvv/base/pr111720-2.c: Likewise.
* gcc.target/riscv/rvv/base/pr111720-3.c: Likewise.
* gcc.target/riscv/rvv/base/pr111720-4.c: Likewise.
* gcc.target/riscv/rvv/base/pr111720-5.c: Likewise.
* gcc.target/riscv/rvv/base/pr111720-6.c: Likewise.
* gcc.target/riscv/rvv/base/pr111720-7.c: Likewise.
* gcc.target/riscv/rvv/base/pr111720-8.c: Likewise.
* gcc.target/riscv/rvv/base/pr111720-9.c: Likewise.
* gcc.target/riscv/rvv/vsetvl/pr112929-1.c: Expect memcpy m1
loops.
* gcc.target/riscv/rvv/vsetvl/pr112988-1.c: Likewise.
|
|
If riscv_vector::expand_block_move is generating a straight-line memcpy
using a predicated store, it tries to use a smaller LMUL to reduce
register pressure if it still allows an entire transfer.
This happens in the inner loop of riscv_vector::expand_block_move,
however, the vmode chosen by this loop gets overwritten later in the
function, so I have added the missing break from the outer loop.
I have also addressed a couple of issues with the conditions of the if
statement within the inner loop.
The first condition did not make sense to me:
```
TARGET_MIN_VLEN * lmul <= nunits * BITS_PER_UNIT
```
I think this was supposed to be checking that the length fits within the
given LMUL, so I have changed it to do that.
The second condition:
```
/* Avoid loosing the option of using vsetivli . */
&& (nunits <= 31 * lmul || nunits > 31 * 8)
```
seems to imply that lmul affects the range of AVL immediate that
vsetivli can take but I don't think that is correct. Anyway, I don't
think this condition is necessary because if we find a suitable mode we
should stick with it, regardless of whether it allowed vsetivli, rather
than continuing to try larger lmul which would increase register
pressure or smaller potential_ew which would increase AVL. I have
removed this condition.
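The intended "length fits within the given LMUL" check can be sketched as follows. This is an illustration under assumptions, not the code from riscv-string.cc: the helper names and the plain-integer parameters are hypothetical, LMUL is restricted to the integer values 1, 2, 4 and 8, and the element-width search of the real loop is left out.

```c
#include <stdbool.h>

#define BITS_PER_UNIT 8

/* Hypothetical helper: true if LENGTH bytes fit in one register group
   of LMUL registers that are each at least MIN_VLEN bits wide.  */
static bool
length_fits_lmul_p (unsigned length, unsigned min_vlen, unsigned lmul)
{
  return length * BITS_PER_UNIT <= min_vlen * lmul;
}

/* Hypothetical helper: smallest LMUL in {1, 2, 4, 8} that does not
   exceed MAX_LMUL and covers LENGTH bytes, or 0 if none does.  */
static unsigned
smallest_lmul_for_length (unsigned length, unsigned min_vlen,
                          unsigned max_lmul)
{
  for (unsigned lmul = 1; lmul <= max_lmul && lmul <= 8; lmul *= 2)
    if (length_fits_lmul_p (length, min_vlen, lmul))
      return lmul;
  return 0;
}
```

Stopping at the first LMUL that fits mirrors the reasoning above: once a suitable mode is found, trying a larger LMUL would only increase register pressure.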
gcc/ChangeLog:
* config/riscv/riscv-string.cc (expand_block_move): Fix
condition for using smaller LMUL. Break outer loop if a
suitable vmode has been found.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr112929-1.c: Expect smaller lmul.
* gcc.target/riscv/rvv/vsetvl/pr112988-1.c: Likewise.
* gcc.target/riscv/rvv/base/cpymem-3.c: New test.
|
|
gcc/ChangeLog:
* config/riscv/riscv-string.cc (expand_block_move): Replace
`end` with `length_rtx` in gen_rtx_NE.
|
|
gcc/ChangeLog:
* config/riscv/riscv-string.cc (expand_block_move): Fix
indentation.
|
|
While working on PR117028 C2Y changes, I've noticed weird ternary
operator formatting (operand1 ? operand2: operand3).
The usual formatting is operand1 ? operand2 : operand3
where we have around 18000+ cases of that (counting only what fits
on one line) and
indent -nbad -bap -nbc -bbo -bl -bli2 -bls -ncdb -nce -cp1 -cs -di2 -ndj \
-nfc1 -nfca -hnl -i2 -ip5 -lp -pcs -psl -nsc -nsob
documented in
https://www.gnu.org/prep/standards/html_node/Formatting.html#Formatting
does the same.
Some code was even trying to save space as much as possible and used
operand1?operand2:operand3 or
operand1 ? operand2:operand3
Today I've grepped for such cases (the grep was '?.*[^ ]:' and I had to
skim through various false positives with that where the : matched e.g.
stuff inside of strings, or *.md pattern macros or :: scope) and the
following patch is a fix for what I found.
2024-10-16 Jakub Jelinek <jakub@redhat.com>
gcc/
* attribs.cc (lookup_scoped_attribute_spec): ?: operator formatting
fixes.
* basic-block.h (FOR_BB_INSNS_SAFE): Likewise.
* cfgcleanup.cc (outgoing_edges_match): Likewise.
* cgraph.cc (cgraph_node::dump): Likewise.
* config/arc/arc.cc (gen_acc1, gen_acc2): Likewise.
* config/arc/arc.h (CLASS_MAX_NREGS, CONSTANT_ADDRESS_P): Likewise.
* config/arm/arm.cc (arm_print_operand): Likewise.
* config/cris/cris.md (*b<rnzcond:code><mode>): Likewise.
* config/darwin.cc (darwin_asm_declare_object_name,
darwin_emit_common): Likewise.
* config/darwin-driver.cc (darwin_driver_init): Likewise.
* config/epiphany/epiphany.md (call, sibcall, call_value,
sibcall_value): Likewise.
* config/i386/i386.cc (gen_push2): Likewise.
* config/i386/i386.h (ix86_cur_cost): Likewise.
* config/i386/openbsdelf.h (FUNCTION_PROFILER): Likewise.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins):
Likewise.
* config/loongarch/loongarch-cpu.cc (fill_native_cpu_config):
Likewise.
* config/riscv/riscv.cc (riscv_union_memmodels): Likewise.
* config/riscv/zc.md (*mva01s<X:mode>, *mvsa01<X:mode>): Likewise.
* config/rs6000/mmintrin.h (_mm_cmpeq_pi8, _mm_cmpgt_pi8,
_mm_cmpeq_pi16, _mm_cmpgt_pi16, _mm_cmpeq_pi32, _mm_cmpgt_pi32):
Likewise.
* config/v850/predicates.md (pattern_is_ok_for_prologue): Likewise.
* config/xtensa/constraints.md (d, C, W): Likewise.
* coverage.cc (coverage_begin_function, build_init_ctor,
build_gcov_exit_decl): Likewise.
* df-problems.cc (df_create_unused_note): Likewise.
* diagnostic.cc (diagnostic_set_caret_max_width): Likewise.
* diagnostic-path.cc (path_summary::path_summary): Likewise.
* expr.cc (expand_expr_divmod): Likewise.
* gcov.cc (format_gcov): Likewise.
* gcov-dump.cc (dump_gcov_file): Likewise.
* genmatch.cc (main): Likewise.
* incpath.cc (remove_duplicates, register_include_chains): Likewise.
* ipa-devirt.cc (dump_odr_type): Likewise.
* ipa-icf.cc (sem_item_optimizer::merge_classes): Likewise.
* ipa-inline.cc (inline_small_functions): Likewise.
* ipa-polymorphic-call.cc (ipa_polymorphic_call_context::dump):
Likewise.
* ipa-sra.cc (create_parameter_descriptors): Likewise.
* ipa-utils.cc (find_always_executed_bbs): Likewise.
* predict.cc (predict_loops): Likewise.
* selftest.cc (read_file): Likewise.
* sreal.h (SREAL_SIGN, SREAL_ABS): Likewise.
* tree-dump.cc (dequeue_and_dump): Likewise.
* tree-ssa-ccp.cc (bit_value_binop): Likewise.
gcc/c-family/
* c-opts.cc (c_common_init_options, c_common_handle_option,
c_common_finish, set_std_c89, set_std_c99, set_std_c11,
set_std_c17, set_std_c23, set_std_cxx98, set_std_cxx11,
set_std_cxx14, set_std_cxx17, set_std_cxx20, set_std_cxx23,
set_std_cxx26): ?: operator formatting fixes.
gcc/cp/
* search.cc (lookup_member): ?: operator formatting fixes.
* typeck.cc (cp_build_modify_expr): Likewise.
libcpp/
* expr.cc (interpret_float_suffix): ?: operator formatting fixes.
|
|
In compute_nregs_for_mode we expect that the current variable's mode is
at most as large as the biggest mode to be used for vectorization.
This might not be true for constants as they don't actually have a mode.
In that case, just use the biggest mode so max_number_of_live_regs
returns 1.
This fixes several test cases in the test suite.
gcc/ChangeLog:
PR target/116655
* config/riscv/riscv-vector-costs.cc (max_number_of_live_regs):
Use biggest mode instead of constant's saved mode.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr116655.c: New test.
|
|
This is a minor patch from Jivan from roughly a year ago. The basic
idea here is similar to what we do when extending values for the sake of
comparisons. Specifically if the value is already known to be properly
extended, then an extension is just a copy.
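The underlying observation can be illustrated in plain C: applying an extension to a value that is already properly extended changes nothing, so it reduces to a copy. This is a sketch only; the helper names are hypothetical and unrelated to the riscv.md patterns.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of X to 64 bits.  The narrowing
   conversion to int32_t is implementation-defined in ISO C; GCC
   defines it as modular wrap-around.  */
static int64_t
sign_extend_32 (int64_t x)
{
  return (int64_t) (int32_t) (uint32_t) x;
}

/* Zero-extend the low 32 bits of X to 64 bits.  */
static uint64_t
zero_extend_32 (uint64_t x)
{
  return (uint64_t) (uint32_t) x;
}
```

Both helpers are idempotent: feeding a result back in returns the same value, which is why a second extension carries no information.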
The original idea was to use a similar patch, but one which aborted in order
to identify cases where these unnecessary promotions were emitted. All
that showed up when doing a testsuite run with that abort was the
promotions created by the arithmetic with overflow patterns such as addv.
Things like addv aren't *that* common so this never got high on my todo
list, even after a minor issue in this space was raised in bugzilla.
But with stage1 closing soon and no good reason not to go forward, I'm
submitting this into the pre-commit tester now. My tester has been
using it since roughly Feb :-) Plan would be to commit after the
pre-commit tester renders its verdict.
* config/riscv/riscv.md (zero_extendsidi2): If RHS is already
zero extended, then this is just a copy.
(extendsidi2): Similarly, but for sign extension.
|
|
I probably spent way more time on this than it's worth...
I was looking at the code we generate for vector SAD and noticed that we were
being a bit silly. Specifically:
li a4,0 # 272 [c=4 l=4] *movsi_internal/1
Followed shortly by:
vmv.s.x v3,a4 # 261 [c=4 l=4] *pred_broadcastrvvm1si/6
And no other uses of a4. We could have used x0 trivially.
First we adjust the expander so that it doesn't force the constant into a
register. In the matching pattern we change the appropriate source constraints
from "r" to "rJ" and the output template is changed to use %z for the operand.
The net is we drop the li completely and emit vmv.s.x v3,x0.
But wait, there's more. If we're broadcasting a constant in the range
[-16..15] into a vector, we currently load the constant into a register and use
vmv.v.x. We can instead use vmv.v.i, which avoids loading the constant into a
GPR. For that case we again avoid forcing the constant into a register in the
expander and adjust the output template to emit vmv.v.x or vmv.v.i based on
whether or not the appropriate operand is a constant or general purpose
register. So again, we'll drop a load immediate into a scalar for this case.
Whether or not we should use vmv.v.i vs vmv.s.x for loading [-16..15] into the
0th element is probably uarch dependent. The tradeoff is loading the GPR vs
the broadcast in the vector unit. I didn't bother with this case.
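The [-16..15] range corresponds to a 5-bit signed immediate. A minimal sketch of that range test, with a hypothetical helper name that is not the constraint added by this patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: true if VALUE lies in [-16, 15], the range
   of a 5-bit signed immediate such as the one vmv.v.i accepts.  */
static bool
fits_simm5_p (int64_t value)
{
  return value >= -16 && value <= 15;
}
```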
Tested in my tester (which tests rv64gcv as a default codegen option). Will
wait for the pre-commit tester to render a verdict.
gcc/
* config/riscv/constraints.md (P): New constraint.
* config/riscv/vector.md (pred_broadcast<mode> expander): Do
not force small integers into GPRs so aggressively.
(pred_broadcast<mode> insn & splitter): Allow splatting small
constants across the vector register directly. Allow splatting
(const_int 0) into element 0 directly.
|
|
This patch implements sssub for vector signed integers.
Form 1:
#define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX) \
void __attribute__((noinline)) \
vec_sat_s_sub_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
{ \
unsigned i; \
for (i = 0; i < limit; i++) \
{ \
T x = op_1[i]; \
T y = op_2[i]; \
T minus = (UT)x - (UT)y; \
out[i] = (x ^ y) >= 0 \
? minus \
: (minus ^ x) >= 0 \
? minus \
: x < 0 ? MIN : MAX; \
} \
}
DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX)
Before this patch:
vle8.v v1,0(a1)
vle8.v v2,0(a2)
sub a3,a3,a5
add a1,a1,a5
add a2,a2,a5
vsra.vi v4,v1,7
vsub.vv v3,v1,v2
vxor.vv v2,v1,v2
vxor.vv v0,v1,v3
vmslt.vi v2,v2,0
vmslt.vi v0,v0,0
vmand.mm v0,v0,v2
vxor.vv v3,v4,v5,v0.t
vse8.v v3,0(a0)
add a0,a0,a5
After this patch:
vle8.v v1,0(a1)
vle8.v v2,0(a2)
sub a3,a3,a5
add a1,a1,a5
add a2,a2,a5
vssub.vv v1,v1,v2
vse8.v v1,0(a0)
add a0,a0,a5
The following test suite passes with this patch.
* The rv64gcv full regression test.
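For comparison, the per-element result of Form 1 can be expressed as a scalar reference that computes the difference in a wider type and clamps it. This is a sketch only; the name sat_s_sub_i8 is hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Hypothetical scalar reference for signed saturating subtraction on
   int8_t, computed in int so the difference itself cannot overflow.  */
static int8_t
sat_s_sub_i8 (int8_t x, int8_t y)
{
  int diff = (int) x - (int) y;
  if (diff < INT8_MIN)
    return INT8_MIN;
  if (diff > INT8_MAX)
    return INT8_MAX;
  return (int8_t) diff;
}
```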
gcc/ChangeLog:
* config/riscv/autovec.md (sssub<mode>3): Add new pattern for
signed SAT_SUB.
* config/riscv/riscv-protos.h (expand_vec_sssub): Add new func
decl to expand sssub to vssub.
* config/riscv/riscv-v.cc (expand_vec_sssub): Add new func
impl to expand sssub to vssub.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
From: xuli <xuli1@eswincomputing.com>
Example as follows:
int main()
{
unsigned long arraya[128], arrayb[128], arrayc[128];
for (int i = 0; i < 128; i++)
{
arraya[i] = arrayb[i] + arrayc[i];
}
return 0;
}
Compiled with -march=rv32imafc_zve32f -mabi=ilp32f, it will cause a compilation issue:
riscv_vector.h:40:25: error: ambiguating new declaration of 'vint64m4_t __riscv_vle64(vbool16_t, const long long int*, unsigned int)'
40 | #pragma riscv intrinsic "vector"
| ^~~~~~~~
riscv_vector.h:40:25: note: old declaration 'vint64m1_t __riscv_vle64(vbool64_t, const long long int*, unsigned int)'
With zvl=32b, vbool16_t is registered in init_builtins() with
type_common.precision=0x101 (nunits=2), mode_nunits[E_RVVMF16BI]=[2,2].
Normally, vbool64_t is only valid when TARGET_MIN_VLEN > 32, so vbool64_t
is not registered in init_builtins(), meaning vbool64_t=null.
In order to implement __attribute__((target("arch=+v"))), we must register
all vector types and all RVV intrinsics. Therefore, vbool64_t will be registered
by default with zvl=128b in reinit_builtins(), resulting in
type_common.precision=0x101 (nunits=2) and mode_nunits[E_RVVMF64BI]=[2,2].
We then get TYPE_VECTOR_SUBPARTS(vbool16_t) == TYPE_VECTOR_SUBPARTS(vbool64_t),
calculated using type_common.precision, resulting in 2. Since vbool16_t and
vbool64_t have the same element type (boolean_type), the compiler treats them
as the same type, leading to a re-declaration conflict.
After all types and intrinsics have been registered, processing
__attribute__((target("arch=+v"))) will update the option parameters and
init_adjust_machine_modes. Therefore, to avoid conflicts, we can choose
zvl=4096b when registering the null types in reinit_builtins().
command option zvl=32b
type         nunits
vbool64_t => null
vbool32_t => [1,1]
vbool16_t => [2,2]
vbool8_t  => [4,4]
vbool4_t  => [8,8]
vbool2_t  => [16,16]
vbool1_t  => [32,32]
reinit zvl=128b
vbool64_t => [2,2]    conflicts with zvl32b vbool16_t => [2,2]
reinit zvl=256b
vbool64_t => [4,4]    conflicts with zvl32b vbool8_t => [4,4]
reinit zvl=512b
vbool64_t => [8,8]    conflicts with zvl32b vbool4_t => [8,8]
reinit zvl=1024b
vbool64_t => [16,16]  conflicts with zvl32b vbool2_t => [16,16]
reinit zvl=2048b
vbool64_t => [32,32]  conflicts with zvl32b vbool1_t => [32,32]
reinit zvl=4096b
vbool64_t => [64,64]  zvl=4096b is ok
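The table can be reproduced with a small model. This is a sketch only: it assumes nunits(vboolN_t) = zvl / N, which is read off the numbers above rather than taken from the GCC sources, and the function names and the 65536 upper bound on the search are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed model, read off the table: nunits(vboolN_t) = zvl / N. */
static unsigned nunits(unsigned zvl_bits, unsigned n)
{
    assert(n != 0);
    return zvl_bits / n;
}

/* True if vbool64_t registered at reinit_zvl has the same nunits as one
   of vbool1_t .. vbool32_t registered at the command-line zvl. */
static bool vbool64_conflicts(unsigned cmd_zvl, unsigned reinit_zvl)
{
    unsigned k = nunits(reinit_zvl, 64);
    for (unsigned n = 1; n <= 32; n *= 2)
        if (nunits(cmd_zvl, n) == k)
            return true;
    return false;
}

/* Smallest power-of-two reinit zvl (from 128b up) without a conflict,
   or 0 if none is found below the hypothetical search limit. */
static unsigned first_safe_reinit_zvl(unsigned cmd_zvl)
{
    for (unsigned zvl = 128; zvl <= 65536; zvl *= 2)
        if (!vbool64_conflicts(cmd_zvl, zvl))
            return zvl;
    return 0;
}
```

Under this model, first_safe_reinit_zvl(32) walks through 128b to 2048b, each of which collides with one of the zvl32b types, and stops at 4096b, matching the last row of the table.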
Signed-off-by: xuli <xuli1@eswincomputing.com>
PR target/116883
gcc/ChangeLog:
* config/riscv/riscv-c.cc (riscv_pragma_intrinsic_flags_pollute): Choose zvl4096b
to initialize null type.
gcc/testsuite/ChangeLog:
* g++.target/riscv/rvv/base/pr116883.C: New test.
|
|
After the valuable feedback I received, it’s clear to me that the
oversight was in the tests showing the benefits of the patch. In the
test file, I added functions f5 and f6, which now generate more
efficient code with fewer instructions.
Before the patch:
f5:
li a4,2097152
addi a4,a4,-2048
li a5,1167360
and a0,a0,a4
addi a5,a5,-2048
beq a0,a5,.L4
f6:
li a5,3407872
addi a5,a5,-2048
and a0,a0,a5
li a5,1114112
beq a0,a5,.L7
After the patch:
f5:
srli a5,a0,11
andi a5,a5,1023
li a4,569
beq a5,a4,.L5
f6:
srli a5,a0,11
andi a5,a5,1663
li a4,544
beq a5,a4,.L9
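The equivalence behind the shorter sequences can be sketched in C. This is an illustration for unsigned 32-bit values only, not the RTL pattern itself; the function names are hypothetical, and the constants used to exercise it are the ones the "before" code materializes for f5 (2097152-2048 and 1167360-2048).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of low zero bits shared by both constants (hypothetical helper,
   not the COMMON_TRAILING_ZEROS macro from the patch). */
static unsigned common_trailing_zeros(uint32_t a, uint32_t b)
{
    uint32_t both = a | b;
    unsigned n = 0;
    assert(both != 0);
    while ((both & 1u) == 0) {
        both >>= 1;
        n++;
    }
    return n;
}

/* Original test: mask, then compare against the full-width constant. */
static bool test_original(uint32_t x, uint32_t mask, uint32_t val)
{
    return (x & mask) == val;
}

/* Reduced test: shift the operand and both constants right by the common
   trailing zero count, so the constants become small. */
static bool test_reduced(uint32_t x, uint32_t mask, uint32_t val)
{
    unsigned tz = common_trailing_zeros(mask, val);
    return ((x >> tz) & (mask >> tz)) == (val >> tz);
}
```

For f5 the common shift is 11, which turns the mask 2095104 into 1023 and the comparison value 1165312 into 569, the immediates seen in the "after" code. Both tests agree for every x because the low 11 bits of the mask and of the comparison value are all zero.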
PR target/115921
gcc/ChangeLog:
* config/riscv/iterators.md (any_eq): New code iterator.
* config/riscv/riscv.h (COMMON_TRAILING_ZEROS): New macro.
(SMALL_AFTER_COMMON_TRAILING_SHIFT): Ditto.
* config/riscv/riscv.md (*branch<ANYI:mode>_shiftedarith_<optab>_shifted):
New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/branch-1.c: Additional tests.
|