Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch adds else operands to masked loads. Currently the default
else operand predicate just accepts "undefined" (i.e. SCRATCH) values.
PR middle-end/115336
PR middle-end/116059
gcc/ChangeLog:
* config/riscv/autovec.md: Add else operand.
* config/riscv/predicates.md (maskload_else_operand): New
predicate.
* config/riscv/riscv-v.cc (get_else_operand): Remove static.
(expand_load_store): Use get_else_operand and adjust index.
(expand_gather_scatter): Ditto.
(expand_lanes_load_store): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr115336.c: New test.
* gcc.target/riscv/rvv/autovec/pr116059.c: New test.
|
|
Currently, the RISC-V target uses the target specific mplt option to
control PLT generation. This patch deprecates the target specific mplt
option and uses the common fplt option instead. This allows users to
use the same option for most targets.
Co-Developed-by: Liao Shihua <shihua@iscas.ac.cn>
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
gcc/ChangeLog:
* config/riscv/predicates.md: Use flag_plt instead of TARGET_PLT.
* config/riscv/riscv.opt: alias common option fplt to mplt.
|
|
We can unify eqne and other comparison operations.
Tested on RV32 and RV64.
gcc/ChangeLog:
* config/riscv/predicates.md (comparison_except_eqge_operator): Only
exclude ge.
(comparison_except_ge_operator): Ditto.
* config/riscv/riscv-string.cc (expand_rawmemchr): Use cmp pattern.
(expand_strcmp): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc: Remove eqne cond.
* config/riscv/vector.md (@pred_eqne<mode>_scalar): Remove eqne
patterns.
(*pred_eqne<mode>_scalar_merge_tie_mask): Ditto.
(*pred_eqne<mode>_scalar): Ditto.
(*pred_eqne<mode>_scalar_narrow): Ditto.
(*pred_eqne<mode>_extended_scalar_merge_tie_mask): Ditto.
(*pred_eqne<mode>_extended_scalar): Ditto.
(*pred_eqne<mode>_extended_scalar_narrow): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/integer-cmp-eqne.c: New test.
|
|
Testing with Zbs enabled by default showed a minor logic error. After
the loop clearing things with bclri, we can only use the sequence if we
were able to clear all the necessary bits. If any bits are still on,
then the bclr sequence turned out to not be profitable.
--
So this is conceptually similar to how we handled direct generation of
bseti for constant synthesis, but this time for bclr.
In the bclr case, we already have an expander for AND. So we just
needed to adjust the predicate to accept another class of constant
operands (those with a single bit clear).
With that in place constant synthesis is adjusted so that it counts the
number of bits clear in the high 33 bits of a 64bit word. If that
number is small relative to the current best cost, then we try to
generate the constant with a lui based sequence for the low half which
implicitly sets the upper 32 bits as well. Then we bclri one or more of
those upper 33 bits.
So as an example, this code goes from 4 instructions down to 3:
> unsigned long foo_0xfffffffbfffff7ff(void) { return
0xfffffffbfffff7ffUL; }
Note the use of 33 bits above. That's meant to capture cases like this:
> unsigned long foo_0xfffdffff7ffff7ff(void) { return
0xfffdffff7ffff7ffUL; }
We can use lui+addi+bclri+bclri to synthesize that in 4 instructions
instead of 5.
I'm including a handful of cases covering the two basic ideas above that
were found by the testing code.
And, no, we're not done yet. I see at least one more notable idiom
missing before exploring zbkb's potential to improve things.
Tested in my tester and waiting on Rivos CI system before moving forward.
gcc/
* config/riscv/predicates.md (arith_operand_or_mode_mask): Renamed to..
(arith_or_mode_mask_or_zbs_operand): New predicate.
* config/riscv/riscv.md (and<mode>3): Update predicate for operand 2.
* config/riscv/riscv.cc (riscv_build_integer_1): Use bclri to clear
bits, particularly bits 31..63 when profitable to do so.
gcc/testsuite/
* gcc.target/riscv/synthesis-6.c: New test.
|
|
... if the constant can be represented as sum of two S12 values.
The two S12 values could instead be fused with subsequent ADD insn.
The helps
- avoid an additional LUI insn
- side benefits of not clobbering a reg
e.g.
w/o patch w/ patch
long | |
plus(unsigned long i) | li a5,4096 |
{ | addi a5,a5,-2032 | addi a0, a0, 2047
return i + 2064; | add a0,a0,a5 | addi a0, a0, 17
} | ret | ret
NOTE: In theory not having const in a standalone reg might seem less
CSE friendly, but for workloads in consideration these mat are
from very late LRA reloads and follow on GCSE is not doing much
currently.
The real benefit however is seen in base+offset computation for array
accesses and especially for stack accesses which are finalized late in
optim pipeline, during LRA register allocation. Often the finalized
offsets trigger LRA reloads resulting in mind boggling repetition of
exact same insn sequence including LUI based constant materialization.
This shaves off 290 billion dynamic instrustions (QEMU icounts) in
SPEC 2017 Cactu benchmark which is over 10% of workload. In the rest of
suite, there additional 10 billion shaved, with both gains and losses
in indiv workloads as is usual with compiler changes.
500.perlbench_r-0 | 1,214,534,029,025 | 1,212,887,959,387 |
500.perlbench_r-1 | 740,383,419,739 | 739,280,308,163 |
500.perlbench_r-2 | 692,074,638,817 | 691,118,734,547 |
502.gcc_r-0 | 190,820,141,435 | 190,857,065,988 |
502.gcc_r-1 | 225,747,660,839 | 225,809,444,357 | <- -0.02%
502.gcc_r-2 | 220,370,089,641 | 220,406,367,876 | <- -0.03%
502.gcc_r-3 | 179,111,460,458 | 179,135,609,723 | <- -0.02%
502.gcc_r-4 | 219,301,546,340 | 219,320,416,956 | <- -0.01%
503.bwaves_r-0 | 278,733,324,691 | 278,733,323,575 | <- -0.01%
503.bwaves_r-1 | 442,397,521,282 | 442,397,519,616 |
503.bwaves_r-2 | 344,112,218,206 | 344,112,216,760 |
503.bwaves_r-3 | 417,561,469,153 | 417,561,467,597 |
505.mcf_r | 669,319,257,525 | 669,318,763,084 |
507.cactuBSSN_r | 2,852,767,394,456 | 2,564,736,063,742 | <+ 10.10%
508.namd_r | 1,855,884,342,110 | 1,855,881,110,934 |
510.parest_r | 1,654,525,521,053 | 1,654,402,859,174 |
511.povray_r | 2,990,146,655,619 | 2,990,060,324,589 |
519.lbm_r | 1,158,337,294,525 | 1,158,337,294,529 |
520.omnetpp_r | 1,021,765,791,283 | 1,026,165,661,394 |
521.wrf_r | 1,715,955,652,503 | 1,714,352,737,385 |
523.xalancbmk_r | 849,846,008,075 | 849,836,851,752 |
525.x264_r-0 | 277,801,762,763 | 277,488,776,427 |
525.x264_r-1 | 927,281,789,540 | 926,751,516,742 |
525.x264_r-2 | 915,352,631,375 | 914,667,785,953 |
526.blender_r | 1,652,839,180,887 | 1,653,260,825,512 |
527.cam4_r | 1,487,053,494,925 | 1,484,526,670,770 |
531.deepsjeng_r | 1,641,969,526,837 | 1,642,126,598,866 |
538.imagick_r | 2,098,016,546,691 | 2,097,997,929,125 |
541.leela_r | 1,983,557,323,877 | 1,983,531,314,526 |
544.nab_r | 1,516,061,611,233 | 1,516,061,407,715 |
548.exchange2_r | 2,072,594,330,215 | 2,072,591,648,318 |
549.fotonik3d_r | 1,001,499,307,366 | 1,001,478,944,189 |
554.roms_r | 1,028,799,739,111 | 1,028,780,904,061 |
557.xz_r-0 | 363,827,039,684 | 363,057,014,260 |
557.xz_r-1 | 906,649,112,601 | 905,928,888,732 |
557.xz_r-2 | 509,023,898,187 | 508,140,356,932 |
997.specrand_fr | 402,535,577 | 403,052,561 |
999.specrand_ir | 402,535,577 | 403,052,561 |
This should still be considered damage control as the real/deeper fix
would be to reduce number of LRA reloads or CSE/anchor those during
LRA constraint sub-pass (re)runs (thats a different PR/114729.
Implementation Details (for posterity)
--------------------------------------
- basic idea is to have a splitter selected via a new predicate for constant
being possible sum of two S12 and provide the transform.
This is however a 2 -> 2 transform which combine can't handle.
So we specify it using a define_insn_and_split.
- the initial loose "i" constraint caused LRA to accept invalid insns thus
needing a tighter new constraint as well.
- An additional fallback alternate with catch-all "r" register
constraint also needed to allow any "reloads" that LRA might
require for ADDI with const larger than S12.
Testing
--------
This is testsuite clean (rv64 only).
I'll rely on post-commit CI multlib run for any possible fallout for
other setups such as rv32.
| | gcc | g++ | gfortran |
| rv64imafdc_zba_zbb_zbs_zicond/ lp64d/ medlow | 41 / 17 | 8 / 3 | 7 / 2 |
| rv64imafdc_zba_zbb_zbs_zicond/ lp64d/ medlow | 41 / 17 | 8 / 3 | 7 / 2 |
I also threw this into a buildroot run, it obviously boots Linux to
userspace. bloat-o-meter on glibc and kernel show overall decrease in
staic instruction counts with some minor spot increases.
These are generally in the category of
- LUI + ADDI are 2 byte each vs. two ADD being 4 byte each.
- Knock on effects due to inlining changes.
- Sometimes the slightly shorter 2-insn seq in a mult-exit function
can cause in-place epilogue duplication (vs. a jump back).
This is slightly larger but more efficient in execution.
In summary nothing to fret about.
| linux/scripts/bloat-o-meter build-gcc-240131/target/lib/libc.so.6 \
build-gcc-240131-new-splitter-1-variant/target/lib/libc.so.6
|
| add/remove: 0/0 grow/shrink: 21/49 up/down: 520/-3056 (-2536)
| Function old new delta
| getnameinfo 2756 2892 +136
...
| tempnam 136 144 +8
| padzero 276 284 +8
...
| __GI___register_printf_specifier 284 280 -4
| __EI_xdr_array 468 464 -4
| try_file_lock 268 260 -8
| pthread_create@GLIBC_2 3520 3508 -12
| __pthread_create_2_1 3520 3508 -12
...
| _nss_files_setnetgrent 932 904 -28
| _nss_dns_gethostbyaddr2_r 1524 1480 -44
| build_trtable 3312 3208 -104
| printf_positional 25000 22580 -2420
| Total: Before=2107999, After=2105463, chg -0.12%
Caveat:
------
Jeff noted during v2 review that the operand0 constraint !riscv_reg_frame_related
could potentially cause issues with hard reg cprop in future. If that
trips things up we will have to loosen the constraint while dialing down
the const range to (-2048 to 2032) as opposed to fll S12 range of
(-2048 to 2047) to keep stack regs aligned.
gcc/ChangeLog:
* config/riscv/riscv.h: New macros to check for sum of two S12
range.
* config/riscv/constraints.md: New constraint.
* config/riscv/predicates.md: New Predicate.
* config/riscv/riscv.md: New splitter.
* config/riscv/riscv.cc (riscv_reg_frame_related): New helper.
* config/riscv/riscv-protos.h: New helper prototype.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/sum-of-two-s12-const-1.c: New test: checks
for new patterns output.
* gcc.target/riscv/sum-of-two-s12-const-2.c: Ditto.
* gcc.target/riscv/sum-of-two-s12-const-3.c: New test: should not
ICE.
Tested-by: Edwin Lu <ewlu@rivosinc.com> # pre-commit-CI #1520
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
In doing some preparation work for using zbkb's pack instructions for constant
synthesis I figured it would be wise to get a sense of how well our constant
synthesis is actually working and address any clear issues.
So the first glaring inefficiency is in our handling of constants with a small
number of bits set. Let's start with just two bits set. There are 2016
distinct constants in that space (rv64). With Zbs enabled the absolute worst
we should ever do is two instructions (bseti+bseti). Yet we have 503 cases
where we're generating 3+ instructions when there's just two bits set in the
constant. A constant like 0x8000000000001000 generates 4 instructions!
This patch adds bseti (and indirectly binvi if we needed it) as a first class
citizen for constant synthesis. There's two components to this change.
First, we can't generate an IOR with a constant like (1 << 45) as an operand.
The IOR/XOR define_insn is in riscv.md. The constant argument for those
patterns must match an arith_operand which means its not really usable for
generating bseti directly in the cases we care about (at least one of the bits
will be in the 32..63 range and thus won't match arith_operand).
We have a few things we could do. One would be to extend the existing pattern
to incorporate bseti cases. But I suspect folks like the separation of the
base architecture (riscv.md) from the Zb* extensions (bitmanip.md). We could
also try to generate the RTL for bseti
directly, bypassing gen_fmt_ee (which forces undesirable constants into registers based on the predicate of the appropriate define_insn). Neither of these seemed particularly appealing to me.
So what I've done instead is to make ior/xor a define_expand and have the
expander allow a wider set of constant operands when Zbs is enabled. That
allows us to keep the bulk of Zb* support inside bitmanip.md and continue to
use gen_fmt_ee in the constant synthesis paths.
Note the code generation in this case is designed to first set as many bits as
we can with lui, then with addi since those can both set multiple bits at a
time. If there are any residual bits left to set we can emit bseti
instructions up to the current cost ceiling.
This results in fixing all of the 503 2-bit set cases where we emitted too many
instructions. It also significantly helps other scenarios with more bits set.
The testcase I'm including verifies the number of instructions we generate for
the full set of 2016 possible cases. Obviously this won't be possible as we
increase the number of bits (there are something like 48k cases with just 3
bits set).
gcc/
* config/riscv/predicates.md (arith_or_zbs_operand): New predicate.
* config/riscv/riscv.cc (riscv_build_integer_one): Use bseti to set
single bits when profitable.
* config/riscv/riscv.md (*<optab><mode>3): Renamed with '*' prefix.
(<optab><mode>3): New expander for IOR/XOR.
gcc/testsuite
* gcc.target/riscv/synthesis-1.c: New test.
|
|
Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md
Contributors:
Mary Bennett <mary.bennett@embecosm.com>
Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Pietra Ferreira <pietra.ferreira@embecosm.com>
Charlie Keaney
Jessica Mills
Craig Blackmore <craig.blackmore@embecosm.com>
Simon Cook <simon.cook@embecosm.com>
Jeremy Bennett <jeremy.bennett@embecosm.com>
Helene Chelin <helene.chelin@embecosm.com>
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Create XCVbi extension
support.
* config/riscv/riscv.opt: Likewise.
* config/riscv/corev.md: Implement cv_branch<mode> pattern
for cv.beqimm and cv.bneimm.
* config/riscv/riscv.md: Add CORE-V branch immediate to RISC-V
branch instruction pattern.
* config/riscv/constraints.md: Implement constraints
cv_bi_s5 - signed 5-bit immediate.
* config/riscv/predicates.md: Implement predicate
const_int5s_operand - signed 5 bit immediate.
* doc/sourcebuild.texi: Add XCVbi documentation.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-bi-beqimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-beqimm-compile-2.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-2.c: New test.
* lib/target-supports.exp: Add proc for XCVbi.
|
|
Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md
Contributors:
Mary Bennett <mary.bennett@embecosm.com>
Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Pietra Ferreira <pietra.ferreira@embecosm.com>
Charlie Keaney
Jessica Mills
Craig Blackmore <craig.blackmore@embecosm.com>
Simon Cook <simon.cook@embecosm.com>
Jeremy Bennett <jeremy.bennett@embecosm.com>
Helene Chelin <helene.chelin@embecosm.com>
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add XCVbitmanip.
* config/riscv/constraints.md: Likewise.
* config/riscv/corev.def: Likewise.
* config/riscv/corev.md: Likewise.
* config/riscv/predicates.md: Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
* config/riscv/riscv-ftypes.def: Likewise.
* config/riscv/riscv.opt: Likewise.
* config/riscv/riscv.cc (riscv_print_operand): Add new operand 'Y'.
* doc/extend.texi: Add XCVbitmanip builtin documentation.
* doc/sourcebuild.texi: Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-simd-abs-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-abs-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-and-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-and-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-and-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-and-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avg-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avg-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avg-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avg-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avgu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avgu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avgu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avgu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpeq-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpeq-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpeq-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpeq-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpge-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpge-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpge-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpge-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgeu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgeu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgeu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgeu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgt-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgt-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgt-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgt-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgtu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgtu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgtu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgtu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmple-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmple-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmple-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmple-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpleu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpleu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpleu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpleu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmplt-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmplt-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmplt-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmplt-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpltu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpltu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpltu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpltu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpne-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpne-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpne-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpne-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxconj-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-i-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-i-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-i-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-i-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-r-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-r-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-r-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-r-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotsp-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotsp-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotsp-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotsp-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotup-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotup-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotup-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotup-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotusp-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotusp-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotusp-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotusp-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-extract-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-extract-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-extractu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-extractu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-insert-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-insert-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-march-compile-1.c: New test.
* gcc.target/riscv/cv-simd-max-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-max-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-max-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-max-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-maxu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-maxu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-maxu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-maxu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-min-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-min-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-min-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-min-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-minu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-minu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-minu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-minu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-neg-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-neg-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-or-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-or-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-or-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-or-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-pack-compile-1.c: New test.
* gcc.target/riscv/cv-simd-pack-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-packhi-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-packlo-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotsp-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotsp-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotsp-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotsp-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotup-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotup-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotup-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotup-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotusp-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotusp-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotusp-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotusp-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shuffle-sci-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shuffle2-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shuffle2-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shufflei0-sci-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shufflei1-sci-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shufflei2-sci-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shufflei3-sci-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sll-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sll-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sll-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sll-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sra-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sra-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sra-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sra-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-srl-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-srl-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-srl-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-srl-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-subrotmj-compile-1.c: New test.
* gcc.target/riscv/cv-simd-subrotmj-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-subrotmj-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-subrotmj-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-xor-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-xor-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-xor-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-xor-sc-h-compile-1.c: New test.
* lib/target-supports.exp: Add proc for XCVsimd extension.
|
|
This patch is to handle the differences in instruction generation
between Vector and XTheadVector. In this version, we only support
partial xtheadvector instructions that leverage directly from current
RVV1.0 with simple adding "th." prefix. For different name xtheadvector
instructions but share same patterns as RVV1.0 instructions, we will
use ASM targethook to rewrite the whole string of the instructions in
the following patches.
For some vector patterns that cannot be avoided, we use
"!TARGET_XTHEADVECTOR" to disable them in vector.md in order
not to generate instructions that xtheadvector does not support,
like vmv1r.
gcc/ChangeLog:
* config.gcc: Add files for XTheadVector intrinsics.
* config/riscv/autovec.md: Guard XTheadVector.
* config/riscv/predicates.md: Disable immediate vl
for XTheadVector.
* config/riscv/riscv-c.cc (riscv_pragma_intrinsic):
Add pragma for XTheadVector.
* config/riscv/riscv-string.cc (riscv_expand_block_move):
Guard XTheadVector.
* config/riscv/riscv-v.cc (vls_mode_valid_p):
Avoid autovec.
* config/riscv/riscv-vector-builtins-bases.cc:
Do not normalize vsetvl instructions for XTheadVector.
* config/riscv/riscv-vector-builtins-shapes.cc (check_type):
New check type function.
(build_one): Adjust for XTheadVector.
* config/riscv/riscv-vector-switch.def (ENTRY):
Disable fractional mode for the XTheadVector extension.
(TUPLE_ENTRY): Likewise.
* config/riscv/riscv.cc (riscv_v_adjust_bytesize):
Guard XTheadVector.
(riscv_preferred_simd_mode): Likewsie.
(riscv_autovectorize_vector_modes): Likewise.
(riscv_vector_mode_supported_any_target_p): Likewise.
(TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P): Likewise.
* config/riscv/thead.cc (th_asm_output_opcode):
Rewrite vsetvl instructions.
* config/riscv/vector.md:
Include thead-vector.md and change fractional LMUL
into 1 for vbool.
* config/riscv/riscv_th_vector.h: New file.
* config/riscv/thead-vector.md: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pragma-1.c: Add XTheadVector.
* gcc.target/riscv/rvv/base/abi-1.c: Exclude XTheadVector.
* lib/target-supports.exp: Add target for XTheadVector.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
|
|
gcc/ChangeLog:
* config/riscv/predicates.md (move_operand): Reject symbolic operands
with a type SYMBOL_FORCE_TO_MEM.
(call_insn_operand): Support for CM_Large.
(pcrel_symbol_operand): New.
* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add builtin_define
"__riscv_cmodel_large".
* config/riscv/riscv-opts.h (riscv_code_model): Add CM_LARGE.
* config/riscv/riscv-protos.h (riscv_symbol_type): Add
SYMBOL_FORCE_TO_MEM.
* config/riscv/riscv.cc (riscv_classify_symbol) Support CM_LARGE model.
(riscv_symbol_insns) Add SYMBOL_FORCE_TO_MEM.
(riscv_cannot_force_const_mem): Ditto.
(riscv_split_symbol): Ditto.
(riscv_force_address): Check pseudo reg available before force_reg.
(riscv_size_ok_for_small_data_p): Disable in CM_LARGE model.
(riscv_can_use_per_function_literal_pools_p): New.
(riscv_elf_select_rtx_section): Handle per-function literal pools.
(riscv_output_mi_thunk): Add riscv_in_thunk_func.
(riscv_option_override): Support CM_LARGE model.
(riscv_function_ok_for_sibcall): Disable sibcalls in CM_LARGE model.
(riscv_in_thunk_func): New static.
* config/riscv/riscv.md (unspec): Define UNSPEC_FORCE_FOR_MEM.
(*large_load_address): New.
* config/riscv/riscv.opt (code_model): New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/large-model.c: New test.
|
|
A handful of the scalar crypto instructions are supposed to take a
constant integer argument 0..3 inclusive and one should accept 0..10.
A suitable constraint was created and used for this purpose (D03 and DsA),
but the operand's predicate is "register_operand". That's just wrong.
This patch adds a new predicates "const_0_3_operand" and "const_0_10_operand"
and fixes the relevant insns to use the appropriate predicate. It drops the
now unnecessary constraints.
The testsuite was broken in a way that made it consistent with the
compiler, so the tests passed, when they really should have been issuing
errors all along.
This patch adjusts the existing tests so that they all expect a
diagnostic on the invalid operand usage (including out of range
constants). It adds new tests with proper constants, testing the
extremes of valid values.
PR target/110201
gcc/
* config/riscv/constraints.md (D03, DsA): Remove unused constraints.
* config/riscv/predicates.md (const_0_3_operand): New predicate.
(const_0_10_operand): Likewise.
* config/riscv/crypto.md (riscv_aes32dsi): Use new predicate. Drop
unnecessary constraint.
(riscv_aes32dsmi, riscv_aes64im, riscv_aes32esi): Likewise.
(riscv_aes32esmi, *riscv_<sm4_op>_si): Likewise.
(riscv_<sm4_op>_di_extend, riscv_<sm4_op>_si): Likewise.
gcc/testsuite
* gcc.target/riscv/zknd32.c: Verify diagnostics are issued for
invalid builtin arguments.
* gcc.target/riscv/zknd64.c: Likewise.
* gcc.target/riscv/zkne32.c: Likewise.
* gcc.target/riscv/zkne64.c: Likewise.
* gcc.target/riscv/zksed32.c: Likewise.
* gcc.target/riscv/zksed64.c: Likewise.
* gcc.target/riscv/zknd32-2.c: New test
* gcc.target/riscv/zknd64-2.c: Likewise.
* gcc.target/riscv/zkne32-2.c: Likewise.
* gcc.target/riscv/zkne64-2.c: Likewise.
* gcc.target/riscv/zksed32-2.c: Likewise.
* gcc.target/riscv/zksed64-2.c: Likewise.
Co-authored-by: Liao Shihua <shihua@iscas.ac.cn>
|
|
Come back to review the codes of gather/scatter, notice gather_scatter_valid_offset_mode_p looks odd.
gather_scatter_valid_offset_mode_p is supposed to block vluxei64/vsuxei64 in RV32 system.
However, it failed to do that since it is passing data_mode instead of index mode:
riscv_vector::gather_scatter_valid_offset_mode_p (<RATIO2:MODE>mode)
It should be RATIO2I instead of RATIO2.
So we have this following iterators which already can block the this situation:
(define_mode_iterator RATIO8I [
RVVM1QI
RVVM2HI
RVVM4SI
(RVVM8DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])
We can see TARGET_64BIT to block EEW64 index mode on RV32 system.
So, gather_scatter_valid_offset_mode_p is no longer needed.
After remove it, I find due to incorrect gather_scatter_valid_offset_mode_p.
We failed to vectorize such case in RV32 in the past:
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
T (int64_t, 8)
TEST_ALL (TEST_LOOP)
https://godbolt.org/z/T3ara3fM3
Checked compiler explorer, we can see GCC failed to vectorize it but Clang can vectorize it.
So adapt the tests checking vectorization cases from 8 -> 11.
Confirm we have same behavior as Clang now.
Tested on zvl128/zvl256/zvl512/zvl1024 no regression.
Note this is not an optimization patch, it's buggy codes fix patch.
gcc/ChangeLog:
* config/riscv/autovec.md
(mask_len_gather_load<RATIO1:mode><RATIO1:mode>):
Remove gather_scatter_valid_offset_mode_p.
(mask_len_gather_load<mode><mode>): Ditto.
(mask_len_scatter_store<RATIO1:mode><RATIO1:mode>): Ditto.
(mask_len_scatter_store<mode><mode>): Ditto.
* config/riscv/predicates.md (const_1_or_8_operand): New predicate.
(vector_gs_scale_operand_64): Remove.
* config/riscv/riscv-protos.h (gather_scatter_valid_offset_mode_p): Remove.
* config/riscv/riscv-v.cc (expand_gather_scatter): Refine code.
(gather_scatter_valid_offset_mode_p): Remove.
* config/riscv/vector-iterators.md: Fix iterator bugs.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-1.c: Adapt test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-9.c: Ditto.
|
|
Remove our RISC-V-specific `order_operator' predicate, which is exactly
the same as generic `ordered_comparison_operator' one.
gcc/
* config/riscv/predicates.md (order_operator): Remove predicate.
* config/riscv/riscv.cc (riscv_rtx_costs): Update accordingly.
* config/riscv/riscv.md (*branch<mode>, *mov<GPR:mode><X:mode>cc)
(cstore<mode>4): Likewise.
|
|
Do not expand floating-point conditional-branch RTL instructions right
away that use a comparison operation that is either directly available
as a machine conditional-set instruction or is NE, which can be emulated
by EQ. This is so that if-conversion sees them in their original form
and can produce fewer operations tried in a branchless code sequence
compared to when such an instruction has been already converted to a
sequence of a floating-point conditional-set RTL instruction followed by
an integer conditional-branch RTL instruction. Split any floating-point
conditional-branch RTL instructions still remaining after reload then.
Adjust the testsuite accordingly: since the middle end uses the inverse
condition internally, an inverse conditional-set instruction may make it
to assembly output and also `cond_move_process_if_block' will be used by
if-conversion rather than `noce_process_if_block', because the latter
function not yet been updated to handle inverted conditions.
gcc/
* config/riscv/predicates.md (ne_operator): New predicate.
* config/riscv/riscv.cc (riscv_insn_cost): Handle branches on a
floating-point condition.
* config/riscv/riscv.md (@cbranch<mode>4): Rename expander to...
(@cbranch<ANYF:mode>4): ... this. Only expand the RTX via
`riscv_expand_conditional_branch' for `!signed_order_operator'
operators, otherwise let it through.
(*cbranch<ANYF:mode>4, *cbranch<ANYF:mode>4): New insns and
splitters.
gcc/testsuite/
* gcc.target/riscv/movdifge-sfb.c: Reject "if-conversion
succeeded through" rather than accepting it.
* gcc.target/riscv/movdifge-thead.c: Likewise.
* gcc.target/riscv/movdifge-ventana.c: Likewise.
* gcc.target/riscv/movdifge-zicond.c: Likewise.
* gcc.target/riscv/movdifgt-sfb.c: Likewise.
* gcc.target/riscv/movdifgt-thead.c: Likewise.
* gcc.target/riscv/movdifgt-ventana.c: Likewise.
* gcc.target/riscv/movdifgt-zicond.c: Likewise.
* gcc.target/riscv/movdifle-sfb.c: Likewise.
* gcc.target/riscv/movdifle-thead.c: Likewise.
* gcc.target/riscv/movdifle-ventana.c: Likewise.
* gcc.target/riscv/movdifle-zicond.c: Likewise.
* gcc.target/riscv/movdiflt-sfb.c: Likewise.
* gcc.target/riscv/movdiflt-thead.c: Likewise.
* gcc.target/riscv/movdiflt-ventana.c: Likewise.
* gcc.target/riscv/movdiflt-zicond.c: Likewise.
* gcc.target/riscv/movsifge-sfb.c: Likewise.
* gcc.target/riscv/movsifge-thead.c: Likewise.
* gcc.target/riscv/movsifge-ventana.c: Likewise.
* gcc.target/riscv/movsifge-zicond.c: Likewise.
* gcc.target/riscv/movsifgt-sfb.c: Likewise.
* gcc.target/riscv/movsifgt-thead.c: Likewise.
* gcc.target/riscv/movsifgt-ventana.c: Likewise.
* gcc.target/riscv/movsifgt-zicond.c: Likewise.
* gcc.target/riscv/movsifle-sfb.c: Likewise.
* gcc.target/riscv/movsifle-thead.c: Likewise.
* gcc.target/riscv/movsifle-ventana.c: Likewise.
* gcc.target/riscv/movsifle-zicond.c: Likewise.
* gcc.target/riscv/movsiflt-sfb.c: Likewise.
* gcc.target/riscv/movsiflt-thead.c: Likewise.
* gcc.target/riscv/movsiflt-ventana.c: Likewise.
* gcc.target/riscv/movsiflt-zicond.c: Likewise.
* gcc.target/riscv/smax-ieee.c: Also accept FLT.D.
* gcc.target/riscv/smaxf-ieee.c: Also accept FLT.S.
* gcc.target/riscv/smin-ieee.c: Also accept FGT.D.
* gcc.target/riscv/sminf-ieee.c: Also accept FGT.S.
|
|
Provide RTL expansion of conditional-move operations for generic targets
using a suitable sequence of base integer machine instructions according
to cost evaluation by if-conversion. Add `-mmovcc' command line option
to enable this transformation, off by default.
For the generic sequences small immediates as per the `arith_operand'
predicate are cost-equivalent to registers as we can use them as input,
alternative to a register, to the respective AND[I] machine operations,
however we need to reject immediates fulfilling `lui_operand', because
they would require reloading into a register, making the operation more
costly. Therefore add `movcc_operand' predicate and use it accordingly.
There is a need to adjust zbs-bext-02.c, which can also serve as emitted
code example, because with certain compilation options an AND operation
can now legitimately appear in output despite BEXT having been produced
as expected, such as with `-march=rv64gc -O2':
foo:
mv a3,a0
li a5,0
mv a0,a1
li a2,64
li a1,1
.L3:
sll a4,a1,a5
and a4,a4,a3
addiw a5,a5,1
beq a4,zero,.L2
addiw a0,a0,1
.L2:
bne a5,a2,.L3
ret
vs `-march=rv64gc_zbs -O2':
foo:
mv a4,a0
li a5,0
mv a0,a1
li a3,64
.L3:
bext a2,a4,a5
beq a2,zero,.L2
addiw a0,a0,1
.L2:
addiw a5,a5,1
bne a5,a3,.L3
ret
and then with `-march=rv64gc -mmovcc -mbranch-cost=7':
foo:
mv a6,a0
li a4,0
mv a0,a1
li a7,1
li a1,64
.L3:
sll a5,a7,a4
and a5,a5,a6
snez a5,a5
neg a5,a5
not a2,a5
addiw a3,a0,1
and a5,a5,a3
and a0,a2,a0
addiw a4,a4,1
or a0,a5,a0
bne a4,a1,.L3
ret
vs `-march=rv64gc_zbs -mmovcc -mbranch-cost=7':
foo:
mv a6,a0
li a4,0
mv a0,a1
li a1,64
.L3:
bext a5,a6,a4
neg a5,a5
not a2,a5
addiw a3,a0,1
and a5,a5,a3
and a0,a2,a0
addiw a4,a4,1
or a0,a5,a0
bne a4,a1,.L3
ret
However BEXT is supposed to replace an SLL operation so adjust the test
case to reject SLL rather than AND, letting the test case pass even with
`/-mmovcc/-mbranch-cost=7' specified as DejaGNU test flags (and in the
absence of target-specific conditional-move operations enabled either by
default or with other test flags).
gcc/
* config/riscv/predicates.md (movcc_operand): New predicate.
* config/riscv/riscv.cc (riscv_expand_conditional_move): Handle
generic targets.
* config/riscv/riscv.md (mov<mode>cc): Likewise.
* config/riscv/riscv.opt (mmovcc): New option.
* doc/invoke.texi (Option Summary): Document it.
gcc/testsuite/
* gcc.target/riscv/zbs-bext-02.c: Adjust to reject SLL rather
than AND.
|
|
An ICE was discovered in recent rounding autovec support:
config/riscv/riscv-v.cc:4314
65 | }
| ^
0x1fa5223 riscv_vector::validate_change_or_fail(rtx_def*, rtx_def**,
rtx_def*, bool)
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-v.cc:4314
0x1fb1aa2 pre_vsetvl::remove_avl_operand()
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3342
0x1fb18c1 pre_vsetvl::cleaup()
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3308
0x1fb216d pass_vsetvl::lazy_vsetvl()
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3480
0x1fb2214 pass_vsetvl::execute(function*)
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3504
The root cause is that the RA reload into (set (reg) vec_duplicate:DI). However, it is not valid in RV32 system
since we don't have a single broadcast instruction DI scalar in RV32 system.
We should expand it early for RV32 system.
gcc/ChangeLog:
* config/riscv/predicates.md: Adapt predicate.
* config/riscv/riscv-protos.h (can_be_broadcasted_p): New function.
* config/riscv/riscv-v.cc (can_be_broadcasted_p): Ditto.
* config/riscv/vector.md (vec_duplicate<mode>): New pattern.
(*vec_duplicate<mode>): Adapt vec_duplicate insn pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/sew64-rv32.c: New test.
|
|
Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md
Contributors:
Mary Bennett <mary.bennett@embecosm.com>
Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Pietra Ferreira <pietra.ferreira@embecosm.com>
Charlie Keaney
Jessica Mills
Craig Blackmore <craig.blackmore@embecosm.com>
Simon Cook <simon.cook@embecosm.com>
Jeremy Bennett <jeremy.bennett@embecosm.com>
Helene Chelin <helene.chelin@embecosm.com>
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add the XCValu
extension.
* config/riscv/constraints.md: Add builtins for the XCValu
extension.
* config/riscv/predicates.md (immediate_register_operand):
Likewise.
* config/riscv/corev.def: Likewise.
* config/riscv/corev.md: Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
(RISCV_ATYPE_UHI): Likewise.
* config/riscv/riscv-ftypes.def: Likewise.
* config/riscv/riscv.opt: Likewise.
* config/riscv/riscv.cc (riscv_print_operand): Likewise.
* doc/extend.texi: Add XCValu documentation.
* doc/sourcebuild.texi: Likewise.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Add proc for the XCValu extension.
* gcc.target/riscv/cv-alu-compile.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addrn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addun.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addurn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-clip.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-clipu.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-subn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-subrn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-subun.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-suburn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile.c: New test.
|
|
Now GCC middle-end can support undefined value which is traslated into (scratch:mode).
This patch is to enable RISC-V backend undefine value in ELSE value of COND_LEN_xxx/COND_xxx.
Consider this following case:
__attribute__((noipa))
void vrem_int8_t (int8_t * __restrict dst, int8_t * __restrict a, int8_t * __restrict b, int n)
{
for (int i = 0; i < n; i++)
dst[i] = a[i] % b[i];
}
Before this patch:
vrem_int8_t:
ble a3,zero,.L5
vsetvli a5,zero,e8,m1,ta,ma
vmv.v.i v4,0 ---> redundant.
.L3:
vsetvli a5,a3,e8,m1,tu,ma ---> should be TA.
vmv1r.v v1,v4 ---> redudant.
vle8.v v3,0(a1)
vle8.v v2,0(a2)
sub a3,a3,a5
vrem.vv v1,v3,v2
vse8.v v1,0(a0)
add a1,a1,a5
add a2,a2,a5
add a0,a0,a5
bne a3,zero,.L3
.L5:
ret
After this patch:
vrem_int8_t:
ble a3,zero,.L5
.L3:
vsetvli a5,a3,e8,m1,ta,ma
vle8.v v1,0(a1)
vle8.v v2,0(a2)
sub a3,a3,a5
vrem.vv v1,v1,v2
vse8.v v1,0(a0)
add a1,a1,a5
add a2,a2,a5
add a0,a0,a5
bne a3,zero,.L3
.L5:
ret
PR target/110751
gcc/ChangeLog:
* config/riscv/autovec.md: Enable scratch rtx in ELSE operand.
* config/riscv/predicates.md (autovec_else_operand): New predicate.
* config/riscv/riscv-v.cc (get_else_operand): New function.
(expand_cond_len_unop): Adapt ELSE value.
(expand_cond_len_binop): Ditto.
(expand_cond_len_ternop): Ditto.
* config/riscv/riscv.cc (riscv_preferred_else_value): New function.
(TARGET_PREFERRED_ELSE_VALUE): New targethook.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adapt test.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-9.c: Ditto.
|
|
When stride == element width, vlsse should be optimized into vle.v.
vsse should be optimized into vse.v.
PR target/111450
gcc/ChangeLog:
* config/riscv/constraints.md (c01): const_int 1.
(c02): const_int 2.
(c04): const_int 4.
(c08): const_int 8.
* config/riscv/predicates.md (vector_eew8_stride_operand): New predicate for stride operand.
(vector_eew16_stride_operand): Ditto.
(vector_eew32_stride_operand): Ditto.
(vector_eew64_stride_operand): Ditto.
* config/riscv/vector-iterators.md: New iterator for stride operand.
* config/riscv/vector.md: Add stride = element width constraint.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr111450.c: New test.
|
|
This little rename vector_gs_scale_operand_16/32 to more generic names
const_1_or_2/4_operand. So it's a little better understood when offered
for use elsewhere.
gcc/ChangeLog:
* config/riscv/predicates.md (const_1_or_2_operand): Rename.
(const_1_or_4_operand): Ditto.
(vector_gs_scale_operand_16): Ditto.
(vector_gs_scale_operand_32): Ditto.
* config/riscv/vector-iterators.md: Adjust.
|
|
The code changes are from Palmer.
root cause:
In a gcc build with --enable-checking=yes, REGNO (op) checks
rtx code and expected code 'reg'. so a rtx with 'subreg' causes
an internal compiler error.
solution:
Restrict predicate to allow 'reg' only.
gcc/ChangeLog:
* config/riscv/predicates.md: Restrict predicate
to allow 'reg' only.
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The vector shift immediates happen to have the same constraints as some
of the CSR-related operands, but it's a different usage. This adds a
name for them, so I don't get confused again next time.
gcc/ChangeLog:
* config/riscv/autovec.md (shifts): Use
vector_scalar_shift_operand.
* config/riscv/predicates.md (vector_scalar_shift_operand): New
predicate.
|
|
Signed-off-by: Die Li <lidie@eswincomputing.com>
Co-Authored-By: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:
* config/riscv/peephole.md: New pattern.
* config/riscv/predicates.md (a0a1_reg_operand): New predicate.
(zcmp_mv_sreg_operand): New predicate.
* config/riscv/riscv.md: New predicate.
* config/riscv/zc.md (*mva01s<X:mode>): New pattern.
(*mvsa01<X:mode>): New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cm_mv_rv32.c: New test.
|
|
Zcmp can share the same logic as save-restore in stack allocation: pre-allocation
by cm.push, step 1 and step 2.
Pre-allocation not only saves callee saved GPRs, but also saves callee saved FPRs and
local variables if any.
Please be noted cm.push pushes ra, s0-s11 in reverse order than what save-restore does.
So adaption has been done in .cfi directives in my patch.
gcc/ChangeLog:
* config/riscv/iterators.md
(slot0_offset): slot 0 offset in stack GPRs area in bytes
(slot1_offset): slot 1 offset in stack GPRs area in bytes
(slot2_offset): likewise
(slot3_offset): likewise
(slot4_offset): likewise
(slot5_offset): likewise
(slot6_offset): likewise
(slot7_offset): likewise
(slot8_offset): likewise
(slot9_offset): likewise
(slot10_offset): likewise
(slot11_offset): likewise
(slot12_offset): likewise
* config/riscv/predicates.md
(stack_push_up_to_ra_operand): predicates of stack adjust pushing ra
(stack_push_up_to_s0_operand): predicates of stack adjust pushing ra, s0
(stack_push_up_to_s1_operand): likewise
(stack_push_up_to_s2_operand): likewise
(stack_push_up_to_s3_operand): likewise
(stack_push_up_to_s4_operand): likewise
(stack_push_up_to_s5_operand): likewise
(stack_push_up_to_s6_operand): likewise
(stack_push_up_to_s7_operand): likewise
(stack_push_up_to_s8_operand): likewise
(stack_push_up_to_s9_operand): likewise
(stack_push_up_to_s11_operand): likewise
(stack_pop_up_to_ra_operand): predicates of stack adjust poping ra
(stack_pop_up_to_s0_operand): predicates of stack adjust poping ra, s0
(stack_pop_up_to_s1_operand): likewise
(stack_pop_up_to_s2_operand): likewise
(stack_pop_up_to_s3_operand): likewise
(stack_pop_up_to_s4_operand): likewise
(stack_pop_up_to_s5_operand): likewise
(stack_pop_up_to_s6_operand): likewise
(stack_pop_up_to_s7_operand): likewise
(stack_pop_up_to_s8_operand): likewise
(stack_pop_up_to_s9_operand): likewise
(stack_pop_up_to_s11_operand): likewise
* config/riscv/riscv-protos.h
(riscv_zcmp_valid_stack_adj_bytes_p):declaration
* config/riscv/riscv.cc (struct riscv_frame_info): comment change
(riscv_avoid_multi_push): helper function of riscv_use_multi_push
(riscv_use_multi_push): true if multi push is used
(riscv_multi_push_sregs_count): num of sregs in multi-push
(riscv_multi_push_regs_count): num of regs in multi-push
(riscv_16bytes_align): align to 16 bytes
(riscv_stack_align): moved to a better place
(riscv_save_libcall_count): no functional change
(riscv_compute_frame_info): add zcmp frame info
(riscv_for_each_saved_reg): save or restore fprs in specified slot for zcmp
(riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push
(riscv_gen_multi_push_pop_insn): gen function for multi push and pop
(get_multi_push_fpr_mask): get mask for the fprs pushed by cm.push
(riscv_expand_prologue): allocate stack by cm.push
(riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret]
(riscv_expand_epilogue): allocate stack by cm.pop[ret]
(zcmp_base_adj): calculate stack adjustment base size
(zcmp_additional_adj): calculate stack adjustment additional size
(riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment valid
* config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra
(S0_MASK): likewise
(S1_MASK): likewise
(S2_MASK): likewise
(S3_MASK): likewise
(S4_MASK): likewise
(S5_MASK): likewise
(S6_MASK): likewise
(S7_MASK): likewise
(S8_MASK): likewise
(S9_MASK): likewise
(S10_MASK): likewise
(S11_MASK): likewise
(MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most
(ZCMP_MAX_SPIMM): max spimm value
(ZCMP_SP_INC_STEP): zcmp sp increment step
(ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10
(ZCMP_S0S11_SREGS_COUNTS): num of s0-s11
(ZCMP_MAX_GRP_SLOTS): max slots of pushing and poping in zcmp
(CALLEE_SAVED_FREG_NUMBER): get x of fsx(fs0 ~ fs11)
* config/riscv/riscv.md: include zc.md
* config/riscv/zc.md: New file. machine description for zcmp
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rv32e_zcmp.c: New test.
* gcc.target/riscv/rv32i_zcmp.c: New test.
* gcc.target/riscv/zcmp_push_fpr.c: New test.
* gcc.target/riscv/zcmp_stack_alignment.c: New test.
|
|
This patch fix PR110943 which will produce some error code. This is because
the error combine of some pred_mov pattern. Consider this code:
```
void foo9 (void *base, void *out, size_t vl)
{
int64_t scalar = *(int64_t*)(base + 100);
vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
*(vint64m2_t*)out = v;
}
```
RTL before combine pass:
```
(insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
(if_then_else:RVVM2DI (unspec:RVVMF32BI [
(const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(const_int 1 [0x1])
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(const_vector:RVVM2DI repeat [
(const_int 0 [0])
])
(unspec:RVVM2DI [
(reg:SI 0 zero)
] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
(insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
(reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole})
```
RTL after combine pass:
```
(insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
(if_then_else:RVVM2DI (unspec:RVVMF32BI [
(const_vector:RVVMF32BI repeat [
(const_int 1 [0x1])
])
(const_int 1 [0x1])
(const_int 2 [0x2]) repeated x2
(const_int 0 [0])
(reg:SI 66 vl)
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)
(const_vector:RVVM2DI repeat [
(const_int 0 [0])
])
(unspec:RVVM2DI [
(reg:SI 0 zero)
] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
```
This combine change the semantics of insn 14. I split @pred_mov pattern and
restrict the conditon of @pred_mov.
PR target/110943
gcc/ChangeLog:
* config/riscv/predicates.md (vector_const_int_or_double_0_operand):
New predicate.
* config/riscv/riscv-vector-builtins.cc (function_expander::function_expander):
force_reg mem target operand.
* config/riscv/vector.md (@pred_mov<mode>): Wrapper.
(*pred_mov<mode>): Remove imm -> reg pattern.
(*pred_broadcast<mode>_imm): Add imm -> reg pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Adjust.
* gcc.target/riscv/rvv/base/pr110943.c: New test.
|
|
Hi,
This patch revert the convert from vmv.s.x to vmv.v.i and add new pattern
optimize the special case when the scalar operand is zero.
Currently, the broadcast pattern where the scalar operand is a imm
will be converted to vmv.v.i from vmv.s.x and the mask operand will be
converted from 00..01 to 11..11. There are some advantages and
disadvantages before and after the conversion after discussing
with Juzhe offline and we chose not to do this transform.
Before:
Advantages: The vsetvli info required by vmv.s.x has better compatibility since
vmv.s.x only required SEW and VLEN be zero or one. That mean there
is more opportunities to combine with other vsetlv infos in vsetvl pass.
Disadvantages: For non-zero scalar imm, one more `li rd, imm` instruction
will be needed.
After:
Advantages: No need `li rd, imm` instruction since vmv.v.i support imm operand.
Disadvantages: Like before's advantages. Worse compatibility leads to more
vsetvl instrunctions need.
Consider the bellow C code and asm after autovec.
there is an extra insn (vsetivli zero, 1, e32, m1, ta, ma)
after converted vmv.s.x to vmv.v.i.
```
int foo1(int* restrict a, int* restrict b, int *restrict c, int n) {
int sum = 0;
for (int i = 0; i < n; i++)
sum += a[i] * b[i];
return sum;
}
```
asm (Before):
```
foo1:
ble a3,zero,.L7
vsetvli a2,zero,e32,m1,ta,ma
vmv.v.i v1,0
.L6:
vsetvli a5,a3,e32,m1,tu,ma
slli a4,a5,2
sub a3,a3,a5
vle32.v v2,0(a0)
vle32.v v3,0(a1)
add a0,a0,a4
add a1,a1,a4
vmacc.vv v1,v3,v2
bne a3,zero,.L6
vsetvli a2,zero,e32,m1,ta,ma
vmv.s.x v2,zero
vredsum.vs v1,v1,v2
vmv.x.s a0,v1
ret
.L7:
li a0,0
ret
```
asm (After):
```
foo1:
ble a3,zero,.L4
vsetvli a2,zero,e32,m1,ta,ma
vmv.v.i v1,0
.L3:
vsetvli a5,a3,e32,m1,tu,ma
slli a4,a5,2
sub a3,a3,a5
vle32.v v2,0(a0)
vle32.v v3,0(a1)
add a0,a0,a4
add a1,a1,a4
vmacc.vv v1,v3,v2
bne a3,zero,.L3
vsetivli zero,1,e32,m1,ta,ma
vmv.v.i v2,0
vsetvli a2,zero,e32,m1,ta,ma
vredsum.vs v1,v1,v2
vmv.x.s a0,v1
ret
.L4:
li a0,0
ret
```
Best,
Lehua
Co-Authored-By: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
gcc/ChangeLog:
* config/riscv/predicates.md (vector_const_0_operand): New.
* config/riscv/vector.md (*pred_broadcast<mode>_zero): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/scalar_move-5.c: Update.
* gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.
|
|
Currently, autovec_length_operand predicate incorrect configuration is
discovered in PR110989 since this following situation:
vect__6.24_107 = .MASK_LEN_LOAD (vectp.22_105, 32B, mask__49.21_99, POLY_INT_CST [2, 2], 0); ---> dummy length = VF.
The current autovec length operand failed to recognize the VF dummy length.
-march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=scalable -Ofast -fno-schedule-insns -fno-schedule-insns2:
Before this patch:
srli a4,s0,2
addi a4,a4,-3
srli s0,s0,3
vsetvli a5,zero,e64,m1,ta,ma
vid.v v1
vmul.vx v1,v1,a4
addi a4,s0,-2
vadd.vx v1,v1,a4
addi a4,s0,-1
vslide1up.vx v2,v1,a4
vmv.v.x v1,a4
vand.vv v1,v2,v1
vl1re64.v v3,0(t2)
vrgather.vv v2,v3,v1
vmv.v.i v1,0
vmfeq.vv v0,v2,v1
vsetvli zero,s0,e32,mf2,ta,ma ---> s0 = POLY (2,2)
vle32.v v3,0(t3),v0.t
vsetvli a5,zero,e64,m1,ta,ma
vmfne.vv v0,v2,v1
vsetvli zero,zero,e32,mf2,ta,ma
vfwcvt.f.x.v v1,v3
vsetvli zero,zero,e64,m1,ta,ma
vmerge.vvm v1,v1,v2,v0
vslidedown.vx v1,v1,a4
vfmv.f.s fa5,v1
j .L6
After this patch:
srli a4,s0,2
addi a4,a4,-3
srli s0,s0,3
vsetvli a5,zero,e64,m1,ta,ma
vid.v v1
vmul.vx v1,v1,a4
addi a4,s0,-2
vadd.vx v1,v1,a4
addi s0,s0,-1
vslide1up.vx v2,v1,s0
vmv.v.x v1,s0
vand.vv v1,v2,v1
vl1re64.v v3,0(t2)
vrgather.vv v2,v3,v1
vmv.v.i v1,0
vmfeq.vv v0,v2,v1
vle32.v v3,0(t3),v0.t
vmfne.vv v0,v2,v1
vsetvli zero,zero,e32,mf2,ta,ma
vfwcvt.f.x.v v1,v3
vsetvli zero,zero,e64,m1,ta,ma
vmerge.vvm v1,v1,v2,v0
vslidedown.vx v1,v1,s0
vfmv.f.s fa5,v1
j .L6
2 vsetvli insns are reduced.
gcc/ChangeLog:
PR target/110989
* config/riscv/predicates.md: Fix predicate.
gcc/testsuite/ChangeLog:
PR target/110989
* gcc.target/riscv/rvv/autovec/pr110989.c: Add vsetvli assembly check.
|
|
Fixes: ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT")
DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it.
void zd(double *) { *d = 0.0; }
currently:
| fmv.d.x fa5,zero
| fsd fa5,0(a0)
| ret
With patch
| sd zero,0(a0)
| ret
The fix updates predicate const_0_operand() so reg_or_0_operand () now
includes const_double, enabling movdf expander -> riscv_legitimize_move ()
to generate below vs. an intermediate set (reg:DF) const_double:DF
| (insn 6 3 0 2 (set (mem:DF (reg/v/f:DI 134 [ d ])
| (const_double:DF 0.0 [0x0.0p+0]))
This change also enables such insns to be recog() by later passes.
The md pattern "*movdf_hardfloat_rv64" despite already supporting the
needed constraints {"m","G"} mem/const 0.0 was failing to match because
the additional condition check reg_or_0_operand() was failing due to
missing const_double.
This failure to recog() was triggering an ICE when testing the in-flight
f-m-o patches and is how all of this started, but then was deemed to be
an independent optimization of it's own [1].
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html
Its worthwhile to note all the set peices were already there and working
up until my own commit mentioned at top regressed the whole thing.
Ran thru full multilib testsuite and no surprises. There was 1 false
failure due to random string "lw" appearing in lto build assembler output,
which is also fixed here.
gcc/ChangeLog:
PR target/110748
* config/riscv/predicates.md (const_0_operand): Add back
const_double.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr110748-1.c: New Test.
* gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test
patterns to avoid random string matches.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
This patch fully support gather_load/scatter_store:
1. Support single-rgroup on both RV32/RV64.
2. Support indexed element width can be same as or smaller than Pmode.
3. Support VLA SLP with gather/scatter.
4. Fully tested all gather/scatter with LMUL = M1/M2/M4/M8 both VLA and VLS.
5. Fix bug of handling (subreg:SI (const_poly_int:DI))
6. Fix bug on vec_perm which is used by gather/scatter SLP.
All kinds of GATHER/SCATTER are normalized into LEN_MASK_*.
We fully supported these 4 kinds of gather/scatter:
1. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and dummy mask (Full vector).
2. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and real mask.
3. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and dummy mask.
4. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and real mask.
Base on the disscussions with Richards, we don't lower vlse/vsse in RISC-V backend for strided load/store.
Instead, we leave it to the middle-end to handle that.
Regression is pass ok for trunk ?
gcc/ChangeLog:
* config/riscv/autovec.md
(len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>): New pattern.
(len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
(len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
(len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
(len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
(len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
(len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
(len_mask_gather_load<mode><mode>): Ditto.
(len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
(len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
(len_mask_scatter_store<mode><mode>): Ditto.
* config/riscv/predicates.md (const_1_operand): New predicate.
(vector_gs_scale_operand_16): Ditto.
(vector_gs_scale_operand_32): Ditto.
(vector_gs_scale_operand_64): Ditto.
(vector_gs_extension_operand): Ditto.
(vector_gs_scale_operand_16_rv32): Ditto.
(vector_gs_scale_operand_32_rv32): Ditto.
* config/riscv/riscv-protos.h (enum insn_type): Add gather/scatter.
(expand_gather_scatter): New function.
* config/riscv/riscv-v.cc (gen_const_vector_dup): Add gather/scatter.
(emit_vlmax_masked_store_insn): New function.
(emit_nonvlmax_masked_store_insn): Ditto.
(modulo_sel_indices): Ditto.
(expand_vec_perm): Fix SLP for gather/scatter.
(prepare_gather_scatter): New function.
(expand_gather_scatter): Ditto.
* config/riscv/riscv.cc (riscv_legitimize_move): Fix bug of
(subreg:SI (DI CONST_POLY_INT)).
* config/riscv/vector-iterators.md: Add gather/scatter.
* config/riscv/vector.md (vec_duplicate<mode>): Use "@" instead.
(@vec_duplicate<mode>): Ditto.
(@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>):
Fix name.
(@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/rvv.exp: Add gather/scatter tests.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c:
New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c:
New test.
|
|
This patch enable len_mask_{load,store} to support flow-control in RVV auto-vectorization.
Consider this following case:
void
f (int32_t *__restrict a,
int32_t *__restrict b,
int32_t *__restrict cond,
int n)
{
for (int i = 0; i < n; i++)
if (cond[i])
a[i] = b[i];
}
Before this patch:
<source>:9:21: missed: couldn't vectorize loop
<source>:9:21: missed: not vectorized: control flow in loop.
After this patch:
f:
ble a3,zero,.L5
.L3:
vsetvli a5,a3,e32,m1,ta,ma
vle32.v v0,0(a2)
vsetvli a6,zero,e32,m1,ta,ma
slli a4,a5,2
vmsne.vi v0,v0,0
sub a3,a3,a5
vsetvli zero,a5,e32,m1,ta,ma
vle32.v v1,0(a1),v0.t
vse32.v v1,0(a0),v0.t
add a2,a2,a4
add a1,a1,a4
add a0,a0,a4
bne a3,zero,.L3
.L5:
ret
gcc/ChangeLog:
* config/riscv/autovec.md (len_load_<mode>): Remove.
(len_maskload<mode><vm>): Remove.
(len_store_<mode>): New pattern.
(len_maskstore<mode><vm>): New pattern.
* config/riscv/predicates.md (autovec_length_operand): New predicate.
* config/riscv/riscv-protos.h (enum insn_type): New enum.
(expand_load_store): New function.
* config/riscv/riscv-v.cc (emit_vlmax_masked_insn): Ditto.
(emit_nonvlmax_masked_insn): Ditto.
(expand_load_store): Ditto.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_contiguous_store_insn): Add avl_type operand
into pred_store.
* config/riscv/vector.md: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/single_rgroup-2.h: New test.
* gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.c: New test.
* gcc.target/riscv/rvv/autovec/partial/single_rgroup-3.h: New test.
* gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/single_rgroup_run-3.c: New test.
|
|
Notice there is warning in predicates.md:
../../../riscv-gcc/gcc/config/riscv/predicates.md: In function ‘bool arith_operand_or_mode_mask(rtx, machine_mode)’:
../../../riscv-gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
(match_test "INTVAL (op) == GET_MODE_MASK (HImode)
../../../riscv-gcc/gcc/config/riscv/predicates.md:34:20: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
|| INTVAL (op) == GET_MODE_MASK (SImode)"))))
gcc/ChangeLog:
* config/riscv/predicates.md: Change INTVAL into UINTVAL.
|
|
Notice there is warning in predicates.md:
../../../riscv-gcc/gcc/config/riscv/predicates.md: In function ‘bool arith_operand_or_mode_mask(rtx, machine_mode)’:
../../../riscv-gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
(match_test "INTVAL (op) == GET_MODE_MASK (HImode)
../../../riscv-gcc/gcc/config/riscv/predicates.md:34:20: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
|| INTVAL (op) == GET_MODE_MASK (SImode)"))))
gcc/ChangeLog:
* config/riscv/predicates.md: Change INTVAL into UINTVAL.
|
|
This patch supports vector permutation for VLS only by vec_perm pattern.
We will support TARGET_VECTORIZE_VEC_PERM_CONST to support VLA permutation
in the future.
Fixed following comments from Robin.
gcc/ChangeLog:
* config/riscv/autovec.md (vec_perm<mode>): New pattern.
* config/riscv/predicates.md (vector_perm_operand): New predicate.
* config/riscv/riscv-protos.h (enum insn_type): New enum.
(expand_vec_perm): New function.
* config/riscv/riscv-v.cc (const_vec_all_in_range_p): Ditto.
(gen_const_vector_dup): Ditto.
(emit_vlmax_gather_insn): Ditto.
(emit_vlmax_masked_gather_mu_insn): Ditto.
(expand_vec_perm): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm.h: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-7.c: New test.
|
|
In the case where the target supports extension instructions,
it is preferable to use that instead of doing the same in other ways.
For the following case
void foo (unsigned long a, unsigned long* ptr) {
ptr[0] = a & 0xffffffffUL;
ptr[1] &= 0xffffffffUL;
}
GCC generates
foo:
li a5,-1
srli a5,a5,32
and a0,a0,a5
sd a0,0(a1)
ld a4,8(a1)
and a5,a4,a5
sd a5,8(a1)
ret
but it will be profitable to generate this one
foo:
zext.w a0,a0
sd a0,0(a1)
lwu a5,8(a1)
sd a5,8(a1)
ret
This patch fixes mentioned issue.
It supports HI -> DI, HI->SI and SI -> DI extensions.
gcc/ChangeLog:
* config/riscv/riscv.md (and<mode>3): New expander.
(*and<mode>3) New pattern.
* config/riscv/predicates.md (arith_operand_or_mode_mask): New
predicate.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/and-extend-1.c: New test
* gcc.target/riscv/and-extend-2.c: New test
|
|
Rotate instructions do not need to mask the third operand.
For example, RV64 the following code:
unsigned long foo1(unsigned long rs1, unsigned long rs2)
{
long shamt = rs2 & (64 - 1);
return (rs1 << shamt) | (rs1 >> ((64 - shamt) & (64 - 1)));
}
Compiles to:
foo1:
andi a1,a1,63
rol a0,a0,a1
ret
This patch removes unnecessary masking.
Besides, I have merged masking insns for shifts that were written before.
gcc/ChangeLog:
* config/riscv/riscv.md (*<optab><GPR:mode>3_mask): New pattern,
combined from ...
(*<optab>si3_mask, *<optab>di3_mask): Here.
(*<optab>si3_mask_1, *<optab>di3_mask_1): And here.
* config/riscv/bitmanip.md (*<bitmanip_optab><GPR:mode>3_mask): New
pattern.
(*<bitmanip_optab>si3_sext_mask): Likewise.
* config/riscv/iterators.md (shiftm1): Use const_si_mask_operand
and const_di_mask_operand.
(bitmanip_rotate): New iterator.
(bitmanip_optab): Add rotates.
* config/riscv/predicates.md (const_si_mask_operand): Renamed
from const31_operand. Generalize to handle more mask constants.
(const_di_mask_operand): Similarly.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/shift-and-2.c: Fixed test
* gcc.target/riscv/zbb-rol-ror-01.c: New test
* gcc.target/riscv/zbb-rol-ror-02.c: New test
* gcc.target/riscv/zbb-rol-ror-03.c: New test
* gcc.target/riscv/zbb-rol-ror-04.c: New test
* gcc.target/riscv/zbb-rol-ror-05.c: New test
* gcc.target/riscv/zbb-rol-ror-06.c: New test
* gcc.target/riscv/zbb-rol-ror-07.c: New test
|
|
I have noticed that in the case when we try to clear two bits through a
small constant,
and ZBS is enabled then GCC split it into two "andi" instructions.
For example for the following C code:
int foo(int a) {
return a & ~ 0x101;
}
GCC generates the following:
foo:
andi a0,a0,-2
andi a0,a0,-257
ret
but should be this one:
foo:
andi a0,a0,-258
ret
This patch solves the mentioned issue.
gcc/ChangeLog
* config/riscv/bitmanip.md: Updated predicates of bclri<mode>_nottwobits
and bclridisi_nottwobits patterns.
* config/riscv/predicates.md: (not_uimm_extra_bit_or_nottwobits): Adjust
predicate to avoid splitting arith constants.
(const_nottwobits_not_arith_operand): New predicate.
gcc/testsuite
* gcc.target/riscv/zbs-bclri-nottwobits.c: New test.
|
|
there is no need to split an xori/ori with an small constant. take the test
case `int foo(int idx) { return idx|3; }` as an example,
rv64im_zba generates:
ori a0,a0,3
ret
but, rv64im_zba_zbs generates:
ori a0,a0,1
ori a0,a0,2
ret
with this change, insn `ori r2,r1,3` will not be splitted in zbs.
gcc/
* config/riscv/predicates.md (uimm_extra_bit_or_twobits): Adjust
predicate to avoid splitting arith constants.
gcc/testsuite
* gcc.target/riscv/zbs-extra-bit-or-twobits.c: New test.
|
|
Co-authored-by: kito-cheng <kito.cheng@sifive.com>
gcc/ChangeLog:
* config/riscv/predicates.md (vector_any_register_operand): New predicate.
* config/riscv/riscv-c.cc (riscv_check_builtin_call): New function.
(riscv_register_pragmas): Add builtin function check call.
* config/riscv/riscv-protos.h (RVV_VUNDEF): Adapt macro.
(check_builtin_call): New function.
* config/riscv/riscv-vector-builtins-bases.cc (class vundefined): New class.
(class vreinterpret): Ditto.
(class vlmul_ext): Ditto.
(class vlmul_trunc): Ditto.
(class vset): Ditto.
(class vget): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vluxei8): Change name.
(vluxei16): Ditto.
(vluxei32): Ditto.
(vluxei64): Ditto.
(vloxei8): Ditto.
(vloxei16): Ditto.
(vloxei32): Ditto.
(vloxei64): Ditto.
(vsuxei8): Ditto.
(vsuxei16): Ditto.
(vsuxei32): Ditto.
(vsuxei64): Ditto.
(vsoxei8): Ditto.
(vsoxei16): Ditto.
(vsoxei32): Ditto.
(vsoxei64): Ditto.
(vundefined): Add new intrinsic.
(vreinterpret): Ditto.
(vlmul_ext): Ditto.
(vlmul_trunc): Ditto.
(vset): Ditto.
(vget): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct return_mask_def): New class.
(struct narrow_alu_def): Ditto.
(struct reduc_alu_def): Ditto.
(struct vundefined_def): Ditto.
(struct misc_def): Ditto.
(struct vset_def): Ditto.
(struct vget_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_EEW8_INTERPRET_OPS): New def.
(DEF_RVV_EEW16_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW32_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW64_INTERPRET_OPS): Ditto.
(DEF_RVV_X2_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X4_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X8_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X16_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X32_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X64_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_LMUL1_OPS): Ditto.
(DEF_RVV_LMUL2_OPS): Ditto.
(DEF_RVV_LMUL4_OPS): Ditto.
(vint16mf4_t): Ditto.
(vint16mf2_t): Ditto.
(vint16m1_t): Ditto.
(vint16m2_t): Ditto.
(vint16m4_t): Ditto.
(vint16m8_t): Ditto.
(vint32mf2_t): Ditto.
(vint32m1_t): Ditto.
(vint32m2_t): Ditto.
(vint32m4_t): Ditto.
(vint32m8_t): Ditto.
(vint64m1_t): Ditto.
(vint64m2_t): Ditto.
(vint64m4_t): Ditto.
(vint64m8_t): Ditto.
(vuint16mf4_t): Ditto.
(vuint16mf2_t): Ditto.
(vuint16m1_t): Ditto.
(vuint16m2_t): Ditto.
(vuint16m4_t): Ditto.
(vuint16m8_t): Ditto.
(vuint32mf2_t): Ditto.
(vuint32m1_t): Ditto.
(vuint32m2_t): Ditto.
(vuint32m4_t): Ditto.
(vuint32m8_t): Ditto.
(vuint64m1_t): Ditto.
(vuint64m2_t): Ditto.
(vuint64m4_t): Ditto.
(vuint64m8_t): Ditto.
(vint8mf4_t): Ditto.
(vint8mf2_t): Ditto.
(vint8m1_t): Ditto.
(vint8m2_t): Ditto.
(vint8m4_t): Ditto.
(vint8m8_t): Ditto.
(vuint8mf4_t): Ditto.
(vuint8mf2_t): Ditto.
(vuint8m1_t): Ditto.
(vuint8m2_t): Ditto.
(vuint8m4_t): Ditto.
(vuint8m8_t): Ditto.
(vint8mf8_t): Ditto.
(vuint8mf8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat64m1_t): Ditto.
(vfloat64m2_t): Ditto.
(vfloat64m4_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_TYPE): Ditto.
(DEF_RVV_EEW8_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW16_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW32_INTERPRET_OPS): Ditto.
(DEF_RVV_EEW64_INTERPRET_OPS): Ditto.
(DEF_RVV_X2_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X4_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X8_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X16_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X32_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_X64_VLMUL_EXT_OPS): Ditto.
(DEF_RVV_LMUL1_OPS): Ditto.
(DEF_RVV_LMUL2_OPS): Ditto.
(DEF_RVV_LMUL4_OPS): Ditto.
(DEF_RVV_TYPE_INDEX): Ditto.
(required_extensions_p): Adapt for new intrinsic support/
(get_required_extensions): New function.
(check_required_extensions): Ditto.
(unsigned_base_type_p): Remove.
(rvv_arg_type_info::get_scalar_ptr_type): New function.
(get_mode_for_bitsize): Remove.
(rvv_arg_type_info::get_scalar_const_ptr_type): New function.
(rvv_arg_type_info::get_base_vector_type): Ditto.
(rvv_arg_type_info::get_function_type_index): Ditto.
(DEF_RVV_BASE_TYPE): New def.
(function_builder::apply_predication): New class.
(function_expander::mask_mode): Ditto.
(function_checker::function_checker): Ditto.
(function_checker::report_non_ice): Ditto.
(function_checker::report_out_of_range): Ditto.
(function_checker::require_immediate): Ditto.
(function_checker::require_immediate_range): Ditto.
(function_checker::check): Ditto.
(check_builtin_call): Ditto.
* config/riscv/riscv-vector-builtins.def (DEF_RVV_TYPE): New def.
(DEF_RVV_BASE_TYPE): Ditto.
(DEF_RVV_TYPE_INDEX): Ditto.
(vbool64_t): Ditto.
(vbool32_t): Ditto.
(vbool16_t): Ditto.
(vbool8_t): Ditto.
(vbool4_t): Ditto.
(vbool2_t): Ditto.
(vbool1_t): Ditto.
(vuint8mf8_t): Ditto.
(vuint8mf4_t): Ditto.
(vuint8mf2_t): Ditto.
(vuint8m1_t): Ditto.
(vuint8m2_t): Ditto.
(vint8m4_t): Ditto.
(vuint8m4_t): Ditto.
(vint8m8_t): Ditto.
(vuint8m8_t): Ditto.
(vint16mf4_t): Ditto.
(vuint16mf2_t): Ditto.
(vuint16m1_t): Ditto.
(vuint16m2_t): Ditto.
(vuint16m4_t): Ditto.
(vuint16m8_t): Ditto.
(vint32mf2_t): Ditto.
(vuint32m1_t): Ditto.
(vuint32m2_t): Ditto.
(vuint32m4_t): Ditto.
(vuint32m8_t): Ditto.
(vuint64m1_t): Ditto.
(vuint64m2_t): Ditto.
(vuint64m4_t): Ditto.
(vuint64m8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
(vfloat64m1_t): Ditto.
(vfloat64m4_t): Ditto.
(vector): Move it def.
(scalar): Ditto.
(mask): Ditto.
(signed_vector): Ditto.
(unsigned_vector): Ditto.
(unsigned_scalar): Ditto.
(vector_ptr): Ditto.
(scalar_ptr): Ditto.
(scalar_const_ptr): Ditto.
(void): Ditto.
(size): Ditto.
(ptrdiff): Ditto.
(unsigned_long): Ditto.
(long): Ditto.
(eew8_index): Ditto.
(eew16_index): Ditto.
(eew32_index): Ditto.
(eew64_index): Ditto.
(shift_vector): Ditto.
(double_trunc_vector): Ditto.
(quad_trunc_vector): Ditto.
(oct_trunc_vector): Ditto.
(double_trunc_scalar): Ditto.
(double_trunc_signed_vector): Ditto.
(double_trunc_unsigned_vector): Ditto.
(double_trunc_unsigned_scalar): Ditto.
(double_trunc_float_vector): Ditto.
(float_vector): Ditto.
(lmul1_vector): Ditto.
(widen_lmul1_vector): Ditto.
(eew8_interpret): Ditto.
(eew16_interpret): Ditto.
(eew32_interpret): Ditto.
(eew64_interpret): Ditto.
(vlmul_ext_x2): Ditto.
(vlmul_ext_x4): Ditto.
(vlmul_ext_x8): Ditto.
(vlmul_ext_x16): Ditto.
(vlmul_ext_x32): Ditto.
(vlmul_ext_x64): Ditto.
* config/riscv/riscv-vector-builtins.h (DEF_RVV_BASE_TYPE): New def.
(struct function_type_info): New function.
(struct rvv_arg_type_info): Ditto.
(class function_checker): New class.
(rvv_arg_type_info::get_scalar_type): New function.
(rvv_arg_type_info::get_vector_type): Ditto.
(function_expander::ret_mode): New function.
(function_checker::arg_mode): Ditto.
(function_checker::ret_mode): Ditto.
* config/riscv/t-riscv: Add generator.
* config/riscv/vector-iterators.md: New iterators.
* config/riscv/vector.md (vundefined<mode>): New pattern.
(@vundefined<mode>): Ditto.
(@vreinterpret<mode>): Ditto.
(@vlmul_extx2<mode>): Ditto.
(@vlmul_extx4<mode>): Ditto.
(@vlmul_extx8<mode>): Ditto.
(@vlmul_extx16<mode>): Ditto.
(@vlmul_extx32<mode>): Ditto.
(@vlmul_extx64<mode>): Ditto.
(*vlmul_extx2<mode>): Ditto.
(*vlmul_extx4<mode>): Ditto.
(*vlmul_extx8<mode>): Ditto.
(*vlmul_extx16<mode>): Ditto.
(*vlmul_extx32<mode>): Ditto.
(*vlmul_extx64<mode>): Ditto.
* config/riscv/genrvv-type-indexer.cc: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vlmul_v.c: New test.
Co-authored-by: kito-cheng <kito.cheng@sifive.com>
|
|
gcc/ChangeLog:
* config/riscv/constraints.md (Wb1): New constraint.
* config/riscv/predicates.md
(vector_least_significant_set_mask_operand): New predicate.
(vector_broadcast_mask_operand): Ditto.
* config/riscv/riscv-protos.h (enum vlmul_type): Adjust.
(gen_scalar_move_mask): New function.
* config/riscv/riscv-v.cc (gen_scalar_move_mask): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc (class vmv): New class.
(class vmv_s): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vmv_x): Ditto.
(vmv_s): Ditto.
(vfmv_f): Ditto.
(vfmv_s): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct scalar_move_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (function_expander::mask_mode): Ditto.
(function_expander::use_exact_insn): New function.
(function_expander::use_contiguous_load_insn): New function.
(function_expander::use_contiguous_store_insn): New function.
(function_expander::use_ternop_insn): New function.
(function_expander::use_widen_ternop_insn): New function.
(function_expander::use_scalar_move_insn): New function.
* config/riscv/riscv-vector-builtins.def (s): New operand suffix.
* config/riscv/riscv-vector-builtins.h
(function_expander::add_scalar_move_mask_operand): New class.
* config/riscv/riscv-vsetvl.cc (ignore_vlmul_insn_p): New function.
(scalar_move_insn_p): Ditto.
(has_vsetvl_killed_avl_p): Ditto.
(anticipatable_occurrence_p): Ditto.
(insert_vsetvl): Ditto.
(get_vl_vtype_info): Ditto.
(calculate_sew): Ditto.
(calculate_vlmul): Ditto.
(incompatible_avl_p): Ditto.
(different_sew_p): Ditto.
(different_lmul_p): Ditto.
(different_ratio_p): Ditto.
(different_tail_policy_p): Ditto.
(different_mask_policy_p): Ditto.
(possible_zero_avl_p): Ditto.
(first_ratio_invalid_for_second_sew_p): Ditto.
(first_ratio_invalid_for_second_lmul_p): Ditto.
(second_ratio_invalid_for_first_sew_p): Ditto.
(second_ratio_invalid_for_first_lmul_p): Ditto.
(second_sew_less_than_first_sew_p): Ditto.
(first_sew_less_than_second_sew_p): Ditto.
(compare_lmul): Ditto.
(second_lmul_less_than_first_lmul_p): Ditto.
(first_lmul_less_than_second_lmul_p): Ditto.
(first_ratio_less_than_second_ratio_p): Ditto.
(second_ratio_less_than_first_ratio_p): Ditto.
(DEF_INCOMPATIBLE_COND): Ditto.
(greatest_sew): Ditto.
(first_sew): Ditto.
(second_sew): Ditto.
(first_vlmul): Ditto.
(second_vlmul): Ditto.
(first_ratio): Ditto.
(second_ratio): Ditto.
(vlmul_for_first_sew_second_ratio): Ditto.
(ratio_for_second_sew_first_vlmul): Ditto.
(DEF_SEW_LMUL_FUSE_RULE): Ditto.
(always_unavailable): Ditto.
(avl_unavailable_p): Ditto.
(sew_unavailable_p): Ditto.
(lmul_unavailable_p): Ditto.
(ge_sew_unavailable_p): Ditto.
(ge_sew_lmul_unavailable_p): Ditto.
(ge_sew_ratio_unavailable_p): Ditto.
(DEF_UNAVAILABLE_COND): Ditto.
(same_sew_lmul_demand_p): Ditto.
(propagate_avl_across_demands_p): Ditto.
(reg_available_p): Ditto.
(avl_info::has_non_zero_avl): Ditto.
(vl_vtype_info::has_non_zero_avl): Ditto.
(vector_insn_info::operator>=): Refactor.
(vector_insn_info::parse_insn): Adjust for scalar move.
(vector_insn_info::demand_vl_vtype): Remove.
(vector_insn_info::compatible_p): New function.
(vector_insn_info::compatible_avl_p): Ditto.
(vector_insn_info::compatible_vtype_p): Ditto.
(vector_insn_info::available_p): Ditto.
(vector_insn_info::merge): Ditto.
(vector_insn_info::fuse_avl): Ditto.
(vector_insn_info::fuse_sew_lmul): Ditto.
(vector_insn_info::fuse_tail_policy): Ditto.
(vector_insn_info::fuse_mask_policy): Ditto.
(vector_insn_info::dump): Ditto.
(vector_infos_manager::release): Ditto.
(pass_vsetvl::compute_local_backward_infos): Adjust for scalar move support.
(pass_vsetvl::get_backward_fusion_type): Adjust for scalar move support.
(pass_vsetvl::hard_empty_block_p): Ditto.
(pass_vsetvl::backward_demand_fusion): Ditto.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::refine_vsetvls): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
(pass_vsetvl::propagate_avl): Ditto.
* config/riscv/riscv-vsetvl.h (enum demand_status): New class.
(struct demands_pair): Ditto.
(struct demands_cond): Ditto.
(struct demands_fuse_rule): Ditto.
* config/riscv/vector-iterators.md: New iterator.
* config/riscv/vector.md (@pred_broadcast<mode>): New pattern.
(*pred_broadcast<mode>): Ditto.
(*pred_broadcast<mode>_extended_scalar): Ditto.
(@pred_extract_first<mode>): Ditto.
(*pred_extract_first<mode>): Ditto.
(@pred_extract_first_trunc<mode>): Ditto.
* config/riscv/riscv-vsetvl.def: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-10.c: Adjust test.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-11.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-12.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-18.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-9.c: Ditto.
|
|
gcc/ChangeLog:
* config/riscv/predicates.md: Refine codes.
* config/riscv/riscv-protos.h (RVV_VUNDEF): New macro.
* config/riscv/riscv-v.cc: Refine codes.
* config/riscv/riscv-vector-builtins-bases.cc (enum ternop_type): New
enum.
(class imac): New class.
(enum widen_ternop_type): New enum.
(class iwmac): New class.
(BASE): New class.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vmacc): Ditto.
(vnmsac): Ditto.
(vmadd): Ditto.
(vnmsub): Ditto.
(vwmacc): Ditto.
(vwmaccu): Ditto.
(vwmaccsu): Ditto.
(vwmaccus): Ditto.
* config/riscv/riscv-vector-builtins.cc
(function_builder::apply_predication): Adjust for multiply-add support.
(function_expander::add_vundef_operand): Refine codes.
(function_expander::use_ternop_insn): New function.
(function_expander::use_widen_ternop_insn): Ditto.
* config/riscv/riscv-vector-builtins.h: New function.
* config/riscv/vector.md (@pred_mul_<optab><mode>): New pattern.
(pred_mul_<optab><mode>_undef_merge): Ditto.
(*pred_<madd_nmsub><mode>): Ditto.
(*pred_<macc_nmsac><mode>): Ditto.
(*pred_mul_<optab><mode>): Ditto.
(@pred_mul_<optab><mode>_scalar): Ditto.
(*pred_mul_<optab><mode>_undef_merge_scalar): Ditto.
(*pred_<madd_nmsub><mode>_scalar): Ditto.
(*pred_<macc_nmsac><mode>_scalar): Ditto.
(*pred_mul_<optab><mode>_scalar): Ditto.
(*pred_mul_<optab><mode>_undef_merge_extended_scalar): Ditto.
(*pred_<madd_nmsub><mode>_extended_scalar): Ditto.
(*pred_<macc_nmsac><mode>_extended_scalar): Ditto.
(*pred_mul_<optab><mode>_extended_scalar): Ditto.
(@pred_widen_mul_plus<su><mode>): Ditto.
(@pred_widen_mul_plus<su><mode>_scalar): Ditto.
(@pred_widen_mul_plussu<mode>): Ditto.
(@pred_widen_mul_plussu<mode>_scalar): Ditto.
(@pred_widen_mul_plusus<mode>_scalar): Ditto.
|
|
gcc/ChangeLog:
* config/riscv/predicates.md (vector_mask_operand): Refine the codes.
(vector_all_trues_mask_operand): New predicate.
(vector_undef_operand): New predicate.
(ltge_operator): New predicate.
(comparison_except_ltge_operator): New predicate.
(comparison_except_eqge_operator): New predicate.
(ge_operator): New predicate.
* config/riscv/riscv-v.cc (has_vi_variant_p): Add compare support.
* config/riscv/riscv-vector-builtins-bases.cc (class icmp): New class.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vmseq): Ditto.
(vmsne): Ditto.
(vmslt): Ditto.
(vmsgt): Ditto.
(vmsle): Ditto.
(vmsge): Ditto.
(vmsltu): Ditto.
(vmsgtu): Ditto.
(vmsleu): Ditto.
(vmsgeu): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc
(struct return_mask_def): Adjust for compare support.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_compare_insn): New function.
* config/riscv/riscv-vector-builtins.h
(function_expander::add_integer_operand): Ditto.
* config/riscv/riscv.cc (riscv_print_operand): Add compare support.
* config/riscv/riscv.md: Add vector min/max attributes.
* config/riscv/vector-iterators.md (xnor): New iterator.
* config/riscv/vector.md (@pred_cmp<mode>): New pattern.
(*pred_cmp<mode>): Ditto.
(*pred_cmp<mode>_narrow): Ditto.
(@pred_ltge<mode>): Ditto.
(*pred_ltge<mode>): Ditto.
(*pred_ltge<mode>_narrow): Ditto.
(@pred_cmp<mode>_scalar): Ditto.
(*pred_cmp<mode>_scalar): Ditto.
(*pred_cmp<mode>_scalar_narrow): Ditto.
(@pred_eqne<mode>_scalar): Ditto.
(*pred_eqne<mode>_scalar): Ditto.
(*pred_eqne<mode>_scalar_narrow): Ditto.
(*pred_cmp<mode>_extended_scalar): Ditto.
(*pred_cmp<mode>_extended_scalar_narrow): Ditto.
(*pred_eqne<mode>_extended_scalar): Ditto.
(*pred_eqne<mode>_extended_scalar_narrow): Ditto.
(@pred_ge<mode>_scalar): Ditto.
(@pred_<optab><mode>): Ditto.
(@pred_n<optab><mode>): Ditto.
(@pred_<optab>n<mode>): Ditto.
(@pred_not<mode>): Ditto.
|
|
gcc/ChangeLog:
* config/riscv/constraints.md (Wbr): Remove unused constraint.
* config/riscv/predicates.md: Fix move operand predicate.
* config/riscv/riscv-vector-builtins-bases.cc (class vnshift): New class.
(class vncvt_x): Ditto.
(class vmerge): Ditto.
(class vmv_v): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vsra): Ditto.
(vsrl): Ditto.
(vnsrl): Ditto.
(vnsra): Ditto.
(vncvt_x): Ditto.
(vmerge): Ditto.
(vmv_v): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct narrow_alu_def): Ditto.
(struct move_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_WEXTI_OPS): New variable.
(DEF_RVV_WEXTU_OPS): Ditto
* config/riscv/riscv-vector-builtins.def (x_x_w): Fix type for suffix.
(v_v): Ditto.
(v_x): Ditto.
(x_w): Ditto.
(x): Ditto.
* config/riscv/riscv.cc (riscv_print_operand): Refine ASM printting rule.
* config/riscv/vector-iterators.md (nmsac):New iterator.
(nmsub): New iterator.
* config/riscv/vector.md (@pred_merge<mode>): New pattern.
(@pred_merge<mode>_scalar): New pattern.
(*pred_merge<mode>_scalar): New pattern.
(*pred_merge<mode>_extended_scalar): New pattern.
(@pred_narrow_<optab><mode>): New pattern.
(@pred_narrow_<optab><mode>_scalar): New pattern.
(@pred_trunc<mode>): New pattern.
|
|
gcc/ChangeLog:
* config/riscv/constraints.md (Wdm): Adjust constraint.
(Wbr): New constraint.
* config/riscv/predicates.md (reg_or_int_operand): New predicate.
* config/riscv/riscv-protos.h (emit_pred_op): Remove function.
(emit_vlmax_op): New function.
(emit_nonvlmax_op): Ditto.
(simm32_p): Ditto.
(neg_simm5_p): Ditto.
(has_vi_variant_p): Ditto.
* config/riscv/riscv-v.cc (emit_pred_op): Adjust function.
(emit_vlmax_op): New function.
(emit_nonvlmax_op): Ditto.
(expand_const_vector): Adjust function.
(legitimize_move): Ditto.
(simm32_p): New function.
(simm5_p): Ditto.
(neg_simm5_p): Ditto.
(has_vi_variant_p): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc (class vrsub): New class.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vmin): Remove
unsigned cases.
(vmax): Ditto.
(vminu): Remove signed cases.
(vmaxu): Ditto.
(vdiv): Remove unsigned cases.
(vrem): Ditto.
(vdivu): Remove signed cases.
(vremu): Ditto.
(vadd): Adjust.
(vsub): Ditto.
(vrsub): New class.
(vand): Adjust.
(vor): Ditto.
(vxor): Ditto.
(vmul): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_U_OPS): New macro.
* config/riscv/riscv.h: change VL/VTYPE as fixed reg.
* config/riscv/vector-iterators.md: New iterators.
* config/riscv/vector.md (@pred_broadcast<mode>): Adjust pattern for vx
support.
(@pred_<optab><mode>_scalar): New pattern.
(@pred_sub<mode>_reverse_scalar): Ditto.
(*pred_<optab><mode>_scalar): Ditto.
(*pred_<optab><mode>_extended_scalar): Ditto.
(*pred_sub<mode>_reverse_scalar): Ditto.
(*pred_sub<mode>_extended_reverse_scalar): Ditto.
|
|
gcc/ChangeLog:
* config/riscv/predicates.md (pmode_reg_or_uimm5_operand): New predicate.
* config/riscv/riscv-vector-builtins-bases.cc: New class.
* config/riscv/riscv-vector-builtins-functions.def (vsll): Ditto.
(vsra): Ditto.
(vsrl): Ditto.
* config/riscv/riscv-vector-builtins.cc: Ditto.
* config/riscv/vector.md (@pred_<optab><mode>_scalar): New pattern.
|
|
Add vector intrinsic for integer binary operation, but only vector to
vector form.
gcc/ChangeLog:
* config/riscv/constraints.md (vj): New.
(vk): Ditto
* config/riscv/iterators.md: Add more opcode.
* config/riscv/predicates.md (vector_arith_operand): New.
(vector_neg_arith_operand): New.
(vector_shift_operand): New.
* config/riscv/riscv-vector-builtins-bases.cc (class binop): New.
* config/riscv/riscv-vector-builtins-bases.h: (vadd): New.
(vsub): Ditto.
(vand): Ditto.
(vor): Ditto.
(vxor): Ditto.
(vsll): Ditto.
(vsra): Ditto.
(vsrl): Ditto.
(vmin): Ditto.
(vmax): Ditto.
(vminu): Ditto.
(vmaxu): Ditto.
(vmul): Ditto.
(vdiv): Ditto.
(vrem): Ditto.
(vdivu): Ditto.
(vremu): Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vadd): New.
(vsub): Ditto.
(vand): Ditto.
(vor): Ditto.
(vxor): Ditto.
(vsll): Ditto.
(vsra): Ditto.
(vsrl): Ditto.
(vmin): Ditto.
(vmax): Ditto.
(vminu): Ditto.
(vmaxu): Ditto.
(vmul): Ditto.
(vdiv): Ditto.
(vrem): Ditto.
(vdivu): Ditto.
(vremu): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct binop_def): New.
* config/riscv/riscv-vector-builtins-shapes.h (binop): New.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_I_OPS): New.
(DEF_RVV_U_OPS): New.
(rvv_arg_type_info::get_base_vector_type): Handle
RVV_BASE_shift_vector.
(rvv_arg_type_info::get_tree_type): Ditto.
* config/riscv/riscv-vector-builtins.h (enum rvv_base_type): Add
RVV_BASE_shift_vector.
* config/riscv/riscv.cc (riscv_print_operand): Handle 'V'.
* config/riscv/vector-iterators.md: Handle more opcode.
* config/riscv/vector.md (@pred_<optab><mode>): New.
|
|
gcc/ChangeLog:
* config/riscv/predicates.md (pmode_reg_or_0_operand): New predicate.
* config/riscv/riscv-vector-builtins-bases.cc (class loadstore):
Support vlse/vsse.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vlse): New class.
(vsse): New class.
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_contiguous_load_insn): Support vlse/vsse.
* config/riscv/vector.md (@pred_strided_load<mode>): New md pattern.
(@pred_strided_store<mode>): Ditto.
|
|
|
|
gcc/ChangeLog:
* config/riscv/constraints.md (Wdm): New constraint.
* config/riscv/predicates.md (direct_broadcast_operand): New predicate.
* config/riscv/riscv-protos.h (RVV_VLMAX): New macro.
(emit_pred_op): Refine function.
* config/riscv/riscv-selftests.cc (run_const_vector_selftests): New function.
(run_broadcast_selftests): Ditto.
(BROADCAST_TEST): New tests.
(riscv_run_selftests): More tests.
* config/riscv/riscv-v.cc (emit_pred_move): Refine function.
(emit_vlmax_vsetvl): Ditto.
(emit_pred_op): Ditto.
(expand_const_vector): New function.
(legitimize_move): Add constant vector support.
* config/riscv/riscv.cc (riscv_print_operand): New asm print rule for const vector.
* config/riscv/riscv.h (X0_REGNUM): New macro.
* config/riscv/vector-iterators.md: New attribute.
* config/riscv/vector.md (vec_duplicate<mode>): New pattern.
(@pred_broadcast<mode>): New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/dup-1.c: New test.
* gcc.target/riscv/rvv/base/dup-2.c: New test.
|
|
Use Zbs when generating a sequence for
"if ((a & twobits) == singlebit) ..."
that can be expressed as
bexti + bexti + andn.
gcc/ChangeLog:
* config/riscv/bitmanip.md
(*branch<X:mode>_mask_twobits_equals_singlebit):
Handle "if ((a & T) == C)" using Zbs, when T has 2 bits set and C
has one of these tow bits set.
* config/riscv/predicates.md (const_twobits_not_arith_operand):
New predicate.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbs-if_then_else-01.c: New test.
|