Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch is to handle the differences in instruction generation
between Vector and XTheadVector. In this version, we only support
partial xtheadvector instructions that leverage directly from current
RVV1.0 with simple adding "th." prefix. For different name xtheadvector
instructions but share same patterns as RVV1.0 instructions, we will
use ASM targethook to rewrite the whole string of the instructions in
the following patches.
For some vector patterns that cannot be avoided, we use
"!TARGET_XTHEADVECTOR" to disable them in vector.md in order
not to generate instructions that xtheadvector does not support,
like vmv1r.
gcc/ChangeLog:
* config.gcc: Add files for XTheadVector intrinsics.
* config/riscv/autovec.md: Guard XTheadVector.
* config/riscv/predicates.md: Disable immediate vl
for XTheadVector.
* config/riscv/riscv-c.cc (riscv_pragma_intrinsic):
Add pragma for XTheadVector.
* config/riscv/riscv-string.cc (riscv_expand_block_move):
Guard XTheadVector.
* config/riscv/riscv-v.cc (vls_mode_valid_p):
Avoid autovec.
* config/riscv/riscv-vector-builtins-bases.cc:
Do not normalize vsetvl instructions for XTheadVector.
* config/riscv/riscv-vector-builtins-shapes.cc (check_type):
New check type function.
(build_one): Adjust for XTheadVector.
* config/riscv/riscv-vector-switch.def (ENTRY):
Disable fractional mode for the XTheadVector extension.
(TUPLE_ENTRY): Likewise.
* config/riscv/riscv.cc (riscv_v_adjust_bytesize):
Guard XTheadVector.
(riscv_preferred_simd_mode): Likewsie.
(riscv_autovectorize_vector_modes): Likewise.
(riscv_vector_mode_supported_any_target_p): Likewise.
(TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P): Likewise.
* config/riscv/thead.cc (th_asm_output_opcode):
Rewrite vsetvl instructions.
* config/riscv/vector.md:
Include thead-vector.md and change fractional LMUL
into 1 for vbool.
* config/riscv/riscv_th_vector.h: New file.
* config/riscv/thead-vector.md: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pragma-1.c: Add XTheadVector.
* gcc.target/riscv/rvv/base/abi-1.c: Exclude XTheadVector.
* lib/target-supports.exp: Add target for XTheadVector.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
|
|
This patch adds vectorized strcmp and strncmp implementations and
tests. Similar to strlen, expansion is still guarded by
-minline-str(n)cmp.
gcc/ChangeLog:
PR target/112109
* config/riscv/riscv-protos.h (expand_strcmp): Declare.
* config/riscv/riscv-string.cc (riscv_expand_strcmp): Add
strategy handling and delegation to scalar and vector expanders.
(expand_strcmp): Vectorized implementation.
* config/riscv/riscv.md: Add TARGET_VECTOR to strcmp and strncmp
expander.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strcmp.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strncmp-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strncmp.c: New test.
|
|
This patch implements a vectorized strlen by re-using and slightly
adjusting the rawmemchr implementation. Rawmemchr returns the address
of the needle while strlen returns the difference between needle address
and start address.
As before, strlen expansion is guarded by -minline-strlen.
While testing with -minline-strlen I encountered a vsetvl problem in
memcpy-chk.c where we didn't insert a vsetvl at the proper spot (after
a setjmp). This needs to be fixed separately and I figured I'd post
this patch as-is.
gcc/ChangeLog:
PR target/112109
* config/riscv/riscv-protos.h (expand_rawmemchr): Add strlen
parameter.
* config/riscv/riscv-string.cc (riscv_expand_strlen): Call
rawmemchr.
(expand_rawmemchr): Add strlen handling.
* config/riscv/riscv.md: Add TARGET_VECTOR to strlen expander.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/builtin/strlen-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strlen.c: New test.
|
|
This fixes a bug in the rawmemchr implementation by incrementing the
source address by vl * element_size instead of just vl.
This is normally harmless as we will just scan the same region more than
once but, in combination with an older qemu version, will lead to
an execution failure in SPEC2017.
gcc/ChangeLog:
* config/riscv/riscv-string.cc (expand_rawmemchr): Increment
source address by vl * element_size.
|
|
In preparation for the vectorized strlen and strcmp support this NFC
patch unifies the stringop strategy handling a bit. The "auto"
strategy now is a combination of scalar and vector and an expander
should try the strategies in their preferred order.
For the block_move expander this patch does just that.
gcc/ChangeLog:
* config/riscv/riscv-opts.h (enum riscv_stringop_strategy_enum):
Rename...
(enum stringop_strategy_enum): ... to this.
* config/riscv/riscv-string.cc (riscv_expand_block_move): New
wrapper expander handling the strategies and delegation.
(riscv_expand_block_move_scalar): Rename function and make
static.
(expand_block_move): Remove strategy handling.
* config/riscv/riscv.md: Call expander wrapper.
* config/riscv/riscv.opt: Rename.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/cpymem-strategy-1.c: Change to
-mstringop-strategy.
* gcc.target/riscv/rvv/base/cpymem-strategy-2.c: Ditto.
* gcc.target/riscv/rvv/base/cpymem-strategy-3.c: Ditto.
* gcc.target/riscv/rvv/base/cpymem-strategy-4.c: Ditto.
* gcc.target/riscv/rvv/base/cpymem-strategy-5.c: Ditto.
|
|
The exact_div requires the exactly multiple of the divider.
Unfortunately, the condition will be broken when zve32f in
some cases. For example,
potential_ew is 8
BYTES_PER_RISCV_VECTOR * lmul1 is [4, 4]
This patch would like to ensure the precondition of exact_div
when get_vec_mode.
PR target/112743
gcc/ChangeLog:
* config/riscv/riscv-string.cc (expand_block_move): Add
precondition check for exact_div.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr112743-1.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112537
-mmemcpy-strategy=[auto|libcall|scalar|vector]
auto: Current status, use scalar or vector instructions.
libcall: Always use a library call.
scalar: Only use scalar instructions.
vector: Only use vector instructions.
PR target/112537
gcc/ChangeLog:
* config/riscv/riscv-opts.h (enum riscv_stringop_strategy_enum): Strategy enum.
* config/riscv/riscv-string.cc (riscv_expand_block_move): Disabled based on options.
(expand_block_move): Ditto.
* config/riscv/riscv.opt: Add -mmemcpy-strategy=.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/cpymem-strategy-1.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy-2.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy-3.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy-4.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy-5.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy.h: New test.
|
|
This patch adds a vectorized rawmemchr expander. It also moves the
vectorized expand_block_move to riscv-string.cc.
gcc/ChangeLog:
* config/riscv/autovec.md (rawmemchr<ANYI:mode>): New expander.
* config/riscv/riscv-protos.h (gen_no_side_effects_vsetvl_rtx):
Define.
(expand_rawmemchr): Define.
* config/riscv/riscv-v.cc (force_vector_length_operand): Remove
static.
(expand_block_move): Move from here...
* config/riscv/riscv-string.cc (expand_block_move): ...to here.
(expand_rawmemchr): Add vectorized expander.
* internal-fn.cc (expand_RAWMEMCHR): Fix typo.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-prof/peel-2.c: Add
-fno-tree-loop-distribute-patterns.
* gcc.dg/tree-ssa/ldist-rawmemchr-1.c: Add riscv.
* gcc.dg/tree-ssa/ldist-rawmemchr-2.c: Ditto.
* gcc.target/riscv/rvv/rvv.exp: Add builtin directory.
* gcc.target/riscv/rvv/autovec/builtin/rawmemchr-1.c: New test.
|
|
This just moves a few functions out of riscv.cc into riscv-string.cc in an
attempt to keep riscv.cc manageable. This was originally Christoph's code and
I'm just pushing it on his behalf.
Full disclosure: I built rv64gc after changing to verify everything still
builds. Given it was just lifting code from one place to another, I didn't run
the testsuite.
gcc/
* config/riscv/riscv-protos.h (emit_block_move): Remove redundant
prototype. Improve comment.
* config/riscv/riscv.cc (riscv_block_move_straight): Move from riscv.cc
into riscv-string.cc.
(riscv_adjust_block_mem, riscv_block_move_loop): Likewise.
(riscv_expand_block_move): Likewise.
* config/riscv/riscv-string.cc (riscv_block_move_straight): Add moved
function.
(riscv_adjust_block_mem, riscv_block_move_loop): Likewise.
(riscv_expand_block_move): Likewise.
|
|
This patch implements expansions for the cmpstrsi and cmpstrnsi
builtins for RV32/RV64 for xlen-aligned strings if Zbb or XTheadBb
instructions are available. The expansion basically emits a comparison
sequence which compares XLEN bits per step if possible.
This allows to inline calls to strcmp() and strncmp() if both strings
are xlen-aligned. For strncmp() the length parameter needs to be known.
The benefits over calls to libc are:
* no call/ret instructions
* no stack frame allocation
* no register saving/restoring
* no alignment tests
The inlining mechanism is gated by a new switches ('-minline-strcmp' and
'-minline-strncmp') and by the variable 'optimize_size'.
The amount of emitted unrolled loop iterations can be controlled by the
parameter '--param=riscv-strcmp-inline-limit=N', which defaults to 64.
The comparision sequence is inspired by the strcmp example
in the appendix of the Bitmanip specification (incl. the fast
result calculation in case the first word does not contain
a NULL byte). Additional inspiration comes from rs6000-string.c.
The emitted sequence is not triggering any readahead pagefault issues,
because only aligned strings are accessed by aligned xlen-loads.
This patch has been tested using the glibc string tests on QEMU:
* rv64gc_zbb/rv64gc_xtheadbb with riscv-strcmp-inline-limit=64
* rv64gc_zbb/rv64gc_xtheadbb with riscv-strcmp-inline-limit=8
* rv32gc_zbb/rv32gc_xtheadbb with riscv-strcmp-inline-limit=64
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/ChangeLog:
* config/riscv/bitmanip.md (*<optab>_not<mode>): Export INSN name.
(<optab>_not<mode>3): Likewise.
* config/riscv/riscv-protos.h (riscv_expand_strcmp): New
prototype.
* config/riscv/riscv-string.cc (GEN_EMIT_HELPER3): New helper
macros.
(GEN_EMIT_HELPER2): Likewise.
(emit_strcmp_scalar_compare_byte): New function.
(emit_strcmp_scalar_compare_subword): Likewise.
(emit_strcmp_scalar_compare_word): Likewise.
(emit_strcmp_scalar_load_and_compare): Likewise.
(emit_strcmp_scalar_call_to_libc): Likewise.
(emit_strcmp_scalar_result_calculation_nonul): Likewise.
(emit_strcmp_scalar_result_calculation): Likewise.
(riscv_expand_strcmp_scalar): Likewise.
(riscv_expand_strcmp): Likewise.
* config/riscv/riscv.md (*slt<u>_<X:mode><GPR:mode>): Export
INSN name.
(@slt<u>_<X:mode><GPR:mode>3): Likewise.
(cmpstrnsi): Invoke expansion function for str(n)cmp.
(cmpstrsi): Likewise.
* config/riscv/riscv.opt: Add new parameter
'-mstring-compare-inline-limit'.
* doc/invoke.texi: Document new parameter
'-mstring-compare-inline-limit'.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xtheadbb-strcmp.c: New test.
* gcc.target/riscv/zbb-strcmp-disabled-2.c: New test.
* gcc.target/riscv/zbb-strcmp-disabled.c: New test.
* gcc.target/riscv/zbb-strcmp-unaligned.c: New test.
* gcc.target/riscv/zbb-strcmp.c: New test.
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
|
|
This patch implements the expansion of the strlen builtin for RV32/RV64
for xlen-aligned aligned strings if Zbb or XTheadBb instructions are available.
The inserted sequences are:
rv32gc_zbb (RV64 is similar):
add a3,a0,4
li a4,-1
.L1: lw a5,0(a0)
add a0,a0,4
orc.b a5,a5
beq a5,a4,.L1
not a5,a5
ctz a5,a5
srl a5,a5,0x3
add a0,a0,a5
sub a0,a0,a3
rv64gc_xtheadbb (RV32 is similar):
add a4,a0,8
.L2: ld a5,0(a0)
add a0,a0,8
th.tstnbz a5,a5
beqz a5,.L2
th.rev a5,a5
th.ff1 a5,a5
srl a5,a5,0x3
add a0,a0,a5
sub a0,a0,a4
This allows to inline calls to strlen(), with optimized code for
xlen-aligned strings, resulting in the following benefits over
a call to libc:
* no call/ret instructions
* no stack frame allocation
* no register saving/restoring
* no alignment test
The inlining mechanism is gated by a new switch ('-minline-strlen')
and by the variable 'optimize_size'.
Tested using the glibc string tests.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
gcc/ChangeLog:
* config.gcc: Add new object riscv-string.o.
riscv-string.cc.
* config/riscv/riscv-protos.h (riscv_expand_strlen):
New function.
* config/riscv/riscv.md (strlen<mode>): New expand INSN.
* config/riscv/riscv.opt: New flag 'minline-strlen'.
* config/riscv/t-riscv: Add new object riscv-string.o.
* config/riscv/thead.md (th_rev<mode>2): Export INSN name.
(th_rev<mode>2): Likewise.
(th_tstnbz<mode>2): New INSN.
* doc/invoke.texi: Document '-minline-strlen'.
* emit-rtl.cc (emit_likely_jump_insn): New helper function.
(emit_unlikely_jump_insn): Likewise.
* rtl.h (emit_likely_jump_insn): New prototype.
(emit_unlikely_jump_insn): Likewise.
* config/riscv/riscv-string.cc: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xtheadbb-strlen-unaligned.c: New test.
* gcc.target/riscv/xtheadbb-strlen.c: New test.
* gcc.target/riscv/zbb-strlen-disabled-2.c: New test.
* gcc.target/riscv/zbb-strlen-disabled.c: New test.
* gcc.target/riscv/zbb-strlen-unaligned.c: New test.
* gcc.target/riscv/zbb-strlen.c: New test.
|