riscv-gnu-toolchain/gcc.git


author	Ju-Zhe Zhong <juzhe.zhong@rivai.ai>	2023-07-12 21:17:39 +0800
committer	Pan Li <pan2.li@intel.com>	2023-07-12 22:26:33 +0800
commit	0d4dd7e07a879d6c07a33edb2799710faa95651e (patch)
tree	caabf9a182c552c07f985008527476861d512152 /gcc/tree-ssa-loop-ch.cc
parent	13c3e29d47e359b2f05ea98d61710fc162ba6d31 (diff)

VECT: Apply COND_LEN_* into vectorizable_operation

Hi, Richard and Richi.

As we discussed before, COND_LEN_* patterns were added for multiple
situations.  This patch applies COND_LEN_* to the following situation
in "vectorizable_operation":

  /* If operating on inactive elements could generate spurious traps,
     we need to restrict the operation to active lanes.  Note that this
     specifically doesn't apply to unhoisted invariants, since they
     operate on the same value for every lane.

     Similarly, if this operation is part of a reduction, a fully-masked
     loop should only change the active lanes of the reduction chain,
     keeping the inactive lanes as-is.  */
  bool mask_out_inactive = ((!is_invariant && gimple_could_trap_p (stmt))
			    || reduc_idx >= 0);

That is, the patch handles the case where mask_out_inactive is true and
the loop uses length-based loop control.  This covers the following 2
cases:

1. Integer division:

   #include <stdint.h>

   #define TEST_TYPE(TYPE)				\
   __attribute__((noipa))				\
   void vrem_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
   {							\
     for (int i = 0; i < n; i++)				\
       dst[i] = a[i] % b[i];				\
   }
   #define TEST_ALL()	\
   TEST_TYPE(int8_t)
   TEST_ALL()

   With this patch:

   _61 = .SELECT_VL (ivtmp_59, POLY_INT_CST [4, 4]);
   ivtmp_45 = _61 * 4;
   vect__4.8_48 = .LEN_MASK_LOAD (vectp_a.6_46, 32B, _61, 0, { -1, ... });
   vect__6.11_52 = .LEN_MASK_LOAD (vectp_b.9_50, 32B, _61, 0, { -1, ... });
   vect__8.12_53 = .COND_LEN_ADD ({ -1, ... }, vect__4.8_48, vect__6.11_52, vect__4.8_48, _61, 0);
   .LEN_MASK_STORE (vectp_dst.13_55, 32B, _61, 0, { -1, ... }, vect__8.12_53);

2. Floating-point arithmetic **WITHOUT** -ffast-math:

   #define TEST_TYPE(TYPE)				\
   __attribute__((noipa))				\
   void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)	\
   {							\
     for (int i = 0; i < n; i++)				\
       dst[i] = a[i] + b[i];				\
   }
   #define TEST_ALL()	\
   TEST_TYPE(float)
   TEST_ALL()

   With this patch:

   _61 = .SELECT_VL (ivtmp_59, POLY_INT_CST [4, 4]);
   ivtmp_45 = _61 * 4;
   vect__4.8_48 = .LEN_MASK_LOAD (vectp_a.6_46, 32B, _61, 0, { -1, ... });
   vect__6.11_52 = .LEN_MASK_LOAD (vectp_b.9_50, 32B, _61, 0, { -1, ... });
   vect__8.12_53 = .COND_LEN_ADD ({ -1, ... }, vect__4.8_48, vect__6.11_52, vect__4.8_48, _61, 0);
   .LEN_MASK_STORE (vectp_dst.13_55, 32B, _61, 0, { -1, ... }, vect__8.12_53);

With this patch, we can make sure that operations won't trap on the
inactive elements when mask_out_inactive is true.

gcc/ChangeLog:

	* internal-fn.cc (FOR_EACH_CODE_MAPPING): Adapt for COND_LEN_* support.
	(CASE): Ditto.
	(get_conditional_len_internal_fn): New function.
	* internal-fn.h (get_conditional_len_internal_fn): Ditto.
	* tree-vect-stmts.cc (vectorizable_operation): Adapt for COND_LEN_* support.
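
To illustrate why restricting a trapping operation to the active lanes
matters, here is a minimal scalar sketch in C.  It is not GCC code: the
function model_cond_len_mod and all of its parameter names are
hypothetical, and the lane rule it uses (a lane is active when
i < len + bias and its mask bit is set; otherwise the "else" value is
copied) is an assumption based on how GCC documents the cond_len_*
optabs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a length- and mask-controlled remainder
   over NUNITS lanes.  This is an illustration only, not GCC source.

   Lane I is active when I < LEN + BIAS and MASK[I] is nonzero.  Active
   lanes compute A[I] % B[I].  Inactive lanes copy ELSE_VALS[I] and never
   evaluate the remainder, so a zero divisor there cannot trap.  */
void
model_cond_len_mod (int8_t *res, const uint8_t *mask, const int8_t *a,
		    const int8_t *b, const int8_t *else_vals,
		    size_t nunits, long len, long bias)
{
  for (size_t i = 0; i < nunits; i++)
    {
      if ((long) i < len + bias && mask[i])
	res[i] = (int8_t) (a[i] % b[i]);
      else
	res[i] = else_vals[i];
    }
}
```

For example, with 8 lanes, an all-true mask, len = 4 and bias = 0, only
lanes 0 to 3 are divided.  Lanes 4 to 7 may hold zero divisors (as the
tail of a partially filled vector might) and simply receive the "else"
values, which mirrors how the dumps above pass the first operand as the
fallback.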

Diffstat (limited to 'gcc/tree-ssa-loop-ch.cc')

0 files changed, 0 insertions, 0 deletions

