author | Michael Collison <collison@rivosinc.com> | 2023-05-06 12:37:50 -0600 |
---|---|---|
committer | Jeff Law <jlaw@ventanamicro> | 2023-05-06 12:37:50 -0600 |
commit | 730909fa858bd691095bc23655077aa13b7941a9 (patch) | |
tree | 092dd666dadaf333295213729c2e003821b616bd | |
parent | 9217e0dde1b7dbcff456d513334496404e626437 (diff) | |
RISC-V: autovec: Verify that GET_MODE_NUNITS is a multiple of 2.
While working on autovectorizing for the RISC-V port I encountered an issue
where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is
evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode)
where GET_MODE_NUNITS is equal to one.
Tested on RISC-V and x86_64-linux-gnu. Okay?
gcc/
* tree-vect-slp.cc (can_duplicate_and_interleave_p):
Check that GET_MODE_NUNITS is a multiple of 2.
-rw-r--r-- | gcc/tree-vect-slp.cc | 7 |
1 files changed, 5 insertions, 2 deletions
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index b299e20..3b7a217 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -423,10 +423,13 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
             (GET_MODE_BITSIZE (int_mode), 1);
           tree vector_type
             = get_vectype_for_scalar_type (vinfo, int_type, count);
+          poly_int64 half_nelts;
           if (vector_type
               && VECTOR_MODE_P (TYPE_MODE (vector_type))
               && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)),
-                           GET_MODE_SIZE (base_vector_mode)))
+                           GET_MODE_SIZE (base_vector_mode))
+              && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)),
+                             2, &half_nelts))
             {
               /* Try fusing consecutive sequences of COUNT / NVECTORS elements
                  together into elements of type INT_TYPE and using the result
@@ -434,7 +437,7 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
               poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type));
               vec_perm_builder sel1 (nelts, 2, 3);
               vec_perm_builder sel2 (nelts, 2, 3);
-              poly_int64 half_nelts = exact_div (nelts, 2);
+
               for (unsigned int i = 0; i < 3; ++i)
                 {
                   sel1.quick_push (i);
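
The patch replaces an unconditional exact_div (nelts, 2) with a multiple_p guard that both tests divisibility and yields the quotient. The standalone sketch below illustrates that idea only; it is not GCC code. PolyCount, exact_quotient and can_split_in_half are hypothetical names, and the two-coefficient model (value = c[0] + c[1] * x for a runtime x) is a simplifying assumption about how a poly_int behaves.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a two-coefficient poly_int:
// value = c[0] + c[1] * x, where x is a non-negative integer
// known only at run time.
struct PolyCount {
  std::array<std::int64_t, 2> c;
};

// Simplified analogue of multiple_p (value, factor, &quotient).
// Succeeds only if every coefficient is divisible by FACTOR, so the
// division is exact for every possible x.  FACTOR must be nonzero.
std::optional<PolyCount> exact_quotient(const PolyCount &value,
                                        std::int64_t factor) {
  PolyCount q{};
  for (std::size_t i = 0; i < value.c.size(); ++i) {
    if (value.c[i] % factor != 0)
      return std::nullopt;  // not a multiple: the caller must bail out
    q.c[i] = value.c[i] / factor;
  }
  return q;
}

// Mirrors the shape of the new guard: only proceed with the
// interleave when the element count can be halved exactly.
bool can_split_in_half(const PolyCount &nunits) {
  return exact_quotient(nunits, 2).has_value();
}
```

Under this model an element count of exactly one, written PolyCount{{1, 0}}, is rejected by can_split_in_half, so the caller never reaches the halving step, whereas a count such as PolyCount{{4, 4}} is accepted and its half is returned by exact_quotient.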