author | Pan Li <pan2.li@intel.com> | 2023-08-24 12:29:36 +0800 |
---|---|---|
committer | Pan Li <pan2.li@intel.com> | 2023-08-31 21:25:21 +0800 |
commit | 3e37e8231849ded7e214042f60f59fdcec75d7d3 (patch) | |
tree | 8f4beb59c5b54d7753a247429ca843be22eb1269 /gcc/objc | |
parent | e3ece7684b02c47d2b259899cf8009d6bdcccaf3 (diff) | |
RISC-V: Support rounding mode for VFMADD/VFMACC autovec
Intrinsics and autovec can be combined, as in the case below:
vfadd RTZ <- intrinsic static rounding
vfmadd <- autovec/autovec-opt
The autovec-generated vfmadd should use the DYN rounding mode, so the
frm must be restored before the vfmadd insn. This patch
fixes the issue by:
* Adding the frm operand to the vfmadd/vfmacc autovec/autovec-opt patterns.
* Setting the frm_mode attr to DYN.
Thus, the frm flow when combining autovec and intrinsics should be:
+------------
| frrm a5
| ...
| fsrmi 4
| vfadd <- intrinsic static rounding.
| ...
| fsrm a5
| vfmadd <- autovec/autovec-opt
| ...
+------------
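The flow above can be sketched as a toy model in plain C. This is only an
illustration, not GCC's mode-switching code: the names FRM_DYN, struct insn,
put and emit_with_frm are hypothetical, and the model assumes that a5 is the
scratch register holding the entry value of frm, as in the diagram. Each
instruction carries either a static rounding-mode immediate or the DYN tag,
and the frm bookkeeping is emitted only when the required mode changes.

```c
#include <stddef.h>
#include <string.h>

#define FRM_DYN (-1)  /* hypothetical tag: use the frm value held on entry */

struct insn {
    const char *name; /* mnemonic, used for the trace only */
    int rm_imm;       /* static rounding-mode immediate (0..9 here), or FRM_DYN */
};

/* Append one line to the trace; silently drop it if it does not fit. */
static void put(char *out, size_t cap, size_t *len, const char *s)
{
    size_t need = strlen(s) + 1;          /* text plus '\n' */
    if (*len + need < cap) {              /* keep room for the final '\0' */
        memcpy(out + *len, s, need - 1);
        out[*len + need - 1] = '\n';
        *len += need;
        out[*len] = '\0';
    }
}

/* Emit the instructions in order, inserting frrm/fsrmi/fsrm where the
   required rounding mode differs from what the model believes frm holds. */
static void emit_with_frm(const struct insn *insns, size_t n,
                          char *out, size_t cap)
{
    size_t len = 0;
    int saved = 0;          /* entry value of frm already copied to a5? */
    int current = FRM_DYN;  /* what frm holds right now in this model */
    char buf[] = "fsrmi 0"; /* the last character is patched below */

    if (cap > 0)
        out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int want = insns[i].rm_imm;
        if (want != current) {
            if (want == FRM_DYN) {
                put(out, cap, &len, "fsrm a5");   /* restore entry value */
            } else {
                if (!saved) {
                    put(out, cap, &len, "frrm a5"); /* save entry value once */
                    saved = 1;
                }
                buf[6] = (char)('0' + want);      /* single-digit immediate */
                put(out, cap, &len, buf);
            }
            current = want;
        }
        put(out, cap, &len, insns[i].name);
    }
}
```

Feeding the model a vfadd with static immediate 4 (mirroring the diagram)
followed by a DYN vfmadd yields frrm, fsrmi, vfadd, fsrm, vfmadd in that
order. A sequence containing only DYN instructions yields no frm
instructions at all.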
However, we use an unspec instead of a use to consume the FRM register
because of restrictions in the combine pass. Some code
paths of try_combine may require XVECLEN(pat, 0) == 2 for
recog_for_combine, and adding a new use makes XVECLEN(pat, 0) == 3,
which causes the vfwmacc optimization to fail, for example in the
tests widen-complicate-5.c and widen-8.c.
Finally, there are other fma cases; they will be covered in
follow-up patches.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
gcc/ChangeLog:
* config/riscv/autovec-opt.md: Add FRM_REGNUM to vfmadd/vfmacc.
* config/riscv/autovec.md: Ditto.
* config/riscv/vector-iterators.md: Add UNSPEC_VFFMA.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/float-point-frm-autovec-1.c: New test.