author | Yangyu Chen <chenyangyu@isrc.iscas.ac.cn> | 2023-10-27 08:39:26 -0600 |
---|---|---|
committer | Jeff Law <jlaw@ventanamicro.com> | 2023-10-27 08:39:26 -0600 |
commit | 7bcdb777e6a0d1a0159f25616c5d8e35e7cb5fb6 (patch) | |
tree | 607713ec464a9f316854412363bff1346cab4cea /gcc/config/riscv/riscv-vector-builtins-shapes.h | |
parent | 9c032218107675291d05be28f8c08a32e3a17b95 (diff) | |
[PATCH] RISC-V: Fix wrong tune parameters on int_div
This patch fixes the "int_div" cost in several RISC-V tune parameter
sets, including those for Rocket, the SiFive U7 series, and the T-Head
C906. The incorrect cost misleads the optimizer. For example, it prevents
division by a constant from being rewritten as a cheaper multiply-based
sequence known as Barrett reduction, which hurts performance on these
cores.
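As an illustration of the rewrite described above, here is a minimal sketch
of dividing by the constant 6 with a multiply and a shift. It assumes
unsigned 32-bit operands for simplicity; the function name div6_by_mul is
hypothetical, and the sequence GCC actually emits may differ.

```c
#include <stdint.h>

/* Divide an unsigned 32-bit value by 6 without a divide instruction.
   The magic constant is ceil(2^34 / 6) = 0xAAAAAAAB; multiplying in
   64 bits and shifting right by 34 gives floor(n / 6) for every
   32-bit n.  Illustrative sketch only, not GCC's generated code.  */
uint32_t div6_by_mul(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xAAAAAAABu) >> 34);
}
```

The multiply costs only a few cycles on these cores, so the compiler
prefers this form once int_div is costed realistically.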
The integer division cost for Rocket and the SiFive U7 is taken from the
Rocket-Chip Divider source code[1] with the BigCore configuration[2], which
leaves divUnroll at its default value of 1. Thus, the maximum int_div
latency should be dataWidth + 1 cycles, which is 33 for 32-bit and 65 for
64-bit.
As for the C906, the divider takes 2 cycles to start[3] and produces a
2-bit result each cycle[4]. Thus, the maximum int_div latency should be
dataWidth / 2 + 2 cycles, which is 18 for 32-bit and 34 for 64-bit.
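The two latency formulas can be checked with a short sketch. The function
names are hypothetical and exist only to restate the arithmetic from this
message; they are not part of riscv.cc.

```c
/* Worst-case integer divide latency in cycles, as derived in this
   commit message.  Hypothetical helpers for illustration only.  */

/* Rocket / SiFive U7 with divUnroll left at 1: dataWidth + 1.  */
unsigned rocket_div_cycles(unsigned data_width)
{
    return data_width + 1;
}

/* T-Head C906: 2 start-up cycles, then 2 result bits per cycle.  */
unsigned c906_div_cycles(unsigned data_width)
{
    return data_width / 2 + 2;
}
```

For data_width values of 32 and 64 these give 33 and 65 for Rocket, and 18
and 34 for the C906.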
I also tested the performance on a VisionFive 2, which has a quad-core
SiFive U74. I wrote a simple C program that performs 1e8 divisions by the
constant 6 in int32. The result shows it takes 1.998s using div and 0.420s
using Barrett reduction to replace div with mul, which is 4.75x faster.
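The test program itself is not included in this message. A comparable
micro-benchmark could look like the sketch below. All names are
hypothetical, and reading the divisor through a volatile is an assumption
made here so the compiler cannot strength-reduce the division by itself.

```c
#include <stdint.h>
#include <time.h>

/* Read at run time so the compiler has to emit a real divide.  */
volatile int32_t hidden_divisor = 6;

/* Sum of i / 6 for i in [0, iterations) using the divide instruction.  */
int64_t sum_with_div(int32_t iterations)
{
    int32_t d = hidden_divisor;
    int64_t sum = 0;
    for (int32_t i = 0; i < iterations; i++)
        sum += i / d;
    return sum;
}

/* Same sum using multiply and shift; valid because i is never
   negative, so the unsigned reciprocal 0xAAAAAAAB >> 34 applies.  */
int64_t sum_with_mul(int32_t iterations)
{
    int64_t sum = 0;
    for (int32_t i = 0; i < iterations; i++)
        sum += (int32_t)(((uint64_t)(uint32_t)i * 0xAAAAAAABu) >> 34);
    return sum;
}

/* CPU time in seconds taken by fn(iterations); result goes to *out.  */
double seconds_for(int64_t (*fn)(int32_t), int32_t iterations, int64_t *out)
{
    clock_t start = clock();
    *out = fn(iterations);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling seconds_for with each function and 100000000 iterations would give
two timings to compare; the figures quoted above come from the author's own
program, not from this sketch.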
[1] https://github.com/chipsalliance/rocket-chip/blob/v1.6/src/main/scala/rocket/Multiplier.scala#L40
[2] https://github.com/chipsalliance/rocket-chip/blob/v1.6/src/main/scala/subsystem/Configs.scala#L97
[3] https://github.com/T-head-Semi/openc906/blob/af5614d72de7e5a4b8609c427d2e20af1deb21c4/C906_RTL_FACTORY/gen_rtl/iu/rtl/aq_iu_div.v#L267
[4] https://github.com/T-head-Semi/openc906/blob/af5614d72de7e5a4b8609c427d2e20af1deb21c4/C906_RTL_FACTORY/gen_rtl/iu/rtl/aq_iu_div_shift2_kernel.v#L93
gcc/ChangeLog:
* config/riscv/riscv.cc (rocket_tune_info): Fix int_div cost.
(sifive_7_tune_info, thead_c906_tune_info): Likewise.