author     Juzhe-Zhong <juzhe.zhong@rivai.ai>    2023-08-21 09:04:53 +0800
committer  Lehua Ding <lehua.ding@rivai.ai>      2023-08-21 17:18:14 +0800
commit     d5dfba19aee783a6ba90fdba1993d576c7ec310b
tree       bb57906f94af9990954c78c500a76c02855a7b04 /gcc
parent     966b0a96523fb7adbf498ac71df5e033c70dc546
LCM: Export 2 helpful functions as global for VSETVL PASS use in RISC-V backend
This patch exports 'compute_antinout_edge' and 'compute_earliest' to global
scope so that they can be used by the VSETVL pass of the RISC-V backend.
Demand fusion merges the VSETVL information of multiple RVV instructions so
that a single VSETVL can be emitted which dominates and pre-configures most
of them, eliding redundant VSETVLs.
For example:

  for
    for
      for
        if (cond)
          VSETVL demand 1: SEW/LMUL = 16 and TU policy
        else
          VSETVL demand 2: SEW = 32
The VSETVL pass should be able to fuse demand 1 and demand 2 into a new
demand (SEW = 32, LMUL = M2, TU policy) and then emit a single VSETVL
outside the outermost loop, giving the best codegen and run-time execution.
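A minimal sketch of the ratio arithmetic behind this fusion (hypothetical
types and names for illustration only, not the actual pass code):

  /* Hypothetical model of a VSETVL demand; a field of 0 means
     "unconstrained".  Not GCC code.  */
  struct demand
  {
    int sew;    /* Element width in bits.  */
    int ratio;  /* Required SEW/LMUL ratio.  */
    int tu;     /* Nonzero if tail-undisturbed policy is required.  */
  };

  /* Fuse two compatible demands by keeping the constrained field of
     each.  */
  static struct demand
  fuse_demands (struct demand a, struct demand b)
  {
    struct demand d;
    d.sew = a.sew ? a.sew : b.sew;
    d.ratio = a.ratio ? a.ratio : b.ratio;
    d.tu = a.tu || b.tu;
    return d;
  }

  /* Demand 1 (ratio = 16, TU) fused with demand 2 (SEW = 32) yields
     SEW = 32 and ratio = 16, hence LMUL = SEW / ratio = 32 / 16 = 2,
     i.e. M2 with TU policy, as in the example above.  */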
Currently, Phase 3 (demand fusion) of the VSETVL pass is messy, unreliable,
and unmaintainable. Rereading the dragon book and Morgan's book, I found
that the LCM "earliest" computation lets us do demand fusion in a reliable
and optimal way.
So this patch exports these two functions, which are very helpful for the
VSETVL pass.
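For reference, the "earliest" dataflow equation from that literature
(paraphrasing the textbook LCM formulation, not quoted from lcm.cc) is, for
an edge from block i to block j:

  \mathrm{EARLIEST}(i,j) = \mathrm{ANTIN}(j) \cap \overline{\mathrm{AVOUT}(i)}
    \cap \left( \mathrm{KILL}(i) \cup \overline{\mathrm{ANTOUT}(i)} \right)

with the special case \mathrm{EARLIEST}(\mathit{entry},j) = \mathrm{ANTIN}(j)
on edges leaving the entry block. Informally: an expression is "earliest" on
an edge when it is anticipated entering j, not already available leaving i,
and cannot be hoisted further up through i.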
gcc/ChangeLog:
* lcm.cc (compute_antinout_edge): Export to global scope.
(compute_earliest): Ditto.
* lcm.h (compute_antinout_edge): Ditto.
(compute_earliest): Ditto.
Diffstat (limited to 'gcc'):
 gcc/lcm.cc | 7
 gcc/lcm.h  | 3
 2 files changed, 5 insertions(+), 5 deletions(-)
--- a/gcc/lcm.cc
+++ b/gcc/lcm.cc
@@ -56,9 +56,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "lcm.h"
 
 /* Edge based LCM routines.  */
-static void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
-static void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
-			      sbitmap *, sbitmap *, sbitmap *);
 static void compute_laterin (struct edge_list *, sbitmap *, sbitmap *,
 			     sbitmap *, sbitmap *);
 static void compute_insert_delete (struct edge_list *edge_list, sbitmap *,
@@ -79,7 +76,7 @@ static void compute_rev_insert_delete (struct edge_list *edge_list, sbitmap *,
    This is done based on the flow graph, and not on the pred-succ lists.
    Other than that, its pretty much identical to compute_antinout.  */
 
-static void
+void
 compute_antinout_edge (sbitmap *antloc, sbitmap *transp, sbitmap *antin,
 		       sbitmap *antout)
 {
@@ -170,7 +167,7 @@ compute_antinout_edge (sbitmap *antloc, sbitmap *transp, sbitmap *antin,
 
 /* Compute the earliest vector for edge based lcm.  */
 
-static void
+void
 compute_earliest (struct edge_list *edge_list, int n_exprs, sbitmap *antin,
 		  sbitmap *antout, sbitmap *avout, sbitmap *kill,
 		  sbitmap *earliest)
--- a/gcc/lcm.h
+++ b/gcc/lcm.h
@@ -31,4 +31,7 @@ extern struct edge_list *pre_edge_rev_lcm (int, sbitmap *, sbitmap *,
 					   sbitmap *, sbitmap *,
 					   sbitmap **, sbitmap **);
 
+extern void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
+extern void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
+			      sbitmap *, sbitmap *, sbitmap *);
 #endif /* GCC_LCM_H */
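With these declarations in place, a hedged sketch of how a backend pass
might drive the two exported helpers (the signatures come from the diff
above; n_exprs and the antloc/transp/avloc/kill local properties are
assumed inputs derived from per-block VSETVL demands, not part of this
patch; sbitmap_vector_alloc, create_edge_list, NUM_EDGES and
compute_available are existing GCC internals):

  /* Hypothetical caller in a RISC-V VSETVL phase.  */
  int n_blocks = last_basic_block_for_fn (cfun);
  sbitmap *antin = sbitmap_vector_alloc (n_blocks, n_exprs);
  sbitmap *antout = sbitmap_vector_alloc (n_blocks, n_exprs);
  sbitmap *avin = sbitmap_vector_alloc (n_blocks, n_exprs);
  sbitmap *avout = sbitmap_vector_alloc (n_blocks, n_exprs);

  /* Where is each demand wanted on every path to exit (anticipatable),
     and where is it already satisfied (available)?  compute_available
     is already exported by lcm.h.  */
  compute_antinout_edge (antloc, transp, antin, antout);
  compute_available (avloc, kill, avout, avin);

  /* Per-edge "earliest" placement points for the fused demands.  */
  struct edge_list *edge_list = create_edge_list ();
  sbitmap *earliest = sbitmap_vector_alloc (NUM_EDGES (edge_list), n_exprs);
  compute_earliest (edge_list, n_exprs, antin, antout, avout, kill,
		    earliest);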