path: root/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
author: Christudasan Devadasan <Christudasan.Devadasan@amd.com> 2021-12-24 15:05:41 -0500
committer: Christudasan Devadasan <Christudasan.Devadasan@amd.com> 2022-01-06 00:27:11 -0500
commit: 50b5b367c1ae72be5265f81b4dba03b3deb0c4e4
tree: 1e6222d8773e9bf0e33399ad7a2445090eb902ae /llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
parent: 31b79b86ee3defa07f1aa4fa5a10d2389ec527dd
[AMDGPU] Iterate LoweredEndCf in the reverse order
optimizeEndCf() combines blocks so that the exec mask restore operations are inserted optimally. It currently visits the lowered END_CF pseudos in the forward direction, because it iterates the LoweredEndCf set vector in the order in which the entries were inserted.

BranchFolding does not run at -O0, so basic blocks can be irregularly placed. With such a layout, the forward traversal can incorrectly place two unconditional branches in the same block while combining blocks, especially when an intervening block gets optimized away in a later iteration.

Avoid this by iterating the set vector in reverse, so that the blocks at the bottom of a function are optimized before those at the top.

Fixes: SWDEV-315215

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D116273
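The change in traversal order can be illustrated outside LLVM. The following is only a minimal sketch, not code from this patch: `OrderedSet` is a hypothetical stand-in for an insertion-ordered set (it is not `llvm::SetVector`), `visitOrder` is a made-up helper, and the block names are invented. It shows that a forward walk visits entries in insertion order, while a reverse walk visits the most recently inserted entries first.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an insertion-ordered set. It keeps each
// value once and remembers the order in which values were first added.
class OrderedSet {
public:
  // Returns true if the value was newly inserted, false if already present.
  bool insert(const std::string &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    return true;
  }

  // Entries in insertion order.
  const std::vector<std::string> &items() const { return Order; }

private:
  std::unordered_set<std::string> Seen;
  std::vector<std::string> Order;
};

// Hypothetical helper: returns the order in which entries would be
// visited, either front-to-back or back-to-front.
std::vector<std::string> visitOrder(const OrderedSet &S, bool Reverse) {
  const std::vector<std::string> &Items = S.items();
  std::vector<std::string> Out =
      Reverse ? std::vector<std::string>(Items.rbegin(), Items.rend())
              : std::vector<std::string>(Items.begin(), Items.end());
  // Both directions visit every entry exactly once.
  assert(Out.size() == Items.size());
  return Out;
}
```

If entries are inserted top to bottom, as the commit message implies for this pass, the reverse walk reaches the entries for the bottom of the function first.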
Diffstat (limited to 'llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp')
-rw-r--r-- llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp | 2
1 file changed, 1 insertion, 1 deletion
diff --git a/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp b/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
index 3168bcd..6ec37b3 100644
--- a/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
+++ b/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
@@ -582,7 +582,7 @@ void SILowerControlFlow::optimizeEndCf() {
   if (!RemoveRedundantEndcf)
     return;

-  for (MachineInstr *MI : LoweredEndCf) {
+  for (MachineInstr *MI : reverse(LoweredEndCf)) {
     MachineBasicBlock &MBB = *MI->getParent();
     auto Next =
         skipIgnoreExecInstsTrivialSucc(MBB, std::next(MI->getIterator()));