author Sergei Barannikov <barannikov88@gmail.com> 2024-11-19 17:46:48 +0300
committer GitHub <noreply@github.com> 2024-11-19 17:46:48 +0300
commit aff98e4be05a1060e489ce62a88ee0ff365e571a
tree 165b2a8f869fe859b2eca4e27cdad39416e3d5d4 /clang/lib/CodeGen/CodeGenFunction.cpp
parent 27dcae53eb9ea7b4d722d650e63567ca54e12d7d
[ARM] Stop gluing 1-bit shifts (#116547)

1. When two (or more) nodes are glued, the DAG scheduler will always schedule
   them as one piece, i.e. it will not allow any instructions to be scheduled
   between them. It does so because glued nodes usually have an implicit
   register dependency between them, and an intervening node could clobber
   this physical register. When emitted into machine IR, such nodes also stay
   stuck together, e.g.:

   ```
   %9:gpr = MOVsrl_glue killed %8, implicit-def $cpsr
   %10:gpr = RRX %3, implicit $cpsr
   ```

2. If a node has a Glue result, SelectionDAG will not try to CSE this node. If
   it did, it would break the implicit physical register dependency. In
   practice this means that if a node with a Glue result has multiple uses, it
   has to be duplicated before each use. This is the reason for
   `ARMTargetLowering::duplicateCmp` to exist.

When using a normal data dependency, dependent nodes can be scheduled freely.
If there is a physical register dependency between nodes, the physical
register will be copied to/from a virtual register, allowing other nodes to
intervene between them. The resulting machine IR might look like this:

```
%9:gpr = LSRs1 killed %8, implicit-def $cpsr
%10:gpr = COPY $cpsr
%11:gpr = ORRrsi killed %9, %3, 242, 14 /* CC::al */, $noreg, $noreg
%12:gpr = BICri killed %11, -2147483648, 14 /* CC::al */, $noreg, $noreg
$cpsr = COPY %10
%13:gpr = RRX %3, implicit $cpsr
```

The two copies are likely to be eliminated by the register coalescer, given
that there are no instructions between them that clobber this physical
register.

If the copies are unwanted in the first place (they could be expensive or
impossible), DAG scheduler will try to avoid inserting them wherever possible,
and the resulting machine IR will look like this:

```
%9:gpr = LSRs1 killed %8, implicit-def $cpsr
%10:gpr = ORRrsi killed %9, %3, 242, 14 /* CC::al */, $noreg, $noreg
%11:gpr = BICri killed %10, -2147483648, 14 /* CC::al */, $noreg, $noreg
%12:gpr = RRX %3, implicit $cpsr
```

On ARM, arithmetic operations and LSLS already use the new data flow approach.
This patch extends it to include 1-bit shifts.

Pull Request: https://github.com/llvm/llvm-project/pull/116547
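
For readers unfamiliar with the LSRS/RRX pairing in the machine IR above, the following is a rough C++ model of the value that flows through `$cpsr`. It assumes the pair comes from a 64-bit logical shift right by one split into 32-bit halves, which is one common source of this pattern; the names `ShiftResult`, `lsrs1`, `rrx` and `lshr64_by_1` are made up for illustration and are not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical model of a flag-setting 1-bit shift: the shifted value plus
// the bit that was shifted out (what LSRS #1 leaves in the carry flag).
struct ShiftResult {
  uint32_t value;
  bool carry;
};

// Models `LSRS Rd, Rm, #1`: logical shift right by one, carry = old bit 0.
inline ShiftResult lsrs1(uint32_t x) {
  return ShiftResult{x >> 1, (x & 1u) != 0};
}

// Models `RRX Rd, Rm`: shift right by one, with the incoming carry
// rotated into bit 31.
inline uint32_t rrx(uint32_t x, bool carry) {
  return (x >> 1) | (static_cast<uint32_t>(carry) << 31);
}

// 64-bit logical shift right by one, built from the two 32-bit steps.
// The carry is passed as an ordinary value, which is the data-dependency
// view described above: nothing forces the two steps to be adjacent.
inline uint64_t lshr64_by_1(uint64_t x) {
  const ShiftResult hi = lsrs1(static_cast<uint32_t>(x >> 32));
  const uint32_t lo = rrx(static_cast<uint32_t>(x), hi.carry);
  return (static_cast<uint64_t>(hi.value) << 32) | lo;
}
```

In this model the carry is just another operand of `rrx`, so unrelated work can sit between the two calls as long as nothing overwrites it. That is the property the data flow approach gives the scheduler, whereas Glue would pin the two steps together.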