author Matt Arsenault <Matthew.Arsenault@amd.com> 2020-04-24 10:06:00 -0400
committer Matt Arsenault <Matthew.Arsenault@amd.com> 2020-04-24 15:53:30 -0400
commit 35e6a9c8397e9550382961f2e020987982e9ccd7 (patch)
tree c26d64777f2865b76234647436306f8067f0186e /llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
parent 0d671dbca949e70536da371a43dc55248301705f (diff)
AMDGPU: Break read2/write2 search range on a memory fence

This is to fix performance regressions introduced by 86c944d790728891801778b8d98c2c65a83f36a5.

The old search would collect all potentially mergeable instructions in the entire block. In this case, the same address is written in multiple places in the block on the other side of a fence. When sorted by offset, the two unmergeable, identical addresses would be next to each other and the merge would give up.

Break the search space when we encounter an instruction we won't be able to merge across. This will keep the identical addresses in different merge attempts.

This may also improve compile time by reducing the merge list size.
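
The idea of breaking the search range can be illustrated with a minimal, self-contained sketch. It is not the code from this patch: `Inst`, `isBarrier`, and `collectMergeRanges` are hypothetical names invented for illustration, and the model ignores everything the real pass has to check (instruction kind, address space, aliasing, and so on). It only shows how ending the candidate list at a fence keeps identical offsets on opposite sides of the fence in separate merge attempts.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a machine instruction; not an LLVM type.
struct Inst {
  bool isBarrier; // true for an instruction we cannot merge across (e.g. a fence)
  int offset;     // offset from a shared base address; ignored for barriers
};

// Walk the block once and split it into independent candidate lists.
// A barrier closes the current list, so memory accesses on opposite
// sides of it never end up in the same merge attempt.
std::vector<std::vector<Inst>>
collectMergeRanges(const std::vector<Inst> &block) {
  std::vector<std::vector<Inst>> ranges;
  std::vector<Inst> current;

  for (const Inst &inst : block) {
    if (inst.isBarrier) {
      // A list with fewer than two candidates cannot produce a merge,
      // so only lists with at least two entries are kept.
      if (current.size() >= 2)
        ranges.push_back(current);
      current.clear();
      continue;
    }
    current.push_back(inst);
  }

  // Flush the candidates that follow the last barrier.
  if (current.size() >= 2)
    ranges.push_back(current);

  return ranges;
}
```

For a block that writes offsets 0 and 4, hits a fence, and then writes offsets 0 and 4 again, this sketch returns two lists of two candidates each. Sorting each list by offset no longer places the two writes to offset 0 next to each other, which is the situation the commit message describes as causing the merge to give up. Each list is also shorter than a single block-wide list would be.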
Diffstat (limited to 'llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp')
0 files changed, 0 insertions, 0 deletions