path: root/flang/lib/Frontend/CompilerInvocation.cpp
author Madhur Amilkanthwar <madhura@nvidia.com> 2025-01-21 10:49:19 +0530
committer GitHub <noreply@github.com> 2025-01-21 10:49:19 +0530
commit 5d281a480e5caae09962b863960d7d057e908a3c
tree 218e4e0b472d139c88c66dd181b691f0e5b68f74 /flang/lib/Frontend/CompilerInvocation.cpp
parent afced70e697e66fb6920b53d489d3fa4498e22dc
[LoopInterchange] Constrain number of load/stores in a loop (#118973)
In its current state, the transform computes entries for the dependency matrix until it reaches `MaxMemInstrCount`, which is 100. After the 99th entry it terminates, so the work done up to that point is wasted compile time. It would be nice to compute the total number of entries upfront and exit early if that number exceeds 100. However, computing the number of entries is not always possible, because it depends on two factors:

1. The number of load-store pairs in a loop.
2. The number of common loop levels for each pair.

This patch instead constrains the whole computation on the number of load and store instructions in the loop. In another approach, I experimented with computing factor 1 and constraining the number of pairs, but that gave no additional compile-time benefit. I can revisit that approach once other issues are fixed.
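The early exit described above can be sketched roughly as follows. This is a standalone illustration, not the code from the patch: `Opcode`, `Instr`, `collectMemInstrs`, and `kMaxMemInstrCount` are hypothetical stand-ins for the LLVM types and the `MaxMemInstrCount` option, and the strict `>` comparison is an assumption taken from the "> 100" wording. It needs C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-ins for IR instructions; these are not LLVM classes.
enum class Opcode { Load, Store, Other };
struct Instr {
  Opcode op;
};

// Assumed limit, mirroring the value of 100 quoted in the commit message.
constexpr std::size_t kMaxMemInstrCount = 100;

// Collects the positions of all loads and stores in a loop body.
// Returns std::nullopt as soon as the count exceeds `limit`, so the caller
// can bail out before building any dependency-matrix entries.
std::optional<std::vector<std::size_t>>
collectMemInstrs(const std::vector<Instr> &loopBody,
                 std::size_t limit = kMaxMemInstrCount) {
  std::vector<std::size_t> memInstrs;
  for (std::size_t i = 0; i < loopBody.size(); ++i) {
    if (loopBody[i].op == Opcode::Other)
      continue; // Only loads and stores count toward the limit.
    memInstrs.push_back(i);
    if (memInstrs.size() > limit)
      return std::nullopt; // Too many memory instructions: give up early.
  }
  return memInstrs;
}
```

A caller would check the result before doing any pairwise dependence analysis and skip the loop nest when it is empty. Counting instructions is cheaper than counting pairs or matrix entries, since it needs only a single pass over the loop body.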
Diffstat (limited to 'flang/lib/Frontend/CompilerInvocation.cpp')
0 files changed, 0 insertions, 0 deletions