author: Tobias Grosser <tobias@grosser.es> 2015-10-20 09:12:21 +0000
committer: Tobias Grosser <tobias@grosser.es> 2015-10-20 09:12:21 +0000
commit: ca7f5bb7674faef830dcb668dcd91d17f51f01b4
tree: 4189e96155e561ba6a84ae36d4b582315981dac7 /clang/lib/Frontend/CompilerInvocation.cpp
parent: 648a2c37fbcfbd0fc13128887131f72a2625819d
Full/partial tile separation for vectorization

We isolate full tiles from partial tiles so that, for example, loops with parametric lower and/or upper bounds can be vectorized. With -polly-vectorizer=stripmine, we see execution-time improvements:

  correlation from 1m7361s to 0m5720s (-67.05 %)
  covariance  from 1m5561s to 0m5680s (-63.50 %)
  ary3        from 2m3201s to 1m2361s (-46.72 %)
  CrystalMk   from 8m5565s to 7m4285s (-13.18 %)

The current full/partial tile separation increases compile time more than necessary. As a result, we see compile-time regressions, for example, for 3mm from 0m6320s to 0m9881s (56.34 %). Some of this increase is expected, as we generate more IR and consequently more time is spent in the LLVM backends. However, a first investigation has shown that a larger portion of the compile time is unnecessarily spent inside Polly's parallelism detection and could be eliminated by propagating existing knowledge about vector loop parallelism. Before enabling -polly-vectorizer=stripmine by default, it is necessary to address this compile-time issue.

Contributed-by: Roman Gareev <gareevroman@gmail.com>
Reviewers: jdoerfert, grosser
Subscribers: grosser, #polly

Differential Revision: http://reviews.llvm.org/D13779

llvm-svn: 250809
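
To illustrate the idea of separating full tiles from partial tiles, here is a minimal hand-written C++ sketch. It is not Polly's generated code or its actual tiling scheme: the names kTile, scaleReference and scaleSeparated are hypothetical, the tile size of 4 is an assumption, and tiles are anchored at the lower bound for simplicity. The point it shows is that, once full tiles are isolated, the inner loop has a constant trip count even though lb and ub are only known at run time, which is the shape a vectorizer can handle easily.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile size, standing in for the vector width.
constexpr long kTile = 4;

// Reference version: a single loop with parametric bounds [lb, ub).
void scaleReference(std::vector<float> &a, long lb, long ub, float f) {
  for (long i = lb; i < ub; ++i)
    a[static_cast<std::size_t>(i)] *= f;
}

// Strip-mined version with full tiles isolated from the partial tile.
void scaleSeparated(std::vector<float> &a, long lb, long ub, float f) {
  if (ub <= lb)
    return; // empty iteration space

  // End of the region covered by full tiles of exactly kTile iterations.
  const long fullEnd = lb + ((ub - lb) / kTile) * kTile;

  // Full tiles: the inner loop always runs kTile times, so it can be
  // turned into vector code without any bounds checks.
  for (long t = lb; t < fullEnd; t += kTile)
    for (long v = 0; v < kTile; ++v)
      a[static_cast<std::size_t>(t + v)] *= f;

  // Partial tile: fewer than kTile leftover iterations, kept scalar.
  for (long i = fullEnd; i < ub; ++i)
    a[static_cast<std::size_t>(i)] *= f;
}
```

Both functions compute the same result; the separated version trades extra code (which is consistent with the additional IR mentioned above) for an inner loop with a fixed trip count.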
Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')
0 files changed, 0 insertions, 0 deletions