author | Roman Gareev <gareevroman@gmail.com> | 2016-12-21 12:37:36 +0000 |
---|---|---|
committer | Roman Gareev <gareevroman@gmail.com> | 2016-12-21 12:37:36 +0000 |
commit | bd5c6039c69482fcfa35a5861c11edc0c5b6f032 (patch) | |
tree | 199237ba6c8c1c7b2ca96aa8d83adbea377ac8e7 /llvm/lib/Bitcode/Reader/BitcodeReader.cpp | |
parent | 7116dc908cf7c6b2be04d327d5e69d12fb0c0a46 (diff) | |
Align newly created arrays to the first level cache line boundary
Aligning data to cache-line boundaries helps to avoid overhead when
accessing it ([1]). This patch aligns newly created arrays and adds an
option to specify the first-level cache line size. By default we use 64 bytes,
which is a typical cache line size ([2]).
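To illustrate what cache-line alignment means for an array, here is a minimal
user-level sketch in C. It is not the code from this patch, which changes how
Polly creates arrays; the helper names alloc_cache_aligned and is_cache_aligned
and the macro FIRST_LEVEL_CACHE_LINE_SIZE are hypothetical, and the 64-byte
value is only the default mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed default: the 64-byte line size named in the commit message. */
#define FIRST_LEVEL_CACHE_LINE_SIZE 64

/* Returns nonzero if p starts on a multiple of line_size bytes.
   line_size must be nonzero. */
static int is_cache_aligned(const void *p, size_t line_size) {
    return ((uintptr_t)p % line_size) == 0;
}

/* Hypothetical helper: allocate at least n_bytes starting on a cache-line
   boundary. line_size must be a power of two; returns NULL otherwise or on
   allocation failure. Release the result with free(). */
static void *alloc_cache_aligned(size_t n_bytes, size_t line_size) {
    if (line_size == 0 || (line_size & (line_size - 1)) != 0)
        return NULL;
    if (n_bytes > SIZE_MAX - (line_size - 1))
        return NULL; /* rounding up would overflow */

    /* C11 aligned_alloc expects the size to be a multiple of the alignment,
       so round the request up to whole cache lines. */
    size_t rounded = (n_bytes + line_size - 1) / line_size * line_size;
    if (rounded == 0)
        rounded = line_size;

    void *p = aligned_alloc(line_size, rounded);
    assert(p == NULL || is_cache_aligned(p, line_size));
    return p;
}
```

A caller would request, for example,
alloc_cache_aligned(n * sizeof(double), FIRST_LEVEL_CACHE_LINE_SIZE), so that
the first element of the array starts on a cache-line boundary.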
On an Intel Core i7-3820 (Sandy Bridge) with the following options,
    clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME
    -march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true
    -DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8
    -mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm
    -polly-target-latency-vector-fma=8
it improves the performance from 11.303 GFlops/sec (39.247% of
theoretical peak) to 12.63 GFlops/sec (43.8542% of theoretical peak).
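The percentages above can be checked with a few lines of C. The sketch assumes
a theoretical peak of 28.8 GFlops/sec; that value is not stated in this
message and is only inferred from the reported figures (11.303 / 0.39247 is
about 28.8). The function names are hypothetical.

```c
#include <assert.h>

/* Percentage of a theoretical peak reached by a measured rate. */
static double percent_of_peak(double measured_gflops, double peak_gflops) {
    assert(peak_gflops > 0.0);
    return 100.0 * measured_gflops / peak_gflops;
}

/* Relative improvement of `after` over `before`, in percent. */
static double percent_gain(double before, double after) {
    assert(before > 0.0);
    return 100.0 * (after - before) / before;
}
```

Under that assumption, percent_of_peak(11.303, 28.8) and
percent_of_peak(12.63, 28.8) give the two percentages quoted above, and
percent_gain(11.303, 12.63) gives the relative speedup, a little under 12%.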
Refs.:
[1] - http://www.alexonlinux.com/aligned-vs-unaligned-memory-access
[2] - http://igoro.com/archive/gallery-of-processor-cache-effects/
Differential Revision: https://reviews.llvm.org/D28020
Reviewed-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 290253