author: Roman Gareev <gareevroman@gmail.com> 2016-12-21 12:37:36 +0000
committer: Roman Gareev <gareevroman@gmail.com> 2016-12-21 12:37:36 +0000
commit: bd5c6039c69482fcfa35a5861c11edc0c5b6f032
tree: 199237ba6c8c1c7b2ca96aa8d83adbea377ac8e7 /llvm/lib/Bitcode/Reader/BitcodeReader.cpp
parent: 7116dc908cf7c6b2be04d327d5e69d12fb0c0a46
Align newly created arrays to the first level cache line boundary

Aligning data to cache line boundaries helps avoid the overhead associated with accessing it ([1]). This patch aligns newly created arrays and adds an option to specify the first-level cache line size. By default we use 64 bytes, which is a typical cache line size ([2]).

On an Intel Core i7-3820 (Sandy Bridge) with the following options,

    clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME -march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true -DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8 -mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm -polly-target-latency-vector-fma=8

it improves performance from 11.303 GFlops/sec (39.247% of theoretical peak) to 12.63 GFlops/sec (43.8542% of theoretical peak).

Refs.:
[1] - http://www.alexonlinux.com/aligned-vs-unaligned-memory-access
[2] - http://igoro.com/archive/gallery-of-processor-cache-effects/

Differential Revision: https://reviews.llvm.org/D28020

Reviewed-by: Tobias Grosser <tobias@grosser.es>

llvm-svn: 290253
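
For readers unfamiliar with the idea, the following is a minimal standalone sketch of what "aligned to the first-level cache line boundary" means for a heap buffer. It is not the code from this patch and does not use any Polly or LLVM API; the names `kFirstLevelCacheLineSize`, `roundUpToAlignment`, `allocateCacheLineAligned`, and `isAligned` are hypothetical, and the 64-byte default is simply the value quoted in the message above. It assumes a C++17 standard library that provides `std::aligned_alloc`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical default, taken from the 64-byte figure in the commit message.
constexpr std::size_t kFirstLevelCacheLineSize = 64;

// Round Bytes up to the next multiple of Alignment.
// Alignment is assumed to be a nonzero power of two.
constexpr std::size_t roundUpToAlignment(std::size_t Bytes,
                                         std::size_t Alignment) {
  return (Bytes + Alignment - 1) & ~(Alignment - 1);
}

// Allocate at least Bytes bytes whose start address lies on a cache line
// boundary. std::aligned_alloc expects the size to be a multiple of the
// alignment, so the request is rounded up first.
// The caller releases the buffer with std::free.
inline void *allocateCacheLineAligned(
    std::size_t Bytes, std::size_t Alignment = kFirstLevelCacheLineSize) {
  assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0 &&
         "alignment must be a power of two");
  return std::aligned_alloc(Alignment, roundUpToAlignment(Bytes, Alignment));
}

// Check whether Ptr starts on a multiple of Alignment.
inline bool isAligned(const void *Ptr,
                      std::size_t Alignment = kFirstLevelCacheLineSize) {
  return reinterpret_cast<std::uintptr_t>(Ptr) % Alignment == 0;
}
```

With a buffer obtained this way, the first element always starts a cache line, so a row of eight doubles occupies exactly one 64-byte line instead of straddling two.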
Diffstat (limited to 'llvm/lib/Bitcode/Reader/BitcodeReader.cpp')
0 files changed, 0 insertions, 0 deletions