path: root/clang/unittests/InstallAPI/HeaderFileTest.cpp
author: Amir Ayupov <aaupov@fb.com> 2025-10-01 15:25:34 -0700
committer: GitHub <noreply@github.com> 2025-10-01 15:25:34 -0700
commit: 780f69cd922d8925648e11e771e77f0b46190e5b
tree: b9e9b20576e3aeade74d615e102ea0d316f92578 /clang/unittests/InstallAPI/HeaderFileTest.cpp
parent: 1e4d4bb584a1c35c5f7801c68b9dfccd6130caab
[Clang][CMake] Add CSSPGO support to LLVM_BUILD_INSTRUMENTED (#79942)
Build on Clang-BOLT infrastructure to collect sample profile for CSSPGO.
Add CSSPGO.cmake and BOLT-CSSPGO.cmake to automate CSSPGO/+BOLT Clang builds.

Note that `CLANG_PGO_TRAINING_DATA_SOURCE_DIR` is required as built-in training set is inadequate for collecting sampled profile.

Hardware compatibility: CSSPGO requires synchronized (0-skid) call and branch stacks, which is only available with Intel PEBS (Sandy Bridge+), AMD Zen3 with BRS, Zen4 with LBRv2+LBR_PMC_FREEZE, and Zen5 with LBRv2. This patch adds support for Intel `br_inst_retired.near_taken:uppp` event.

Test Plan:
Added BOLT-CSSPGO.cmake with same use as BOLT-PGO.cmake, e.g. for bootstrapped ThinLTO+CSSPGO+BOLT, with CSSPGO profile collected from LLVM build, and BOLT profile collected from Hello World (instrumentation):

```
cmake -B clang-csspgo-bolt -S /path/to/llvm-project/llvm \
  -DLLVM_ENABLE_LLD=ON -DBOOTSTRAP_LLVM_ENABLE_LLD=ON \
  -DBOOTSTRAP_BOOTSTRAP_LLVM_ENABLE_LLD=ON \
  -DPGO_INSTRUMENT_LTO=Thin \
  -DBOOTSTRAP_CLANG_PGO_TRAINING_DATA_SOURCE_DIR=/path/to/llvm-project/llvm \
  -GNinja -C /path/to/llvm-project/clang/cmake/caches/BOLT-CSSPGO.cmake
ninja stage2-clang-bolt
...
warning: Sample PGO is estimated to optimize better with 19.5x more samples. Please consider increasing sampling rate or profiling for longer duration to get more samples.
...
[2800/2801] Optimizing Clang with BOLT
BOLT-INFO: 8189 out of 106942 functions in the binary (7.7%) have non-empty execution profile
13776393 : taken branches (-42.1%)
```

Performance testing with Clang:
- Setup: Clang-BOLT testing harness https://github.com/aaupov/llvm-devmtg-2022/commit/9f2b46f67a1930a51c58a0e4894637a8c64c570e
  - CSSPGO training: building LLVM,
  - InstrPGO training: building Hello World,
  - BOLT training: building Hello World, instrumentation,
  - benchmark: building small LLVM tool (not),
  - 2S Intel SKX Xeon 6138 with 40C/80T and 256GB RAM, using 20C/40T for build,
- Results, wall time, lower is better
  - Baseline (bootstrapped build): 10.36s,
  - InstrPGO + ThinLTO: 9.34s,
  - CSSPGO + ThinLTO: 8.85s.
- BOLT results, for reference:
  - Baseline: 9.09s,
  - InstrPGO + ThinLTO: 9.09s,
  - CSSPGO + ThinLTO: 8.58s.

---------

Co-authored-by: Matthias Braun <matze@braunis.de>
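
For readers who want the reported wall times as relative improvements, the arithmetic can be sketched as below. This is only an illustration and not part of the patch: the helper name `reduction_pct` and the dictionary layout are hypothetical, while the timings are copied from the results listed in the commit message.

```python
def reduction_pct(baseline_s: float, optimized_s: float) -> float:
    """Percent reduction in wall time relative to the baseline (higher is better)."""
    return (baseline_s - optimized_s) / baseline_s * 100.0


# Wall times in seconds, copied from the commit message.
# Each group maps a configuration name to its measured build time.
results = {
    "without BOLT": {
        "Baseline (bootstrapped build)": 10.36,
        "InstrPGO + ThinLTO": 9.34,
        "CSSPGO + ThinLTO": 8.85,
    },
    "with BOLT": {
        "Baseline": 9.09,
        "InstrPGO + ThinLTO": 9.09,
        "CSSPGO + ThinLTO": 8.58,
    },
}

for group, timings in results.items():
    # The first entry of each group is that group's baseline.
    baseline = next(iter(timings.values()))
    print(group)
    for config, seconds in timings.items():
        print(f"  {config}: {seconds:.2f}s "
              f"({reduction_pct(baseline, seconds):.1f}% less wall time)")
```

Each group is compared against its own baseline, so the BOLT figures are relative to the 9.09s BOLT baseline rather than the 10.36s bootstrapped build.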
Diffstat (limited to 'clang/unittests/InstallAPI/HeaderFileTest.cpp')
0 files changed, 0 insertions, 0 deletions