path: root/llvm/lib/CodeGen/MachineBlockPlacement.cpp
author: Mingming Liu <mingmingl@google.com> 2024-10-02 10:23:54 -0700
committer: GitHub <noreply@github.com> 2024-10-02 10:23:54 -0700
commit: 34f0edd50992e6d18c80dd901caf1e8220be673b
tree: f078c59704f602b2e426b3d2060dab1ae1672d85 /llvm/lib/CodeGen/MachineBlockPlacement.cpp
parent: 694fd1f297feaf59cd29a3d17e63ee2f6514dd16
[TypeProf][PGO] Support skipping vtable comparisons for a class and its derived ones (#110575)

Performance-critical core libraries can be highly optimized for architecture or microarchitecture features. For instance, the absl crc library specializes different templated classes for different hardware [1]. In a practical setting, it's likely that instrumented profiles are collected on one type of machine and used to optimize binaries that run on multiple types of hardware.

While this kind of specialization is rare in terms of lines of code, the compiler can do a better job by skipping vtable-based indirect call promotion (ICP) for such classes.

* The per-class `Extend` implementation is arch-specific as well. If an instrumented profile is collected on one arch and applied to another arch where the `Extend` implementation is different, `Extend` might be regarded as an unlikely function in the latter case. The `ABSL_ATTRIBUTE_HOT` annotation alleviates the problem by putting all `Extend` implementations into the hot text section [2].

This change introduces a comma-separated list to specify the mangled vtable names. The ICP pass will skip vtable-based comparison if a vtable variable definition is shown to be in the class hierarchy of a listed vtable (per LLVM type metadata).

[1] https://github.com/abseil/abseil-cpp/blob/c6b27359c3d27438b1313dddd7598914c1274a50/absl/crc/internal/crc_x86_arm_combined.cc#L621-L650
[2] https://github.com/abseil/abseil-cpp/blame/c6b27359c3d27438b1313dddd7598914c1274a50/absl/crc/internal/crc_x86_arm_combined.cc#L370C3-L370C21

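
The skip decision described in the commit message can be illustrated with a small standalone sketch. This is not the code from the patch: `parseIgnoredList`, `VTableInfo`, and `shouldSkipVTableComparison` are hypothetical names, and the `hierarchy` vector is an assumed stand-in for what the pass would derive from LLVM type metadata. The mangled names used below are placeholders.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: split a comma-separated option value into a set of
// mangled vtable names. Empty entries (e.g. from a trailing comma) are dropped.
std::set<std::string> parseIgnoredList(const std::string &csv) {
  std::set<std::string> names;
  std::string::size_type start = 0;
  while (start <= csv.size()) {
    std::string::size_type end = csv.find(',', start);
    if (end == std::string::npos)
      end = csv.size();
    if (end > start)
      names.insert(csv.substr(start, end - start));
    start = end + 1;
  }
  return names;
}

// Hypothetical stand-in for a vtable definition plus its class hierarchy.
// In this sketch the hierarchy is given directly as a list of mangled vtable
// names (the vtable itself and its base classes); this is an assumption made
// for illustration, not how the pass stores it.
struct VTableInfo {
  std::string name;
  std::vector<std::string> hierarchy;
};

// Returns true when any entry in the vtable's hierarchy appears in the
// ignored set, meaning vtable-based comparison would be skipped for it.
bool shouldSkipVTableComparison(const VTableInfo &vtable,
                                const std::set<std::string> &ignored) {
  for (const std::string &entry : vtable.hierarchy)
    if (ignored.count(entry) != 0)
      return true;
  return false;
}
```

With an ignored list of `_ZTV4Base`, a vtable `_ZTV7Derived` whose hierarchy contains `_ZTV4Base` would be skipped, while an unrelated `_ZTV5Other` would not. Listing only the base class is enough to cover its derived classes, which matches the "a class and its derived ones" wording in the commit title.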
Diffstat (limited to 'llvm/lib/CodeGen/MachineBlockPlacement.cpp')
0 files changed, 0 insertions, 0 deletions