rocket-tools/riscv-gnu-toolchain/llvm.git

author	Joseph Huber <huberjn@outlook.com>	2025-07-26 22:31:50 -0500
committer	Joseph Huber <huberjn@outlook.com>	2025-07-28 09:23:29 -0500
commit	9975dfdf800d9881b704a988bc004ec81639fe67 (patch)
tree	e2a2a661a6695e7b886d60f3f7899bbc5c8a764a /clang/lib/Frontend/CompilerInvocation.cpp
parent	166493d6927026c4933be82de81adabc9751c0e3 (diff)
download	llvm-9975dfdf800d9881b704a988bc004ec81639fe67.zip llvm-9975dfdf800d9881b704a988bc004ec81639fe67.tar.gz llvm-9975dfdf800d9881b704a988bc004ec81639fe67.tar.bz2

[libc] Small performance improvements to GPU allocator

Summary: This slightly increases performance in a few places. First, we optimistically assume the cached slab has ample space, which lets us skip the atomic load on the highly contended counter when the allocation is likely to succeed. Second, we no longer call `match_any` twice, as we can calculate the uniform slabs at the moment we return them. Third, we always choose a random index on a 32-bit boundary. This means that in the fast case we fulfil the allocation with a single `fetch_or`, and otherwise we quickly move to the free bit. This nets around a 7.75% improvement for the fast path case.
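
The third point can be illustrated with a minimal host-side sketch. This is not the code from the patch: `Bitfield`, `allocate_bit`, `lowest_set_bit`, `NUM_WORDS`, and `FULL` are hypothetical names chosen for illustration, and the real allocator runs on the GPU with its own atomics and slab layout. The sketch only assumes a bitfield of 32-bit words in which a set bit marks a used slot.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical sizes for illustration; the real slab layout differs.
constexpr uint32_t BITS_IN_WORD = 32;
constexpr uint32_t NUM_WORDS = 8;
constexpr uint32_t FULL = UINT32_MAX;

// One bit per slot; a set bit means the slot is in use.
struct Bitfield {
  std::atomic<uint32_t> words[NUM_WORDS] = {};
};

// Index of the lowest set bit. Precondition: x != 0.
inline uint32_t lowest_set_bit(uint32_t x) {
  assert(x != 0 && "no bit is set");
  uint32_t index = 0;
  while (!(x & 1u)) {
    x >>= 1;
    ++index;
  }
  return index;
}

// Claims one free bit and returns its global index, or FULL if none is left.
// `random` stands in for whatever random source the caller uses.
inline uint32_t allocate_bit(Bitfield &bf, uint32_t random) {
  // Always start on a 32-bit boundary, i.e. at bit 0 of a random word.
  uint32_t first_word = random % NUM_WORDS;
  for (uint32_t probe = 0; probe < NUM_WORDS; ++probe) {
    uint32_t word = (first_word + probe) % NUM_WORDS;
    uint32_t bit = 0;
    for (;;) {
      uint32_t mask = 1u << bit;
      // Fast case: a single fetch_or both tests and claims the bit.
      uint32_t before =
          bf.words[word].fetch_or(mask, std::memory_order_relaxed);
      if (!(before & mask))
        return word * BITS_IN_WORD + bit;
      // The bit was taken. The value returned by fetch_or is a snapshot of
      // the whole word, so jump directly to a bit that was free in it.
      uint32_t free_bits = ~before;
      if (free_bits == 0)
        break; // This word is full; try the next one.
      bit = lowest_set_bit(free_bits);
    }
  }
  return FULL;
}
```

With an empty `Bitfield`, `allocate_bit(bf, 3)` claims bit 0 of word 3 with one `fetch_or`. A second call with the same value finds that bit taken and uses the snapshot to move straight to bit 1, without scanning the word bit by bit.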

Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')

0 files changed, 0 insertions, 0 deletions

