author | Joseph Huber <huberjn@outlook.com> | 2025-07-28 11:05:36 -0500 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-07-28 11:05:36 -0500 |
commit | b2322772f2ab97de60db906a591353a5ef77cdfe (patch) | |
tree | 9fb4045205877747854312aeb5336a328082b35e /clang/lib/CodeGen/CodeGenModule.cpp | |
parent | 701de35f67201cb39cf22bf3835c345e55014f3c (diff) | |
[libc] Reduce reference counter to a 32-bit integer (#150961)
Summary:
This reference counter tracks how many threads are using a given slab.
It is currently a 64-bit integer; this patch reduces it to a 32-bit
integer. The benefit is that we save a few registers, since these
operations no longer need two registers each. This increases the risk
of overflow, but the largest value we accept for a single slab is
~131,000, which is a long way from the 32-bit maximum of roughly four
billion. We can still oversubscribe the reference count by having
threads attempt to claim the lock and then try to free it, but I assert
that it is exceedingly unlikely that we will somehow have over four
billion GPU threads stalled in the same place.
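The headroom argument above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the "~131,000" figure is exactly 2^17, and the names `MAX_USERS_PER_SLAB`, `COUNTER_MAX`, and `headroom_factor` are hypothetical, not identifiers from the patch.

```cpp
#include <cstdint>
#include <limits>

// Assumption: "~131,000" in the message is taken to mean 2^17 = 131,072.
constexpr uint64_t MAX_USERS_PER_SLAB = uint64_t(1) << 17;

// Largest value a 32-bit unsigned reference counter can hold.
constexpr uint64_t COUNTER_MAX = std::numeric_limits<uint32_t>::max();

// How many times the largest legitimate count fits into the counter's range.
constexpr uint64_t headroom_factor() {
  return COUNTER_MAX / MAX_USERS_PER_SLAB;
}

// The counter would have to be oversubscribed tens of thousands of times
// over before a 32-bit value could wrap.
static_assert(headroom_factor() > 30000, "32-bit counter has ample headroom");
```

Under that assumption, legitimate use reaches well under a ten-thousandth of the counter's range, so only extreme oversubscription by stalled threads could cause a wrap.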
A later optimization could split the reference counters and pointers
into a struct of arrays, which would save 128 KiB of static memory (we
currently use 512 KiB for the slab array).
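The 128 KiB figure follows from alignment padding. The sketch below reproduces the arithmetic under stated assumptions: 64-bit pointers, and a slot count of 32,768 inferred from 512 KiB divided by 16 bytes per entry. The type and constant names (`SlotAoS`, `SlotsSoA`, `NUM_SLOTS`) are hypothetical and do not come from the patch.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 512 KiB / 16 bytes per entry = 32,768 slots.
constexpr size_t NUM_SLOTS = 32768;
constexpr size_t KiB = 1024;

// Array-of-structs entry: an 8-byte pointer plus a 4-byte counter is padded
// to 16 bytes so that the pointer in the next entry stays 8-byte aligned.
struct SlotAoS {
  void *slab;
  uint32_t ref_count;
};

// Struct-of-arrays layout: each array is packed tightly, so no padding.
struct SlotsSoA {
  void *slabs[NUM_SLOTS];
  uint32_t ref_counts[NUM_SLOTS];
};

static_assert(sizeof(void *) == 8, "sketch assumes 64-bit pointers");

constexpr size_t aos_bytes() { return sizeof(SlotAoS) * NUM_SLOTS; }
constexpr size_t soa_bytes() { return sizeof(SlotsSoA); }
constexpr size_t saved_bytes() { return aos_bytes() - soa_bytes(); }
```

With these assumptions the array-of-structs form occupies 512 KiB and the struct-of-arrays form 384 KiB, which matches the 128 KiB saving quoted in the message.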