author Joseph Huber <huberjn@outlook.com> 2024-03-12 10:40:35 -0500
committer GitHub <noreply@github.com> 2024-03-12 10:40:35 -0500
commit c167a2588737613558bd7be4c9280603e89281ac
tree 94e61ea59d0ae06ca85da7f68e8000045b804d3d /libc
parent 392436383a52bc5e188bd28bec5bc71b3cb5384a
[libc] Fix lane-id utility function not using built-in (#84902)
Summary: Previously we computed the lane ID by taking the global thread ID and masking it down to its bottom 5 bits. This works, but it is less efficient than the NVPTX intrinsic dedicated to reading this value.
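For illustration only, the old computation can be sketched on the host as below. This is not the libc code: `LANE_SIZE` and `lane_id_from_thread_id` are hypothetical stand-ins for `get_lane_size()` and the removed expression, and the sketch assumes that consecutive thread IDs are packed into warps in order, which is what makes the mask agree with the hardware lane register.

```cpp
#include <cstdint>

// Hypothetical stand-in for get_lane_size(): an NVPTX warp has 32 lanes.
constexpr uint32_t LANE_SIZE = 32;

// Hypothetical host-side model of the removed expression
//   get_thread_id() & (get_lane_size() - 1)
// The mask 31 (0b11111) keeps only the bottom 5 bits of the thread ID.
constexpr uint32_t lane_id_from_thread_id(uint32_t thread_id) {
  return thread_id & (LANE_SIZE - 1);
}

// Thread IDs 0..31 map to lanes 0..31; the pattern repeats every 32 threads.
static_assert(lane_id_from_thread_id(0) == 0, "first lane of the first warp");
static_assert(lane_id_from_thread_id(31) == 31, "last lane of the first warp");
static_assert(lane_id_from_thread_id(32) == 0, "first lane of the second warp");
static_assert(lane_id_from_thread_id(37) == 5, "37 = 32 + 5");
```

On the device, the patched function reads the same value directly through `__nvvm_read_ptx_sreg_laneid()`, so it no longer needs to compute the thread ID first.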
Diffstat (limited to 'libc')
-rw-r--r-- libc/src/__support/GPU/nvptx/utils.h | 2
1 file changed, 1 insertion, 1 deletion
diff --git a/libc/src/__support/GPU/nvptx/utils.h b/libc/src/__support/GPU/nvptx/utils.h
index a92c884..fe9da4e 100644
--- a/libc/src/__support/GPU/nvptx/utils.h
+++ b/libc/src/__support/GPU/nvptx/utils.h
@@ -97,7 +97,7 @@ LIBC_INLINE uint32_t get_lane_size() { return 32; }
/// Returns the id of the thread inside of a CUDA warp executing together.
[[clang::convergent]] LIBC_INLINE uint32_t get_lane_id() {
- return get_thread_id() & (get_lane_size() - 1);
+ return __nvvm_read_ptx_sreg_laneid();
}
/// Returns the bit-mask of active threads in the current warp.