[libc] Enable multiple threads to use RPC on the GPU

The execution model of the GPU expects that groups of threads will execute in lock-step in SIMD fashion. It is important for both performance and correctness that we treat such a group as the smallest possible granularity for an RPC operation. Thus, we map multiple threads to a single larger buffer and ship that across the wire.

This patch makes the necessary changes to support executing the RPC on the GPU with multiple threads. This requires some workarounds to mimic the model when handling the protocol from the CPU. I'm not completely happy with some of the workarounds required, but I think it should work.

Uses some of the implementation details from D148191.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D148943
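
As a rough illustration of the "one larger buffer per group of threads" idea described in the message, here is a minimal host-only sketch. It is not the code from this patch: `Slot`, `Packet`, `client_fill`, `server_for_each_lane`, and the lane mask are all hypothetical names and layout choices made for illustration, and the real RPC implementation in libc may be structured quite differently.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-lane payload; the size is an assumption for this sketch.
struct Slot {
  uint64_t data[8];
};

// One packet carries a slot for every lane in the SIMD group, plus a mask
// recording which lanes actually took part in the call (assumed design).
struct Packet {
  uint64_t lane_mask;
  std::vector<Slot> slots;
  explicit Packet(uint32_t lane_size) : lane_mask(0), slots(lane_size) {}
};

// Client side: each participating lane writes only to its own slot.
inline void client_fill(Packet &p, uint32_t lane_id, uint64_t value) {
  p.slots[lane_id].data[0] = value;
  p.lane_mask |= uint64_t(1) << lane_id;
}

// Server side (CPU): there is no lock-step group here, so a plain loop over
// the lanes stands in for it, skipping lanes that did not participate.
template <typename F>
inline void server_for_each_lane(Packet &p, F fn) {
  for (uint32_t id = 0; id < p.slots.size(); ++id)
    if (p.lane_mask & (uint64_t(1) << id))
      fn(p.slots[id], id);
}
```

In this sketch the packet would be sized with the value the startup code passes to `rpc::client.reset`, i.e. `gpu::get_lane_size()`, so that a whole group can be shipped in one operation; the CPU-side loop is one possible way to mimic the group when handling the protocol.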
author: Joseph Huber <jhuber6@vols.utk.edu> 2023-05-04 14:53:28 -0500
committer: Joseph Huber <jhuber6@vols.utk.edu> 2023-05-04 19:31:41 -0500
commit: 507edb52f9a9a5c1ab2a92ec2e291a7b63c3fbff (patch)
tree: dcd9f8ef610af4a60ead26e721c5d3aead79777b /libc/startup
parent: fe9f557578a565ed01faf75cd07ea4d9b75feeb1 (diff)
2 files changed, 2 insertions, 2 deletions
diff --git a/libc/startup/gpu/amdgpu/start.cpp b/libc/startup/gpu/amdgpu/start.cpp
index ab83ea5..b28ad79 100644
--- a/libc/startup/gpu/amdgpu/start.cpp
+++ b/libc/startup/gpu/amdgpu/start.cpp
@@ -52,7 +52,7 @@ void initialize(int argc, char **argv, char **env, void *in, void *out,
   if (gpu::get_thread_id() == 0 && gpu::get_block_id() == 0) {
     // We need to set up the RPC client first in case any of the constructors
     // require it.
-    rpc::client.reset(&lock, in, out, buffer);
+    rpc::client.reset(gpu::get_lane_size(), &lock, in, out, buffer);
 
     // We want the fini array callbacks to be run after other atexit
     // callbacks are run. So, we register them before running the init
diff --git a/libc/startup/gpu/nvptx/start.cpp b/libc/startup/gpu/nvptx/start.cpp
index fe09666..9ed7559 100644
--- a/libc/startup/gpu/nvptx/start.cpp
+++ b/libc/startup/gpu/nvptx/start.cpp
@@ -57,7 +57,7 @@ void initialize(int argc, char **argv, char **env, void *in, void *out,
   if (gpu::get_thread_id() == 0 && gpu::get_block_id() == 0) {
     // We need to set up the RPC client first in case any of the constructors
     // require it.
-    rpc::client.reset(&lock, in, out, buffer);
+    rpc::client.reset(gpu::get_lane_size(), &lock, in, out, buffer);
 
     // We want the fini array callbacks to be run after other atexit
     // callbacks are run. So, we register them before running the init