author | Joseph Huber <jhuber6@vols.utk.edu> | 2023-05-04 14:53:28 -0500 |
---|---|---|
committer | Joseph Huber <jhuber6@vols.utk.edu> | 2023-05-04 19:31:41 -0500 |
commit | 507edb52f9a9a5c1ab2a92ec2e291a7b63c3fbff (patch) | |
tree | dcd9f8ef610af4a60ead26e721c5d3aead79777b /libc/startup | |
parent | fe9f557578a565ed01faf75cd07ea4d9b75feeb1 (diff) | |
[libc] Enable multiple threads to use RPC on the GPU
The execution model of the GPU expects that groups of threads will
execute in lock-step in SIMD fashion. It is important for both
performance and correctness that we treat such a group as the smallest
possible granularity for an RPC operation. Thus, we map the threads of a
group to a single larger buffer and ship that across the wire.
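
The buffer mapping described above can be sketched as follows. This is a minimal illustration, not the libc RPC implementation: `Packet`, `Slot`, `kSlotBytes`, `lane_write`, and `host_sum` are hypothetical names, the 64-byte slot size is an assumption, and the lanes are simulated on the host rather than run on a GPU. The idea it shows is that one packet holds a slot per lane (for example 32 or 64 lanes), each lane touches only its own slot, and the CPU side, which has no lock-step lanes, walks the slots serially.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; not the LLVM libc RPC types.
constexpr std::size_t kSlotBytes = 64; // assumed per-lane payload size

// One fixed-size payload slot per lane.
struct Slot {
  std::array<std::uint8_t, kSlotBytes> data{};
};

// A packet carries one slot for every lane in the group, plus a mask
// recording which lanes actually took part in the call.
struct Packet {
  std::uint64_t lane_mask = 0;
  std::vector<Slot> slots;

  explicit Packet(std::uint32_t lane_size) : slots(lane_size) {
    assert(lane_size > 0 && lane_size <= 64); // the mask has 64 bits
  }
};

// Device side (simulated): an active lane writes only to its own slot
// and marks itself in the mask.
inline void lane_write(Packet &p, std::uint32_t lane_id, std::uint8_t value) {
  assert(lane_id < p.slots.size());
  p.slots[lane_id].data[0] = value;
  p.lane_mask |= std::uint64_t{1} << lane_id;
}

// Host side: visit the slots one at a time, skipping inactive lanes,
// and add up the first byte of each active slot.
inline std::uint32_t host_sum(const Packet &p) {
  std::uint32_t sum = 0;
  for (std::size_t lane = 0; lane < p.slots.size(); ++lane)
    if (p.lane_mask & (std::uint64_t{1} << lane))
      sum += p.slots[lane].data[0];
  return sum;
}
```

Sizing the packet by the lane count at construction mirrors the role of the lane-size argument that this patch passes to `rpc::client.reset`; how the real client lays out its buffer is not shown in this diff.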
This patch makes the necessary changes to support executing the RPC on
the GPU with multiple threads. This requires some workarounds to mimic
the model when handling the protocol from the CPU. I'm not completely
happy with some of the workarounds required, but I think it should work.
Uses some of the implementation details from D148191.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D148943
Diffstat (limited to 'libc/startup')
-rw-r--r-- | libc/startup/gpu/amdgpu/start.cpp | 2 |
-rw-r--r-- | libc/startup/gpu/nvptx/start.cpp | 2 |
2 files changed, 2 insertions, 2 deletions
diff --git a/libc/startup/gpu/amdgpu/start.cpp b/libc/startup/gpu/amdgpu/start.cpp
index ab83ea5..b28ad79 100644
--- a/libc/startup/gpu/amdgpu/start.cpp
+++ b/libc/startup/gpu/amdgpu/start.cpp
@@ -52,7 +52,7 @@ void initialize(int argc, char **argv, char **env, void *in, void *out,
   if (gpu::get_thread_id() == 0 && gpu::get_block_id() == 0) {
     // We need to set up the RPC client first in case any of the constructors
     // require it.
-    rpc::client.reset(&lock, in, out, buffer);
+    rpc::client.reset(gpu::get_lane_size(), &lock, in, out, buffer);
 
     // We want the fini array callbacks to be run after other atexit
     // callbacks are run. So, we register them before running the init
diff --git a/libc/startup/gpu/nvptx/start.cpp b/libc/startup/gpu/nvptx/start.cpp
index fe09666..9ed7559 100644
--- a/libc/startup/gpu/nvptx/start.cpp
+++ b/libc/startup/gpu/nvptx/start.cpp
@@ -57,7 +57,7 @@ void initialize(int argc, char **argv, char **env, void *in, void *out,
   if (gpu::get_thread_id() == 0 && gpu::get_block_id() == 0) {
     // We need to set up the RPC client first in case any of the constructors
     // require it.
-    rpc::client.reset(&lock, in, out, buffer);
+    rpc::client.reset(gpu::get_lane_size(), &lock, in, out, buffer);
 
     // We want the fini array callbacks to be run after other atexit
     // callbacks are run. So, we register them before running the init