| author | Joseph Huber <huberjn@outlook.com> | 2025-10-14 09:35:53 -0500 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2025-10-14 09:35:53 -0500 |
| commit | 4a35c4d38af4844f26d944047ca1f6aefd6a0eff (patch) | |
| tree | 697481ec93c853e5e151c7de9e632aad7f25c126 /clang/lib/Serialization/ModuleCache.cpp | |
| parent | 4c0692edb445c5d90a189f5c12e5433b8e84a713 (diff) | |
[Offload] Lazily initialize platforms in the Offloading API (#163272)
Summary:
The Offloading library wraps the underlying plugins. The problem is
that we currently initialize every plugin we find, even if the program
does not need it. This is very expensive for trivial uses, as fully
heterogeneous usage is quite rare. In practice this means that you
always pay a 200 ms penalty just for having CUDA installed.

This patch changes the behavior to provide accessors into the plugins
and devices that allow them to be initialized lazily. We use a
once_flag, which should take a fast-path check once initialization has
completed while still blocking on concurrent use.
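
To illustrate the once_flag pattern described above, here is a minimal
sketch in standard C++. It is not the code from this patch: `LazyPlugin`,
`get`, and `initCountAfterConcurrentUse` are hypothetical names, and the
expensive plugin start-up is replaced by a counter so the behavior can be
observed.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a plugin whose start-up is expensive.
struct LazyPlugin {
  std::once_flag InitFlag;
  std::atomic<int> InitCount{0};
  bool Ready = false;

  // Accessor: the first caller runs the initializer, concurrent callers
  // block until it finishes, and later callers return after a cheap check.
  LazyPlugin &get() {
    std::call_once(InitFlag, [this] {
      ++InitCount; // placeholder for the real driver/runtime start-up
      Ready = true;
    });
    return *this;
  }
};

// Call the accessor from several threads and report how many times the
// initializer actually ran.
inline int initCountAfterConcurrentUse(LazyPlugin &P, int NumThreads) {
  std::vector<std::thread> Threads;
  for (int I = 0; I < NumThreads; ++I)
    Threads.emplace_back([&P] { assert(P.get().Ready); });
  for (auto &T : Threads)
    T.join();
  return P.InitCount.load();
}
```

A plugin that is never reached through `get()` never pays the start-up
cost, which is the point of moving initialization behind an accessor.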

Making full use of this will require a way to filter platforms more
specifically. I'm still thinking about what this would look like as an
API: either we add an extra iterate function that takes a callback on
the platform, or we just provide a helper to find all the devices that
can run a given image. Maybe both?
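
A rough sketch of what those two options could look like, purely as an
illustration. None of these names exist in the Offloading API: `Platform`,
`Device`, `Image`, `collectDevices`, and `devicesForImage` are all
hypothetical, and matching an image by a backend string is a simplifying
assumption.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical types; the real API uses its own handles.
struct Device {
  std::string Name;
};

struct Platform {
  std::string Backend;
  std::vector<Device> Devices;
  bool Initialized = false;
  // Placeholder for the expensive lazy initialization.
  void ensureInitialized() { Initialized = true; }
};

struct Image {
  std::string TargetBackend;
};

// Option 1: iterate with a callback on the platform. Only platforms the
// callback accepts are initialized and have their devices returned.
inline std::vector<Device *>
collectDevices(std::vector<Platform> &Platforms,
               const std::function<bool(const Platform &)> &Accept) {
  std::vector<Device *> Result;
  for (Platform &P : Platforms) {
    if (!Accept(P))
      continue; // rejected platforms stay uninitialized
    P.ensureInitialized();
    for (Device &D : P.Devices)
      Result.push_back(&D);
  }
  return Result;
}

// Option 2: a helper that finds all devices able to run a given image,
// built on top of the callback form.
inline std::vector<Device *> devicesForImage(std::vector<Platform> &Platforms,
                                             const Image &Img) {
  return collectDevices(Platforms, [&Img](const Platform &P) {
    return P.Backend == Img.TargetBackend;
  });
}
```

In this sketch the helper is just a thin wrapper over the callback form,
which is one argument for providing both.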
Fixes: https://github.com/llvm/llvm-project/issues/159636
