From 316eaa3008a80add0e39cc0ab538c04c595a31d3 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 11 Oct 2022 12:00:06 -0500
Subject: [OpenMP][Docs] Add documentation for linking OpenMP with CUDA/HIP

Summary:
This patch adds an entry to the FAQ that shows how to link CUDA with OpenMP.
---
 openmp/docs/SupportAndFAQ.rst | 48 ++++++++++++++++++++++++++++++-------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/openmp/docs/SupportAndFAQ.rst b/openmp/docs/SupportAndFAQ.rst
index 4ce64a4..dc1ad83 100644
--- a/openmp/docs/SupportAndFAQ.rst
+++ b/openmp/docs/SupportAndFAQ.rst
@@ -333,28 +333,28 @@ occurs.
 Q: Can OpenMP offloading compile for multiple architectures?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Since LLVM version 15.0, OpenMP offloading supports offloading to multiple
-architectures at once. This allows for executables to be run on different
-targets, such as offloading to AMD and NVIDIA GPUs simultaneously, as well as
-multiple sub-architectures for the same target. Additionally, static libraries
-will only extract archive members if an architecture is used, allowing users to
+Since LLVM version 15.0, OpenMP offloading supports offloading to multiple
+architectures at once. This allows for executables to be run on different
+targets, such as offloading to AMD and NVIDIA GPUs simultaneously, as well as
+multiple sub-architectures for the same target. Additionally, static libraries
+will only extract archive members if an architecture is used, allowing users to
 create generic libraries.
 
-The architecture can either be specified manually using ``--offload-arch=``. If
-``--offload-arch=`` is present no ``-fopenmp-targets=`` flag is present then the
-targets will be inferred from the architectures. Conversely, if
-``--fopenmp-targets=`` is present with no ``--offload-arch`` then the target
-architecture will be set to a default value, usually the architecture supported
+The architecture can either be specified manually using ``--offload-arch=``. If
+``--offload-arch=`` is present no ``-fopenmp-targets=`` flag is present then the
+targets will be inferred from the architectures. Conversely, if
+``--fopenmp-targets=`` is present with no ``--offload-arch`` then the target
+architecture will be set to a default value, usually the architecture supported
 by the system LLVM was built on.
 
-For example, an executable can be built that runs on AMDGPU and NVIDIA hardware
+For example, an executable can be built that runs on AMDGPU and NVIDIA hardware
 given that the necessary build tools are installed for both.
 
 .. code-block:: shell
 
     clang example.c -fopenmp --offload-arch=gfx90a --offload-arch=sm_80
 
-If just given the architectures we should be able to infer the triples,
+If just given the architectures we should be able to infer the triples,
 otherwise we can specify them manually.
 
 .. code-block:: shell
@@ -363,7 +363,7 @@ otherwise we can specify them manually.
         -Xopenmp-target=amdgcn-amd-amdhsa --offload-arch=gfx90a \
         -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_80
 
-When linking against a static library that contains device code for multiple
+When linking against a static library that contains device code for multiple
 architectures, only the images used by the executable will be extracted.
 
 .. code-block:: shell
@@ -372,7 +372,7 @@ architectures, only the images used by the executable will be extracted.
     llvm-ar rcs libexample.a example.o
     clang app.c -fopenmp --offload-arch=gfx90a -o app
 
-The supported device images can be viewed using the ``--offloading`` option with
+The supported device images can be viewed using the ``--offloading`` option with
 ``llvm-objdump``.
 
 .. code-block:: shell
@@ -393,3 +393,23 @@ The supported device images can be viewed using the ``--offloading`` option with
     arch            sm_80
     triple          nvptx64-nvidia-cuda
     producer        openmp
+
+Q: Can I link OpenMP offloading with CUDA or HIP?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+OpenMP offloading files can currently be experimentally linked with CUDA and HIP
+files. This will allow OpenMP to call a CUDA device function or vice-versa.
+However, the global state will be distinct between the two images at runtime.
+This means any global variables will potentially have different values when
+queried from OpenMP or CUDA.
+
+Linking CUDA and HIP currently requires enabling a different compilation mode
+for CUDA / HIP with ``--offload-new-driver`` and to link using
+``--offload-link``. Additionally, ``-fgpu-rdc`` must be used to create a
+linkable device image.
+
+.. code-block:: shell
+
+    clang++ openmp.cpp -fopenmp --offload-arch=sm_80 -c
+    clang++ cuda.cu --offload-new-driver --offload-arch=sm_80 -fgpu-rdc -c
+    clang++ openmp.o cuda.o --offload-link -o app
--
cgit v1.1