[OpenMP][Docs] Add documentation for linking OpenMP with CUDA/HIP

Summary: This patch adds an entry to the FAQ that shows how to link CUDA with OpenMP.
author: Joseph Huber <jhuber6@vols.utk.edu> 2022-10-11 12:00:06 -0500
committer: Joseph Huber <jhuber6@vols.utk.edu> 2022-10-11 13:40:41 -0500
commit: 316eaa3008a80add0e39cc0ab538c04c595a31d3 (patch)
tree: 7f85d94eaa5f8e9605c25dd1fa34037325c10189
parent: 4b76a80459e69daca2f62f522a6117a9350613dc (diff)
1 files changed, 34 insertions, 14 deletions
diff --git a/openmp/docs/SupportAndFAQ.rst b/openmp/docs/SupportAndFAQ.rst
index 4ce64a4..dc1ad83 100644
--- a/openmp/docs/SupportAndFAQ.rst
+++ b/openmp/docs/SupportAndFAQ.rst
@@ -333,28 +333,28 @@ occurs.
 Q: Can OpenMP offloading compile for multiple architectures?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Since LLVM version 15.0, OpenMP offloading supports offloading to multiple 
-architectures at once. This allows for executables to be run on different 
-targets, such as offloading to AMD and NVIDIA GPUs simultaneously, as well as 
-multiple sub-architectures for the same target. Additionally, static libraries 
-will only extract archive members if an architecture is used, allowing users to 
+Since LLVM version 15.0, OpenMP offloading supports offloading to multiple
+architectures at once. This allows for executables to be run on different
+targets, such as offloading to AMD and NVIDIA GPUs simultaneously, as well as
+multiple sub-architectures for the same target. Additionally, static libraries
+will only extract archive members if an architecture is used, allowing users to
 create generic libraries.
 
-The architecture can either be specified manually using ``--offload-arch=``. If 
-``--offload-arch=`` is present no ``-fopenmp-targets=`` flag is present then the 
-targets will be inferred from the architectures. Conversely, if 
-``--fopenmp-targets=`` is present with no ``--offload-arch``  then the target 
-architecture will be set to a default value, usually the architecture supported 
+The architecture can either be specified manually using ``--offload-arch=``. If
+``--offload-arch=`` is present no ``-fopenmp-targets=`` flag is present then the
+targets will be inferred from the architectures. Conversely, if
+``--fopenmp-targets=`` is present with no ``--offload-arch``  then the target
+architecture will be set to a default value, usually the architecture supported
 by the system LLVM was built on.
 
-For example, an executable can be built that runs on AMDGPU and NVIDIA hardware 
+For example, an executable can be built that runs on AMDGPU and NVIDIA hardware
 given that the necessary build tools are installed for both.
 
 .. code-block:: shell
 
    clang example.c -fopenmp --offload-arch=gfx90a --offload-arch=sm_80
 
-If just given the architectures we should be able to infer the triples, 
+If just given the architectures we should be able to infer the triples,
 otherwise we can specify them manually.
 
 .. code-block:: shell
@@ -363,7 +363,7 @@ otherwise we can specify them manually.
       -Xopenmp-target=amdgcn-amd-amdhsa --offload-arch=gfx90a \
       -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_80
 
-When linking against a static library that contains device code for multiple 
+When linking against a static library that contains device code for multiple
 architectures, only the images used by the executable will be extracted.
 
 .. code-block:: shell
@@ -372,7 +372,7 @@ architectures, only the images used by the executable will be extracted.
    llvm-ar rcs libexample.a example.o
    clang app.c -fopenmp --offload-arch=gfx90a -o app
 
-The supported device images can be viewed using the ``--offloading`` option with 
+The supported device images can be viewed using the ``--offloading`` option with
 ``llvm-objdump``.
 
 .. code-block:: shell
@@ -393,3 +393,23 @@ The supported device images can be viewed using the ``--offloading`` option with
    arch            sm_80
    triple          nvptx64-nvidia-cuda
    producer        openmp
+
+Q: Can I link OpenMP offloading with CUDA or HIP?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+OpenMP offloading files can currently be experimentally linked with CUDA and HIP
+files. This will allow OpenMP to call a CUDA device function or vice-versa.
+However, the global state will be distinct between the two images at runtime.
+This means any global variables will potentially have different values when
+queried from OpenMP or CUDA.
+
+Linking CUDA and HIP currently requires enabling a different compilation mode
+for CUDA / HIP with ``--offload-new-driver`` and to link using
+``--offload-link``. Additionally, ``-fgpu-rdc`` must be used to create a
+linkable device image.
+
+.. code-block:: shell
+
+   clang++ openmp.cpp -fopenmp --offload-arch=sm_80 -c
+   clang++ cuda.cu --offload-new-driver --offload-arch=sm_80 -fgpu-rdc -c
+   clang++ openmp.o cuda.o --offload-link -o app