Diffstat (limited to 'llvm/docs')
-rw-r--r-- | llvm/docs/CommandGuide/index.rst | 1
-rw-r--r-- | llvm/docs/CommandGuide/llvm-offload-binary.rst | 185
-rw-r--r-- | llvm/docs/LangRef.rst | 18
-rw-r--r-- | llvm/docs/OptBisect.rst | 42
-rw-r--r-- | llvm/docs/ReleaseNotes.md | 6
5 files changed, 231 insertions, 21 deletions
diff --git a/llvm/docs/CommandGuide/index.rst b/llvm/docs/CommandGuide/index.rst
index f85f32a..8f080de 100644
--- a/llvm/docs/CommandGuide/index.rst
+++ b/llvm/docs/CommandGuide/index.rst
@@ -92,6 +92,7 @@ Developer Tools
    llvm-pdbutil
    llvm-profgen
    llvm-tli-checker
+   llvm-offload-binary
 
 Remarks Tools
 ~~~~~~~~~~~~~~
diff --git a/llvm/docs/CommandGuide/llvm-offload-binary.rst b/llvm/docs/CommandGuide/llvm-offload-binary.rst
new file mode 100644
index 0000000..960b12d
--- /dev/null
+++ b/llvm/docs/CommandGuide/llvm-offload-binary.rst
@@ -0,0 +1,185 @@
+llvm-offload-binary - LLVM Offload Binary Packager
+==================================================
+
+.. program:: llvm-offload-binary
+
+SYNOPSIS
+--------
+
+:program:`llvm-offload-binary` [*options*] [*input files...*]
+
+DESCRIPTION
+-----------
+
+:program:`llvm-offload-binary` is a utility for bundling multiple device object
+files into a single binary container. The resulting binary can then be embedded
+into the host section table to form a fat binary containing offloading code for
+different targets. Conversely, it can also extract previously bundled device
+images.
+
+The binary format begins with the magic bytes ``0x10FF10AD``, followed by a
+version and size. Each binary contains its own header, allowing tools to locate
+offloading sections even when merged by a linker. Each offload entry includes
+metadata such as the device image kind, producer kind, and key-value string
+metadata. Multiple offloading images are concatenated to form a fat binary.
+
+EXAMPLE
+-------
+
+.. code-block:: console
+
+  # Package multiple device images into a fat binary:
+  $ llvm-offload-binary -o out.bin \
+        --image=file=input.o,triple=nvptx64,arch=sm_70
+
+  # Extract a matching image from a fat binary:
+  $ llvm-offload-binary in.bin \
+        --image=file=output.o,triple=nvptx64,arch=sm_70
+
+  # Extract and archive images into a static library:
+  $ llvm-offload-binary in.bin --archive -o libdevice.a
+
+OPTIONS
+-------
+
+.. option:: --archive
+
+  When extracting from an input binary, write all extracted images into a static
+  archive instead of separate files.
+
+.. option:: --image=<<key>=<value>,...>
+
+  Specify a set of arbitrary key-value arguments describing an image.
+  Commonly used optional keys include ``arch`` (e.g. ``sm_70`` for CUDA) and
+  ``triple`` (e.g. nvptx64-nvidia-cuda).
+
+.. option:: -o <file>
+
+  Write output to <file>. When bundling, this specifies the fat binary filename.
+  When extracting, this specifies the archive or output file destination.
+
+.. option:: --help, -h
+
+  Display available options. Use ``--help-hidden`` to show hidden options.
+
+.. option:: --help-list
+
+  Display a list of all options. Use ``--help-list-hidden`` to show hidden ones.
+
+.. option:: --version
+
+  Display the version of the :program:`llvm-offload-binary` executable.
+
+.. option:: @<FILE>
+
+  Read command-line options from response file `<FILE>`.
+
+BINARY FORMAT
+-------------
+
+The binary format is marked by the magic bytes ``0x10FF10AD``, followed by a
+version number. Each created binary contains its own header. This allows tools
+to locate offloading sections even after linker operations such as relocatable
+linking. Conceptually, this binary format is a serialization of a string map and
+an image buffer.
+
+.. table:: Offloading Binary Header
+    :name: table-binary_header
+
+    +----------+--------------+----------------------------------------------------+
+    | Type     | Identifier   | Description                                        |
+    +==========+==============+====================================================+
+    | uint8_t  | magic        | The magic bytes for the binary format (0x10FF10AD) |
+    +----------+--------------+----------------------------------------------------+
+    | uint32_t | version      | Version of this format (currently version 1)       |
+    +----------+--------------+----------------------------------------------------+
+    | uint64_t | size         | Size of this binary in bytes                       |
+    +----------+--------------+----------------------------------------------------+
+    | uint64_t | entry offset | Absolute offset of the offload entries in bytes    |
+    +----------+--------------+----------------------------------------------------+
+    | uint64_t | entry size   | Size of the offload entries in bytes               |
+    +----------+--------------+----------------------------------------------------+
+
+Each offload entry describes a bundled image along with its associated metadata.
+
+.. table:: Offloading Entry Table
+    :name: table-binary_entry
+
+    +----------+---------------+----------------------------------------------------+
+    | Type     | Identifier    | Description                                        |
+    +==========+===============+====================================================+
+    | uint16_t | image kind    | The kind of the device image (e.g. bc, cubin)      |
+    +----------+---------------+----------------------------------------------------+
+    | uint16_t | offload kind  | The producer of the image (e.g. openmp, cuda)      |
+    +----------+---------------+----------------------------------------------------+
+    | uint32_t | flags         | Generic flags for the image                        |
+    +----------+---------------+----------------------------------------------------+
+    | uint64_t | string offset | Absolute offset of the string metadata table       |
+    +----------+---------------+----------------------------------------------------+
+    | uint64_t | num strings   | Number of string entries in the table              |
+    +----------+---------------+----------------------------------------------------+
+    | uint64_t | image offset  | Absolute offset of the device image in bytes       |
+    +----------+---------------+----------------------------------------------------+
+    | uint64_t | image size    | Size of the device image in bytes                  |
+    +----------+---------------+----------------------------------------------------+
+
+The entry table refers to both a string table and the raw device image itself.
+The string table provides arbitrary key-value metadata.
+
+.. table:: Offloading String Entry
+    :name: table-binary_string
+
+    +----------+--------------+-------------------------------------------------------+
+    | Type     | Identifier   | Description                                           |
+    +==========+==============+=======================================================+
+    | uint64_t | key offset   | Absolute byte offset of the key in the string table   |
+    +----------+--------------+-------------------------------------------------------+
+    | uint64_t | value offset | Absolute byte offset of the value in the string table |
+    +----------+--------------+-------------------------------------------------------+
+
+The string table is a collection of null-terminated strings stored in the image.
+Offsets allow string entries to be interpreted as key-value pairs, enabling
+flexible metadata such as architecture or target triple.
+
+The enumerated values for ``image kind`` and ``offload kind`` are:
+
+.. table:: Image Kind
+    :name: table-image_kind
+
+    +---------------+-------+---------------------------------------+
+    | Name          | Value | Description                           |
+    +===============+=======+=======================================+
+    | IMG_None      | 0x00  | No image information provided         |
+    +---------------+-------+---------------------------------------+
+    | IMG_Object    | 0x01  | The image is a generic object file    |
+    +---------------+-------+---------------------------------------+
+    | IMG_Bitcode   | 0x02  | The image is an LLVM-IR bitcode file  |
+    +---------------+-------+---------------------------------------+
+    | IMG_Cubin     | 0x03  | The image is a CUDA object file       |
+    +---------------+-------+---------------------------------------+
+    | IMG_Fatbinary | 0x04  | The image is a CUDA fatbinary file    |
+    +---------------+-------+---------------------------------------+
+    | IMG_PTX       | 0x05  | The image is a CUDA PTX file          |
+    +---------------+-------+---------------------------------------+
+
+.. table:: Offload Kind
+    :name: table-offload_kind
+
+    +------------+-------+---------------------------------------+
+    | Name       | Value | Description                           |
+    +============+=======+=======================================+
+    | OFK_None   | 0x00  | No offloading information provided    |
+    +------------+-------+---------------------------------------+
+    | OFK_OpenMP | 0x01  | The producer was OpenMP offloading    |
+    +------------+-------+---------------------------------------+
+    | OFK_CUDA   | 0x02  | The producer was CUDA                 |
+    +------------+-------+---------------------------------------+
+    | OFK_HIP    | 0x03  | The producer was HIP                  |
+    +------------+-------+---------------------------------------+
+    | OFK_SYCL   | 0x04  | The producer was SYCL                 |
+    +------------+-------+---------------------------------------+
+
+SEE ALSO
+--------
+
+:manpage:`clang(1)`, :manpage:`llvm-objdump(1)`
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 20bd811..6d0e828 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -2529,6 +2529,9 @@ For example:
     if the attributed function is called during invocation of a function
     attributed with ``sanitize_realtime``.
     This attribute is incompatible with the ``sanitize_realtime`` attribute.
+``sanitize_alloc_token``
+    This attribute indicates that implicit allocation token instrumentation
+    is enabled for this function.
 ``speculative_load_hardening``
     This attribute indicates that
     `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
@@ -8577,6 +8580,21 @@ Example:
 The ``nofree`` metadata indicates the memory pointed by the pointer will not be
 freed after the attached instruction.
 
+'``alloc_token``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``alloc_token`` metadata may be attached to calls to memory allocation
+functions, and contains richer semantic information about the type of the
+allocation. This information is consumed by the ``alloc-token`` pass to
+instrument such calls with allocation token IDs.
+
+The metadata contains a string with the type of an allocation.
+
+.. code-block:: none
+
+    call ptr @malloc(i64 64), !alloc_token !0
+
+    !0 = !{!"<type-name>"}
 
 Module Flags Metadata
 =====================
diff --git a/llvm/docs/OptBisect.rst b/llvm/docs/OptBisect.rst
index 0e4d31a..e3ba078 100644
--- a/llvm/docs/OptBisect.rst
+++ b/llvm/docs/OptBisect.rst
@@ -8,7 +8,7 @@ Using -opt-bisect-limit to debug optimization errors
 Introduction
 ============
 
-The -opt-bisect-limit option provides a way to disable all optimization passes
+The ``-opt-bisect-limit`` option provides a way to disable all optimization passes
 above a specified limit without modifying the way in which the Pass Managers
 are populated. The intention of this option is to assist in tracking down
 problems where incorrect transformations during optimization result in incorrect
@@ -19,10 +19,10 @@ skipped while still allowing correct code generation call a function to
 check the opt-bisect limit before performing optimizations. Passes which
 either must be run or do not modify the IR do not perform this check and are
 therefore never skipped. Generally, this means analysis passes, passes
-that are run at CodeGenOptLevel::None and passes which are required for register
+that are run at ``CodeGenOptLevel::None`` and passes which are required for register
 allocation.
 
-The -opt-bisect-limit option can be used with any tool, including front ends
+The ``-opt-bisect-limit`` option can be used with any tool, including front ends
 such as clang, that uses the core LLVM library for optimization and code
 generation. The exact syntax for invoking the option is discussed below.
 
@@ -36,7 +36,7 @@ transformations that is difficult to replicate with tools like opt and llc.
 Getting Started
 ===============
 
-The -opt-bisect-limit command line option can be passed directly to tools such
+The ``-opt-bisect-limit`` command-line option can be passed directly to tools such
 as opt, llc and lli. The syntax is as follows:
 
 ::
@@ -49,17 +49,17 @@ indicating the index value that is associated with that optimization. To skip
 optimizations, pass the value of the last optimization to be performed as the
 opt-bisect-limit. All optimizations with a higher index value will be skipped.
 
-In order to use the -opt-bisect-limit option with a driver that provides a
+In order to use the ``-opt-bisect-limit`` option with a driver that provides a
 wrapper around the LLVM core library, an additional prefix option may be
 required, as defined by the driver. For example, to use this option with
-clang, the "-mllvm" prefix must be used. A typical clang invocation would look
+clang, the ``-mllvm`` prefix must be used. A typical clang invocation would look
 like this:
 
 ::
 
   clang -O2 -mllvm -opt-bisect-limit=256 my_file.c
 
-The -opt-bisect-limit option may also be applied to link-time optimizations by
+The ``-opt-bisect-limit`` option may also be applied to link-time optimizations by
 using a prefix to indicate that this is a plug-in option for the linker. The
 following syntax will set a bisect limit for LTO transformations:
 
@@ -72,11 +72,11 @@ following syntax will set a bisect limit for LTO transformations:
 
 LTO passes are run by a library instance invoked by the linker. Therefore any
 passes run in the primary driver compilation phase are not affected by options
-passed via '-Wl,-plugin-opt' and LTO passes are not affected by options
-passed to the driver-invoked LLVM invocation via '-mllvm'.
+passed via ``-Wl,-plugin-opt`` and LTO passes are not affected by options
+passed to the driver-invoked LLVM invocation via ``-mllvm``.
 
 Passing ``-opt-bisect-print-ir-path=path/foo.ll`` will dump the IR to
-``path/foo.ll`` when -opt-bisect-limit starts skipping passes.
+``path/foo.ll`` when ``-opt-bisect-limit`` starts skipping passes.
 
 Bisection Index Values
 ======================
@@ -85,7 +85,7 @@ The granularity of the optimizations associated with a single index value is
 variable. Depending on how the optimization pass has been instrumented the
 value may be associated with as much as all transformations that would have
 been performed by an optimization pass on an IR unit for which it is invoked
-(for instance, during a single call of runOnFunction for a FunctionPass) or as
+(for instance, during a single call of ``runOnFunction`` for a ``FunctionPass``) or as
 little as a single transformation. The index values may also be nested so
 that if an invocation of the pass is not skipped individual transformations
 within that invocation may still be skipped.
@@ -99,7 +99,7 @@ is not a problem.
 When an opt-bisect index value refers to an entire invocation of the run
 function for a pass, the pass will query whether or not it should be skipped
 each time it is invoked and each invocation will be assigned a unique value.
-For example, if a FunctionPass is used with a module containing three functions
+For example, if a ``FunctionPass`` is used with a module containing three functions
 a different index value will be assigned to the pass for each of the functions
 as the pass is run. The pass may be run on two functions but skipped for the
 third.
@@ -144,13 +144,13 @@ Example Usage
 Pass Skipping Implementation
 ============================
 
-The -opt-bisect-limit implementation depends on individual passes opting in to
-the opt-bisect process. The OptBisect object that manages the process is
+The ``-opt-bisect-limit`` implementation depends on individual passes opting in to
+the opt-bisect process. The ``OptBisect`` object that manages the process is
 entirely passive and has no knowledge of how any pass is implemented. When a
-pass is run if the pass may be skipped, it should call the OptBisect object to
+pass is run if the pass may be skipped, it should call the ``OptBisect`` object to
 see if it should be skipped.
 
-The OptBisect object is intended to be accessed through LLVMContext and each
+The ``OptBisect`` object is intended to be accessed through ``LLVMContext`` and each
 Pass base class contains a helper function that abstracts the details in order
 to make this check uniform across all passes. These helper functions are:
 
@@ -160,7 +160,7 @@ to make this check uniform across all passes. These helper functions are:
   bool FunctionPass::skipFunction(const Function &F);
   bool LoopPass::skipLoop(const Loop *L);
 
-A MachineFunctionPass should use FunctionPass::skipFunction() as such:
+A ``MachineFunctionPass`` should use ``FunctionPass::skipFunction()`` as such:
 
 .. code-block:: c++
 
@@ -170,11 +170,11 @@ A MachineFunctionPass should use FunctionPass::skipFunction() as such:
     // Otherwise, run the pass normally.
   }
 
-In addition to checking with the OptBisect class to see if the pass should be
-skipped, the skipFunction(), skipLoop() and skipBasicBlock() helper functions
-also look for the presence of the "optnone" function attribute. The calling
+In addition to checking with the ``OptBisect`` class to see if the pass should be
+skipped, the ``skipFunction()``, ``skipLoop()`` and ``skipBasicBlock()`` helper functions
+also look for the presence of the ``optnone`` function attribute. The calling
 pass will be unable to determine whether it is being skipped because the
-"optnone" attribute is present or because the opt-bisect-limit has been
+``optnone`` attribute is present or because the ``opt-bisect-limit`` has been
 reached. This is desirable because the behavior should be the same in either
 case.
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index 85c16b9c..79d93d0 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -146,6 +146,8 @@ Changes to the Python bindings
 Changes to the C API
 --------------------
 
+* Add `LLVMGetOrInsertFunction` to get or insert a function, replacing the combination of `LLVMGetNamedFunction` and `LLVMAddFunction`.
+
 Changes to the CodeGen infrastructure
 -------------------------------------
 
@@ -177,6 +179,10 @@ Changes to Sanitizers
 Other Changes
 -------------
 
+* Introduces the `AllocToken` pass, an instrumentation pass providing tokens to
+  memory allocators enabling various heap organization strategies, such as heap
+  partitioning.
+
 External Open Source Projects Using LLVM {{env.config.release}}
 ===============================================================
 
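
Note on the "Offloading Binary Header" table added in llvm-offload-binary.rst: the field order and widths it lists could be decoded roughly as in the sketch below. This is only an illustration, not the LLVM implementation. It assumes little-endian byte order, no padding between fields, and that the magic value 0x10FF10AD is stored as the four bytes 0x10, 0xFF, 0x10, 0xAD. The names `HEADER`, `MAGIC`, and `parse_header` are hypothetical.

```python
import struct

# Field order and widths follow the "Offloading Binary Header" table:
# 4 magic bytes, a uint32 version, then three uint64 fields.
# ASSUMPTION: little-endian byte order and no padding between fields.
HEADER = struct.Struct("<4sIQQQ")

# ASSUMPTION: 0x10FF10AD is stored as these four bytes, in this order.
MAGIC = bytes([0x10, 0xFF, 0x10, 0xAD])


def parse_header(data: bytes) -> dict:
    """Decode one header from the start of `data` (hypothetical helper)."""
    if len(data) < HEADER.size:
        raise ValueError("buffer is smaller than one header")
    magic, version, size, entry_offset, entry_size = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("magic bytes do not match")
    return {
        "version": version,
        "size": size,
        "entry_offset": entry_offset,
        "entry_size": entry_size,
    }


# Build a synthetic header (not a real offload binary) and decode it again.
sample = HEADER.pack(MAGIC, 1, 64, 32, 32)
print(parse_header(sample))
```

Since each bundled binary carries its own header, a reader built along these lines would presumably advance by the `size` field to reach the next concatenated binary. The sketch does not attempt that.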