Diffstat (limited to 'llvm/docs')
-rw-r--r-- | llvm/docs/AArch64SME.rst | 24
-rw-r--r-- | llvm/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.rst | 62
-rw-r--r-- | llvm/docs/AMDGPUUsage.rst | 24
-rw-r--r-- | llvm/docs/CMakeLists.txt | 22
-rw-r--r-- | llvm/docs/CallGraphSection.md | 6
-rw-r--r-- | llvm/docs/CodeOfConduct.rst | 1
-rw-r--r-- | llvm/docs/CommandGuide/dsymutil.rst | 8
-rw-r--r-- | llvm/docs/DirectX/DXILResources.rst | 89
-rw-r--r-- | llvm/docs/GettingStartedVS.rst | 13
-rw-r--r-- | llvm/docs/HowToBuildOnARM.rst | 18
-rw-r--r-- | llvm/docs/HowToReleaseLLVM.rst | 82
-rw-r--r-- | llvm/docs/LangRef.rst | 55
-rw-r--r-- | llvm/docs/MergeFunctions.rst | 46
-rw-r--r-- | llvm/docs/QualGroup.rst | 15
-rw-r--r-- | llvm/docs/ReleaseNotes.md | 14
-rw-r--r-- | llvm/docs/SPIRVUsage.rst | 2
-rw-r--r-- | llvm/docs/TableGen/BackEnds.rst | 50
17 files changed, 333 insertions, 198 deletions
diff --git a/llvm/docs/AArch64SME.rst b/llvm/docs/AArch64SME.rst index 47ed7bc..327f9dc 100644 --- a/llvm/docs/AArch64SME.rst +++ b/llvm/docs/AArch64SME.rst @@ -124,7 +124,7 @@ In this table, we use the following abbreviations: either 0 or 1 on entry, and is unchanged on return). Functions with ``__attribute__((arm_locally_streaming))`` are excluded from this -table because for the caller the attribute is synonymous to 'streaming', and +table because for the caller the attribute is synonymous with 'streaming', and for the callee it is merely an implementation detail that is explicitly not exposed to the caller. @@ -158,7 +158,7 @@ the function's body, so that it can place the mode changes in exactly the right position. The suitable place to do this seems to be SelectionDAG, where it lowers the call's arguments/return values to implement the specified calling convention. SelectionDAG provides Chains and Glue to specify the order of operations and give -preliminary control over the instruction's scheduling. +preliminary control over instruction scheduling. Example of preserving state @@ -232,8 +232,8 @@ implement transitions from ``SC -> N`` and ``SC -> S``. Unchained Function calls ------------------------ When a function with "``aarch64_pstate_sm_enabled``" calls a function that is not -streaming compatible, the compiler has to insert a SMSTOP before the call and -insert a SMSTOP after the call. +streaming compatible, the compiler has to insert an SMSTOP before the call and +insert an SMSTOP after the call. If the function that is called is an intrinsic with no side-effects which in turn is lowered to a function call (e.g., ``@llvm.cos()``), then the call to @@ -388,7 +388,7 @@ The value of PSTATE.SM is not controlled by the feature flags, but rather by the function attributes. This means that we can compile for '``+sme``', and the compiler will code-generate any instructions, even if they are not legal under the requested streaming mode. 
The compiler needs to use the function attributes to ensure the -compiler doesn't do transformations under the assumption that certain operations +compiler doesn't perform transformations under the assumption that certain operations are available at runtime. We made a conscious choice not to model this with feature flags because we @@ -399,11 +399,11 @@ and `D121208 <https://reviews.llvm.org/D121208>`_) because of limitations in TableGen. As a first step, this means we'll disable vectorization (LoopVectorize/SLP) -entirely when the a function has either of the ``aarch64_pstate_sm_enabled``, +entirely when a function has either of the ``aarch64_pstate_sm_enabled``, ``aarch64_pstate_sm_body`` or ``aarch64_pstate_sm_compatible`` attributes, in order to avoid the use of vector instructions. -Later on we'll aim to relax these restrictions to enable scalable +Later on, we'll aim to relax these restrictions to enable scalable auto-vectorization with a subset of streaming-compatible instructions, but that requires changes to the CostModel, Legalization and SelectionDAG lowering. @@ -416,7 +416,7 @@ Other things to consider ------------------------ * Inlining must be disabled when the call-site needs to toggle PSTATE.SM or - when the callee's function body is executed in a different streaming mode than + when the callee's function body is executed in a different streaming mode from its caller. This is needed because function calls are the boundaries for streaming mode changes. @@ -434,8 +434,8 @@ lazy-save mechanism for calls to private-ZA functions (i.e. functions that may either directly or indirectly clobber ZA state). For the purpose of handling functions marked with ``aarch64_new_za``, -we have introduced a new LLVM IR pass (SMEABIPass) that is run just before -SelectionDAG. Any such functions dealt with by this pass are marked with +we have introduced a new LLVM IR pass (SMEABIPass) that runs just before +SelectionDAG. 
Any such functions handled by this pass are marked with ``aarch64_expanded_pstate_za``. Setting up a lazy-save @@ -458,7 +458,7 @@ AArch64 Predicate-as-Counter Type The predicate-as-counter type represents the type of a predicate-as-counter value held in an AArch64 SVE predicate register. Such a value contains information about the number of active lanes, the element width and a bit that -tells whether the generated mask should be inverted. ACLE intrinsics should be +indicates whether the generated mask should be inverted. ACLE intrinsics should be used to move the predicate-as-counter value to/from a predicate vector. There are certain limitations on the type: @@ -466,7 +466,7 @@ There are certain limitations on the type: * The type can be used for function parameters and return values. * The supported LLVM operations on this type are limited to ``load``, ``store``, - ``phi``, ``select`` and ``alloca`` instructions. + ``phi``, ``select``, and ``alloca`` instructions. The predicate-as-counter type is a scalable type. diff --git a/llvm/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.rst b/llvm/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.rst index ba670d3..f472b862 100644 --- a/llvm/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.rst +++ b/llvm/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.rst @@ -37,13 +37,13 @@ includes contributions to open source projects such as LLVM [:ref:`LLVM The LLVM compiler has upstream support for commercially available AMD GPU hardware (AMDGPU) [:ref:`AMDGPU-LLVM <amdgpu-dwarf-AMDGPU-LLVM>`]. The open -source ROCgdb [:ref:`AMD-ROCgdb <amdgpu-dwarf-AMD-ROCgdb>`] GDB based debugger +source ROCgdb [:ref:`AMD-ROCgdb <amdgpu-dwarf-AMD-ROCgdb>`] GDB-based debugger also has support for AMDGPU which is being upstreamed. 
Support for AMDGPU is also being added by third parties to the GCC [:ref:`GCC <amdgpu-dwarf-GCC>`] compiler and the Perforce TotalView HPC Debugger [:ref:`Perforce-TotalView <amdgpu-dwarf-Perforce-TotalView>`]. -To support debugging heterogeneous programs several features that are not +To support debugging heterogeneous programs, several features that are not provided by current DWARF Version 5 [:ref:`DWARF <amdgpu-dwarf-DWARF>`] have been identified. The :ref:`amdgpu-dwarf-extensions` section gives an overview of the extensions devised to address the missing features. The extensions seek to @@ -107,7 +107,7 @@ for each in terms of heterogeneous debugging. DWARF Version 5 does not allow location descriptions to be entries on the DWARF expression stack. They can only be the final result of the evaluation of a DWARF expression. However, by allowing a location description to be a first-class -entry on the DWARF expression stack it becomes possible to compose expressions +entry on the DWARF expression stack, it becomes possible to compose expressions containing both values and location descriptions naturally. It allows objects to be located in any kind of memory address space, in registers, be implicit values, be undefined, or a composite of any of these. @@ -123,20 +123,20 @@ non-default address spaces and generalizing the power of composite location descriptions to any kind of location description. For those familiar with the definition of location descriptions in DWARF Version -5, the definitions in these extensions are presented differently, but does in +5, the definitions in these extensions are presented differently, but do in fact define the same concept with the same fundamental semantics. However, it does so in a way that allows the concept to extend to support address spaces, bit addressing, the ability for composite location descriptions to be composed of any kind of location description, and the ability to support objects located at multiple places. 
Collectively these changes expand the set of architectures -that can be supported and improves support for optimized code. +that can be supported and improve support for optimized code. Several approaches were considered, and the one presented, together with the extensions it enables, appears to be the simplest and cleanest one that offers the greatest improvement of DWARF's ability to support debugging optimized GPU and non-GPU code. Examining the GDB debugger and LLVM compiler, it appears only to require modest changes as they both already have to support general use of -location descriptions. It is anticipated that will also be the case for other +location descriptions. It is anticipated that this will also be the case for other debuggers and compilers. GDB has been modified to evaluate DWARF Version 5 expressions with location @@ -156,7 +156,7 @@ DWARF Expression Stack* [:ref:`AMDGPU-DWARF-LOC 2.2 Generalize CFI to Allow Any Location Description Kind --------------------------------------------------------- -CFI describes restoring callee saved registers that are spilled. Currently CFI +CFI describes restoring callee saved registers that are spilled. Currently, CFI only allows a location description that is a register, memory address, or implicit location description. AMDGPU optimized code may spill scalar registers into portions of vector registers. This requires extending CFI to allow any @@ -223,7 +223,7 @@ infinite precision offsets to allow it to correctly track a series of positive and negative offsets that may transiently overflow or underflow, but end up in range. This is simple for the arithmetic operations as they are defined in terms of two's complement arithmetic on a base type of a fixed size. Therefore, the -offset operation define that integer overflow is ill-formed. This is in contrast +offset operation defines that integer overflow is ill-formed. 
This is in contrast to the ``DW_OP_plus``, ``DW_OP_plus_uconst``, and ``DW_OP_minus`` arithmetic operations which define that it causes wrap-around. @@ -359,7 +359,7 @@ address space at a fixed address. The ``DW_OP_LLVM_form_aspace_address`` (see :ref:`amdgpu-dwarf-memory-location-description-operations`) operation is defined -to create a memory location description from an address and address space. If +to create a memory location description from an address and address space. It can be used to specify the location of a variable that is allocated in a specific address space. This allows the size of addresses in an address space to be larger than the generic type. It also allows a consumer great implementation @@ -372,7 +372,7 @@ In contrast, if the ``DW_OP_LLVM_form_aspace_address`` operation had been defined to produce a value, and an implicit conversion to a memory location description was defined, then it would be limited to the size of the generic type (which matches the size of the default address space). An implementation -would likely have to use *reserved ranges* of value to represent different +would likely have to use *reserved ranges* of values to represent different address spaces. Such a value would likely not match any address value in the actual hardware. That would require the consumer to have special treatment for such values. @@ -528,7 +528,7 @@ active. To describe the conceptual location of non-active lanes requires an attribute that has an expression that computes the source location PC for each lane. -For efficiency, the expression calculates the source location the wavefront as a +For efficiency, the expression calculates the source location of the wavefront as a whole. This can be done using the ``DW_OP_LLVM_select_bit_piece`` (see :ref:`amdgpu-dwarf-operation-to-create-vector-composite-location-descriptions`) operation. 
@@ -564,7 +564,7 @@ information entry to indicate that there is additional target architecture specific information in the debugging information entries of that compilation unit. This allows a consumer to know what extensions are present in the debugger information entries as is possible with the augmentation string of other -sections. See . +sections. The format that should be used for an augmentation string is also recommended. This allows a consumer to parse the string when it contains information from @@ -581,7 +581,7 @@ See :ref:`amdgpu-dwarf-full-and-partial-compilation-unit-entries`, AMDGPU supports programming languages that include online compilation where the source text may be created at runtime. For example, the OpenCL and HIP language -runtimes support online compilation. To support is, a way to embed the source +runtimes support online compilation. To support this, a way to embed the source text in the debug information is provided. See :ref:`amdgpu-dwarf-line-number-information`. @@ -589,16 +589,16 @@ See :ref:`amdgpu-dwarf-line-number-information`. 2.17 Allow MD5 Checksums to be Optionally Present ------------------------------------------------- -In DWARF Version 5 the file timestamp and file size can be optional, but if the -MD5 checksum is present it must be valid for all files. This is a problem if +In DWARF Version 5, the file timestamp and file size can be optional, but if the +MD5 checksum is present, it must be valid for all files. This is a problem if using link time optimization to combine compilation units where some have MD5 -checksums and some do not. Therefore, sSupport to allow MD5 checksums to be -optionally present in the line table is added. +checksums, and others do not. Therefore, the line table is extended to allow MD5 +checksums to be optional. See :ref:`amdgpu-dwarf-line-number-information`. 
-2.18 Add the HIP Programing Language ------------------------------------- +2.18 Add the HIP Programming Language +------------------------------------- The HIP programming language [:ref:`HIP <amdgpu-dwarf-HIP>`], which is supported by the AMDGPU, is added. @@ -617,7 +617,7 @@ hardware to allow a single instruction to execute multiple iterations using vector registers. Note that although this is similar to SIMT execution, the way a client debugger -uses the information is fundamentally different. In SIMT execution the debugger +uses the information is fundamentally different. In SIMT execution, the debugger needs to present the concurrent execution as distinct source language threads that the user can list and switch focus between. With iteration concurrency optimizations, such as software pipelining and vectorized SIMD, the debugger @@ -648,7 +648,7 @@ language loop iterations are executing concurrently. See It is common in SIMD vectorization for the compiler to generate code that promotes portions of an array into vector registers. For example, if the hardware has vector registers with 8 elements, and 8 wide SIMD instructions, the -compiler may vectorize a loop so that is executes 8 iterations concurrently for +compiler may vectorize a loop so that it executes 8 iterations concurrently for each vectorized loop iteration. On the first iteration of the generated vectorized loop, iterations 0 to 7 of @@ -691,7 +691,7 @@ Inside the loop body, the machine code loads ``src[i]`` and ``dst[i]`` into registers, adds them, and stores the result back into ``dst[i]``. Considering the location of ``dst`` and ``src`` in the loop body, the elements -``dst[i]`` and ``src[i]`` would be located in registers, all other elements are +``dst[i]`` and ``src[i]`` would be located in registers; all other elements are located in memory. Let register ``R0`` contain the base address of ``dst``, register ``R1`` contain ``i``, and register ``R2`` contain the registerized ``dst[i]`` element. 
We can describe the location of ``dst`` as a memory location @@ -722,7 +722,7 @@ with a register location overlaid at a runtime offset involving ``i``: ---------------------------------------------- AMDGPU supports languages, such as OpenCL, that define source language memory -spaces. Support is added to define language specific memory spaces so they can +spaces. Support is added to define language-specific memory spaces so they can be used in a consistent way by consumers. See :ref:`amdgpu-dwarf-memory-spaces`. A new attribute ``DW_AT_LLVM_memory_space`` is added to support using memory @@ -738,9 +738,9 @@ accommodates only 32 unique operations. In practice, the lack of a central registry and a desire for backwards compatibility means vendor extensions are never retired, even when standard versions are accepted into DWARF proper. This has produced a situation where the effective encoding space available for new -vendor extensions is miniscule today. +vendor extensions is minuscule today. -To expand this encoding space a new DWARF operation ``DW_OP_LLVM_user`` is +To expand this encoding space, a new DWARF operation ``DW_OP_LLVM_user`` is added which acts as a "prefix" for vendor extensions. It is followed by a ULEB128 encoded vendor extension opcode, which is then followed by the operands of the corresponding vendor extension operation. @@ -776,7 +776,7 @@ A. Changes Relative to DWARF Version 5 .. note:: Notes are included to describe how the changes are to be applied to the - DWARF Version 5 standard. They also describe rational and issues that may + DWARF Version 5 standard. They also describe rationale and issues that may need further consideration. A.2 General Description @@ -898,7 +898,7 @@ elements that can be specified are: *A current lane* - The 0 based SIMT lane identifier to be used in evaluating a user presented + The 0-based SIMT lane identifier to be used in evaluating a user presented expression. 
This applies to source languages that are implemented for a target architecture using a SIMT execution model. These implementations map source language threads of execution to lanes of the target architecture threads. @@ -917,7 +917,7 @@ elements that can be specified are: *A current iteration* - The 0 based source language iteration instance to be used in evaluating a user + The 0-based source language iteration instance to be used in evaluating a user presented expression. This applies to target architectures that support optimizations that result in executing multiple source language loop iterations concurrently. @@ -1845,7 +1845,7 @@ There are these special value operations currently defined: interpreted as a value of T. If a conversion is wanted it can be done explicitly using a ``DW_OP_convert`` operation. - GDB has a per register hook that allows a target specific conversion on a + GDB has a per register hook that allows a target-specific conversion on a register by register basis. It defaults to truncation of bigger registers. Removing use of the target hook does not cause any test failures in common architectures. If the compiler for a target architecture did want some @@ -1855,7 +1855,7 @@ There are these special value operations currently defined: If T is a larger type than the register size, then the default GDB register hook reads bytes from the next register (or reads out of bounds for the last register!). Removing use of the target hook does not cause - any test failures in common architectures (except an illegal hand written + any test failures in common architectures (except an illegal hand-written assembly test). If a target architecture requires this behavior, these extensions allow a composite location description to be used to combine multiple registers. @@ -2283,7 +2283,7 @@ bit offset equal to V scaled by 8 (the byte size). The implicit conversion could also be defined as target architecture specific. 
For example, GDB checks if V is an integral type. If it is not it gives an error. Otherwise, GDB zero-extends V to 64 bits. If the GDB target defines a - hook function, then it is called. The target specific hook function can modify + hook function, then it is called. The target-specific hook function can modify the 64-bit value, possibly sign extending based on the original value type. Finally, GDB treats the 64-bit value V as a memory location address. diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst index a4d110f..e062032 100644 --- a/llvm/docs/AMDGPUUsage.rst +++ b/llvm/docs/AMDGPUUsage.rst @@ -488,21 +488,21 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following **GCN GFX11 (RDNA 3.5)** [AMD-GCN-GFX11-RDNA3.5]_ ----------------------------------------------------------------------------------------------------------------------- - ``gfx1150`` ``amdgcn`` APU - cumode - Architected *TBA* + ``gfx1150`` ``amdgcn`` APU - cumode - Architected Radeon 890M - wavefrontsize64 flat scratch .. TODO:: - Packed work-item Add product IDs names. - ``gfx1151`` ``amdgcn`` APU - cumode - Architected *TBA* + ``gfx1151`` ``amdgcn`` APU - cumode - Architected Radeon 8060S - wavefrontsize64 flat scratch .. TODO:: - Packed work-item Add product IDs names. - ``gfx1152`` ``amdgcn`` APU - cumode - Architected *TBA* + ``gfx1152`` ``amdgcn`` APU - cumode - Architected Radeon 860M - wavefrontsize64 flat scratch .. TODO:: - Packed @@ -883,6 +883,8 @@ supported for the ``amdgcn`` target. Buffer Fat Pointer 7 N/A N/A 160 0 Buffer Resource 8 N/A V# 128 0x00000000000000000000000000000000 Buffer Strided Pointer (experimental) 9 *TODO* + *reserved for downstream use* 10 + *reserved for downstream use* 11 Streamout Registers 128 N/A GS_REGS ===================================== =============== =========== ================ ======= ============================ @@ -4172,7 +4174,7 @@ non-AMD key names should be prefixed by "*vendor-name*.". 
"Image", or "Pipe". This may be more restrictive than indicated by "AccQual" to reflect what the - kernel actual does. If not + kernel actually does. If not present then the runtime must assume what is implied by "AccQual" and "IsConst". Values @@ -5436,8 +5438,8 @@ The fields used by CP for code objects before V3 also match those specified in ``COMPUTE_PGM_RSRC1.PRIORITY``. 13:12 2 bits FLOAT_ROUND_MODE_32 Wavefront starts execution with specified rounding - mode for single (32 - bit) floating point + mode for single (32-bit) + floating point precision floating point operations. @@ -5769,7 +5771,7 @@ The fields used by CP for code objects before V3 also match those specified in Wavefront starts execution with memory violation - exceptions exceptions + exceptions enabled which are generated when a memory violation has occurred for this wavefront from @@ -6005,7 +6007,7 @@ The fields used by CP for code objects before V3 also match those specified in FLOAT_DENORM_MODE_FLUSH_NONE 3 No Flush ====================================== ===== ==================================== - Denormal flushing is sign respecting. i.e. the behavior expected by + Denormal flushing is sign respecting, i.e., the behavior expected by ``"denormal-fp-math"="preserve-sign"``. The behavior is undefined with ``"denormal-fp-math"="positive-zero"`` @@ -16831,7 +16833,7 @@ For GFX125x: * Some memory operations contain a ``nv`` bit, for "non-volatile", which indicates memory that is not expected to change during a kernel's execution. This information is propagated to the cache lines for that address - (refered to as ``$nv``). + (referred to as ``$nv``). * When ``nv=0`` reads hit dirty ``$nv=1`` data in cache, the hardware will writeback the data to the next level in the hierarchy and then subsequently read @@ -18970,7 +18972,7 @@ On entry to a function: #. All other registers are unspecified. #. Any necessary ``s_waitcnt`` has been performed to ensure memory is available to the function. -#. 
Use pass-by-reference (byref) in stead of pass-by-value (byval) for struct +#. Use pass-by-reference (byref) instead of pass-by-value (byval) for struct arguments in C ABI. Callee is responsible for allocating stack memory and copying the value of the struct if modified. Note that the backend still supports byval for struct arguments. @@ -20214,7 +20216,7 @@ from the value of the ``-mcpu`` option that is passed to the assembler. .amdgpu_hsa_kernel (name) +++++++++++++++++++++++++ -This directives specifies that the symbol with given name is a kernel entry +This directive specifies that the symbol with given name is a kernel entry point (label) and the object should contain corresponding symbol of type STT_AMDGPU_HSA_KERNEL. diff --git a/llvm/docs/CMakeLists.txt b/llvm/docs/CMakeLists.txt index b4522e3..fc37c6d 100644 --- a/llvm/docs/CMakeLists.txt +++ b/llvm/docs/CMakeLists.txt @@ -136,17 +136,23 @@ if( NOT uses_ocaml LESS 0 AND LLVM_ENABLE_OCAMLDOC ) list(APPEND odoc_files -load ${odoc_file}) endforeach() - add_custom_target(ocaml_doc - COMMAND ${CMAKE_COMMAND} -E remove_directory ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html - COMMAND ${CMAKE_COMMAND} -E make_directory ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html - COMMAND ${OCAMLFIND} ocamldoc -d ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html - -sort -colorize-code -html ${odoc_files} - COMMAND ${CMAKE_COMMAND} -E copy ${CMAKE_CURRENT_SOURCE_DIR}/_ocamldoc/style.css - ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html) + set(OCAML_DOC_ADD_TO_ALL "") + if(LLVM_BUILD_DOCS) + set(OCAML_DOC_ADD_TO_ALL ALL) + endif() + + add_custom_target(ocaml_doc ${OCAML_DOC_ADD_TO_ALL} + COMMAND ${CMAKE_COMMAND} -E remove_directory ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html + COMMAND ${CMAKE_COMMAND} -E make_directory ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html + COMMAND ${OCAMLFIND} ocamldoc -d ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html + -sort -colorize-code -html ${odoc_files} + COMMAND ${CMAKE_COMMAND} -E copy 
${CMAKE_CURRENT_SOURCE_DIR}/_ocamldoc/style.css + ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html) add_dependencies(ocaml_doc ${doc_targets}) - if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY) + + if (NOT LLVM_INSTALL_TOOLCHAIN_ONLY AND LLVM_BUILD_DOCS) # ./ suffix is needed to copy the contents of html directory without # appending html/ into LLVM_INSTALL_OCAMLDOC_HTML_DIR. install(DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}/ocamldoc/html/. diff --git a/llvm/docs/CallGraphSection.md b/llvm/docs/CallGraphSection.md index 8b18727..84d6061 100644 --- a/llvm/docs/CallGraphSection.md +++ b/llvm/docs/CallGraphSection.md @@ -1,10 +1,10 @@ -# .callgraph Section Layout +# .llvm.callgraph Section Layout -The `.callgraph` section is used to store call graph information for each function. The section contains a series of records, with each record corresponding to a single function. +The `.llvm.callgraph` section is used to store call graph information for each function. The section contains a series of records, with each record corresponding to a single function. 
## Per Function Record Layout -Each record in the `.callgraph` section has the following binary layout: +Each record in the `.llvm.callgraph` section has the following binary layout: | Field | Type | Size (bits) | Description | | -------------------------------------- | ------------- | ----------- | ------------------------------------------------------------------------------------------------------- | diff --git a/llvm/docs/CodeOfConduct.rst b/llvm/docs/CodeOfConduct.rst index 645ae12..995d32b 100644 --- a/llvm/docs/CodeOfConduct.rst +++ b/llvm/docs/CodeOfConduct.rst @@ -171,6 +171,7 @@ The current committee members are: Transparency Reports ==================== +* `July 15, 2025 <https://discourse.llvm.org/t/llvm-code-of-conduct-transparency-report-july-15-2024-july-15-2025/88622>`_ * `July 15, 2024 <https://discourse.llvm.org/t/llvm-code-of-conduct-transparency-report-july-15-2023-july-15-2024/82687>`_ * `July 15, 2023 <https://llvm.org/coc-reports/2023-07-15-report.html>`_ * `July 15, 2022 <https://llvm.org/coc-reports/2022-07-15-report.html>`_ diff --git a/llvm/docs/CommandGuide/dsymutil.rst b/llvm/docs/CommandGuide/dsymutil.rst index 8764e1f..8e61e01 100644 --- a/llvm/docs/CommandGuide/dsymutil.rst +++ b/llvm/docs/CommandGuide/dsymutil.rst @@ -75,14 +75,6 @@ OPTIONS Make a static variable keep the enclosing function even if it would have been omitted otherwise. -.. option:: --minimize, -z - - When used when creating a dSYM file, this option will suppress the emission of - the .debug_inlines, .debug_pubnames, and .debug_pubtypes sections since - dsymutil currently has better equivalents: .apple_names and .apple_types. When - used in conjunction with ``--update`` option, this option will cause redundant - accelerator tables to be removed. - .. option:: --no-object-timestamp Don't check timestamp for object files. 
diff --git a/llvm/docs/DirectX/DXILResources.rst b/llvm/docs/DirectX/DXILResources.rst index 91dcd5c8..f253e02f 100644 --- a/llvm/docs/DirectX/DXILResources.rst +++ b/llvm/docs/DirectX/DXILResources.rst @@ -746,3 +746,92 @@ Examples: @llvm.dx.resource.load.cbufferrow.8( target("dx.CBuffer", target("dx.Layout", {i16}, 2, 0)) %buffer, i32 %index) + +Resource dimensions +------------------- + +*relevant types: Textures and Buffer* + +The `getDimensions`_ DXIL operation returns the dimensions of a texture or +buffer resource. It returns a `Dimensions`_ type, which is a struct +containing four ``i32`` values. The values in the struct represent the size +of each dimension of the resource, and when aplicable the number of array +elements or number of samples. The mapping is defined in the +`getDimensions`_ documentation. + +The LLVM IR representation of this operation has several forms +depending on the resource type and the specific ``getDimensions`` query. +The intrinsics return a scalar or anonymous struct with up to 4 `i32` +elements. The intrinsic names include suffixes to indicate the number of +elements in the return value. The suffix `.x` indicates a single `i32` +return value, `.xy` indicates a struct with two `i32` values, and `.xyz` +indicates a struct with three `i32` values. + +Intrinsics representing queries on multisampled texture resources include +`.ms.` in their name and their return value includes an additional `i32` for +the number of samples. + +Intrinsics with `mip_level` argument and `.levels.` in their name are used +for texture resources with multiple MIP levels. Their return +struct includes an additional `i32` for the number of levels the resource has. + +.. 
code-block:: llvm + + i32 @llvm.dx.resource.getdimensions.x( target("dx.*") handle ) + {i32, i32} @llvm.dx.resource.getdimensions.xy( target("dx.*") handle ) + {i32, i32, i32} @llvm.dx.resource.getdimensions.xyz( target("dx.*") handle ) + {i32, i32} @llvm.dx.resource.getdimensions.levels.x( target("dx.*") handle, i32 mip_level ) + {i32, i32, i32} @llvm.dx.resource.getdimensions.levels.xy( target("dx.*") handle, i32 mip_level ) + {i32, i32, i32, i32} @llvm.dx.resource.getdimensions.levels.xyz( target("dx.*") handle, i32 mip_level ) + {i32, i32, i32} @llvm.dx.resource.getdimensions.ms.xy( target("dx.*") handle ) + {i32, i32, i32, i32} @llvm.dx.resource.getdimensions.ms.xyz( target("dx.*") handle ) + +.. list-table:: ``@llvm.dx.resource.getdimensions.*`` + :header-rows: 1 + + * - Argument + - + - Type + - Description + * - Return value + - + - `i32`, `{i32, i32}`, `{i32, i32, i32}`, or `{i32, i32, i32, i32}` + - Width, height, and depth of the resource (based on the specific suffix), and a number of levels or samples where aplicable. + * - ``%handle`` + - 0 + - ``target(dx.*)`` + - Resource handle + * - ``%mip_level`` + - 1 + - ``i32`` + - MIP level for the requested dimensions. + +Examples: + +.. code-block:: llvm + + ; RWBuffer<float4> + %dim = call i32 @llvm.dx.resource.getdimensions.x(target("dx.TypedBuffer", <4 x float>, 1, 0, 0) %handle) + + ; Texture2D + %0 = call {i32, i32} @llvm.dx.resource.getdimensions.xy(target("dx.Texture", ...) %tex2d) + %tex2d_width = extractvalue {i32, i32} %0, 0 + %tex2d_height = extractvalue {i32, i32} %0, 1 + + ; Texture2DArray with levels + %1 = call {i32, i32, i32, i32} @llvm.dx.resource.getdimensions.levels.xyz( + target("dx.Texture", ...) 
%tex2darray, i32 1) + %tex2darray_width = extractvalue {i32, i32, i32, i32} %1, 0 + %tex2darray_height = extractvalue {i32, i32, i32, i32} %1, 1 + %tex2darray_elem_count = extractvalue {i32, i32, i32, i32} %1, 2 + %tex2darray_levels_count = extractvalue {i32, i32, i32, i32} %1, 3 + + ; Texture2DMS + %2 = call {i32, i32, i32} @llvm.dx.resource.getdimensions.ms.xy( + target("dx.Texture", ...) %tex2dms) + %tex2dms_width = extractvalue {i32, i32, i32} %2, 0 + %tex2dms_height = extractvalue {i32, i32, i32} %2, 1 + %tex2dms_samples_count = extractvalue {i32, i32, i32} %2, 2 + +.. _Dimensions: https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst#resource-operation-return-types +.. _getDimensions: https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst#getdimensions diff --git a/llvm/docs/GettingStartedVS.rst b/llvm/docs/GettingStartedVS.rst index bc5746d..e65fd8f 100644 --- a/llvm/docs/GettingStartedVS.rst +++ b/llvm/docs/GettingStartedVS.rst @@ -126,6 +126,15 @@ These instructions were tested with Visual Studio 2019 and Python 3.9.6: cmake -S llvm\llvm -B build -DLLVM_ENABLE_PROJECTS=clang -DLLVM_TARGETS_TO_BUILD=X86 -Thost=x64 exit + .. note:: + By default, the Visual Studio project files generated by CMake use the + 32-bit toolset. If you are developing on a 64-bit version of Windows and + want to use the 64-bit toolset, pass the ``-Thost=x64`` flag when + generating the Visual Studio solution. This requires CMake 3.8.0 or later. + + For Windows on Arm the equivalent is ``-Thost=ARM64``, but this is the default + for those hosts, so you do not have to use this option. + ``LLVM_ENABLE_PROJECTS`` specifies any additional LLVM projects you want to build while ``LLVM_TARGETS_TO_BUILD`` selects the compiler targets. If ``LLVM_TARGETS_TO_BUILD`` is omitted by default all targets are built @@ -149,10 +158,6 @@ These instructions were tested with Visual Studio 2019 and Python 3.9.6: * CMake generates project files for all build types.
To select a specific build type, use the Configuration manager from the VS IDE or the ``/property:Configuration`` command-line option when using MSBuild. - * By default, the Visual Studio project files generated by CMake use the - 32-bit toolset. If you are developing on a 64-bit version of Windows and - want to use the 64-bit toolset, pass the ``-Thost=x64`` flag when - generating the Visual Studio solution. This requires CMake 3.8.0 or later. 13. Start Visual Studio and select configuration: diff --git a/llvm/docs/HowToBuildOnARM.rst b/llvm/docs/HowToBuildOnARM.rst index 9eb6b5a..30e3744 100644 --- a/llvm/docs/HowToBuildOnARM.rst +++ b/llvm/docs/HowToBuildOnARM.rst @@ -23,10 +23,10 @@ on the ARMv6 and ARMv7 architectures and may be inapplicable to older chips. choices when using CMake. Autoconf usage is deprecated as of 3.8. Building LLVM/Clang in ``Release`` mode is preferred since it consumes - a lot less memory. Otherwise, the building process will very likely + a lot less memory. Otherwise, the build process will very likely fail due to insufficient memory. It's also a lot quicker to only build the relevant back-ends (ARM and AArch64), since it's very unlikely that - you'll use an ARM board to cross-compile to other arches. If you're + you'll use an ARM board to cross-compile to other architectures. If you're running Compiler-RT tests, also include the x86 back-end, or some tests will fail. @@ -48,15 +48,15 @@ on the ARMv6 and ARMv7 architectures and may be inapplicable to older chips. ``make -jN check-all`` or ``ninja check-all`` will run all compiler tests. For running the test suite, please refer to :doc:`TestingGuide`. -#. If you are building LLVM/Clang on an ARM board with 1G of memory or less, - please use ``gold`` rather then GNU ``ld``. In any case it is probably a good +#. If you are building LLVM/Clang on an ARM board with 1 GB of memory or less, + please use ``gold`` rather than GNU ``ld``. 
In any case, it is probably a good idea to set up a swap partition, too. .. code-block:: bash $ sudo ln -sf /usr/bin/ld /usr/bin/ld.gold -#. ARM development boards can be unstable and you may experience that cores +#. ARM development boards can be unstable, and you may experience that cores are disappearing, caches being flushed on every big.LITTLE switch, and other similar issues. To help ease the effect of this, set the Linux scheduler to "performance" on **all** cores using this little script: @@ -73,12 +73,12 @@ on the ARMv6 and ARMv7 architectures and may be inapplicable to older chips. problems. #. Running the build on SD cards is ok, but they are more prone to failures - than good quality USB sticks, and those are more prone to failures than - external hard-drives (those are also a lot faster). So, at least, you + than good-quality USB sticks, and those are more prone to failures than + external hard drives (those are also a lot faster). So, at least, you should consider to buy a fast USB stick. On systems with a fast eMMC, that's a good option too. #. Make sure you have a decent power supply (dozens of dollars worth) that can - provide *at least* 4 amperes, this is especially important if you use USB - devices with your board. Externally powered USB/SATA harddrives are even + provide *at least* 4 amperes. This is especially important if you use USB + devices with your board. Externally powered USB/SATA hard drives are even better than having a good power supply. diff --git a/llvm/docs/HowToReleaseLLVM.rst b/llvm/docs/HowToReleaseLLVM.rst index 1795d3a..171bf88 100644 --- a/llvm/docs/HowToReleaseLLVM.rst +++ b/llvm/docs/HowToReleaseLLVM.rst @@ -18,11 +18,11 @@ create the binary packages, please refer to the :doc:`ReleaseProcess` instead. Release Timeline ================ -LLVM is released on a time based schedule --- with major releases roughly +LLVM is released on a time-based schedule --- with major releases roughly every 6 months. 
In between major releases there may be dot releases. The release manager will determine if and when to make a dot release based on feedback from the community. Typically, dot releases should be made if -there are large number of bug-fixes in the stable branch or a critical bug +there are a large number of bug fixes in the stable branch or a critical bug has been discovered that affects a large number of users. Unless otherwise stated, dot releases will follow the same procedure as @@ -73,7 +73,7 @@ Release Process Summary * Generate and send out the second release candidate sources. Only *critical* bugs found during this testing phase will be fixed. Any bugs introduced by - merged patches will be fixed. If so a third round of testing is needed. + merged patches will be fixed. If so, a third round of testing is needed. * The release notes are updated. @@ -107,15 +107,15 @@ Create Release Branch and Update LLVM Version Branch the Git trunk using the following procedure: #. Remind developers that the release branching is imminent and to refrain from - committing patches that might break the build. E.g., new features, large + committing patches that might break the build, e.g., new features, large patches for works in progress, an overhaul of the type system, an exciting new TableGen feature, etc. #. Verify that the current git trunk is in decent shape by examining nightly tester and buildbot results. -#. Bump the version in trunk to N.0.0git with the script in - ``llvm/utils/release/bump-version.py``, and tag the commit with llvmorg-N-init. +#. Bump the version in trunk to ``N.0.0git`` with the script in + ``llvm/utils/release/bump-version.py``, and tag the commit with ``llvmorg-N-init``. If ``X`` is the version to be released, then ``N`` is ``X + 1``. :: $ git tag -sa llvmorg-N-init @@ -124,14 +124,14 @@ Branch the Git trunk using the following procedure: ``llvm/utils/release/clear-release-notes.py``. #. 
Create the release branch from the last known good revision from before the - version bump. The branch's name is release/X.x where ``X`` is the major version + version bump. The branch's name is ``release/X.x`` where ``X`` is the major version number and ``x`` is just the letter ``x``. #. On the newly-created release branch, immediately bump the version - to X.1.0git (where ``X`` is the major version of the branch.) + to ``X.1.0git`` (where ``X`` is the major version of the branch.) -#. All tags and branches need to be created in both the llvm/llvm-project and - llvm/llvm-test-suite repos. +#. All tags and branches need to be created in both the ``llvm/llvm-project`` and + ``llvm/llvm-test-suite`` repos. Tagging the LLVM Release Candidates ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -157,7 +157,7 @@ the release page. $ for f in *.xz; do gh attestation verify --owner llvm $f && gpg -b $f; done Tarballs, release binaries, or any other release artifacts must be uploaded to -GitHub. This can be done using the github-upload-release.py script in utils/release. +GitHub. This can be done using the ``github-upload-release.py`` script in ``utils/release``. :: @@ -170,10 +170,10 @@ Build The Binary Distribution Creating the binary distribution requires following the instructions :doc:`here <ReleaseProcess>`. -That process will perform both Release+Asserts and Release builds but only -pack the Release build for upload. You should use the Release+Asserts sysroot, +That process performs both Release+Asserts and Release builds but only packs +the Release build for upload. You should use the Release+Asserts sysroot, normally under ``final/Phase3/Release+Asserts/llvmCore-3.8.1-RCn.install/``, -for test-suite and run-time benchmarks, to make sure nothing serious has +for test-suite and run-time benchmarks, to ensure nothing serious has passed through the net. For compile-time benchmarks, use the Release version. 
The minimum required version of the tools you'll need are :doc:`here <GettingStarted>` @@ -181,14 +181,14 @@ The minimum required version of the tools you'll need are :doc:`here <GettingSta Release Qualification Criteria ------------------------------ -There are no official release qualification criteria. It is up to the -the release manager to determine when a release is ready. The release manager +There are no official release qualification criteria. +The release manager determines when a release is ready. The release manager should pay attention to the results of community testing, the number of outstanding -bugs, and then number of regressions when determining whether or not to make a +bugs, and the number of regressions when determining whether or not to make a release. The community values time based releases, so releases should not be delayed for -too long unless there are critical issues remaining. In most cases, the only +too long unless critical issues remain. In most cases, the only kind of bugs that are critical enough to block a release would be a major regression from a previous release. @@ -199,33 +199,33 @@ A few developers in the community have dedicated time to validate the release candidates and volunteered to be the official release testers for each architecture. -These will be the ones testing, generating and uploading the official binaries +These will be the ones testing, generating, and uploading the official binaries to the server, and will be the minimum tests *necessary* for the release to proceed. This will obviously not cover all OSs and distributions, so additional community -validation is important. However, if community input is not reached before the -release is out, all bugs reported will have to go on the next stable release. +validation is important. However, if community input is not received before the +release, all reported bugs will be deferred to the next stable release. 
The official release managers are: * Even releases: Tom Stellard (tstellar@redhat.com) * Odd releases: Tobias Hieta (tobias@hieta.se) -The official release testers are volunteered from the community and have +The official release testers are volunteers from the community who have consistently validated and released binaries for their targets/OSs. To contact them, you should post on the `Discourse forums (Project Infrastructure - Release Testers). <https://discourse.llvm.org/c/infrastructure/release-testers/66>`_ -The official testers list is in the file `RELEASE_TESTERS.TXT +The official testers list is in the file ``RELEASE_TESTERS.TXT`` <https://github.com/llvm/llvm-project/blob/main/llvm/RELEASE_TESTERS.TXT>`_, in the LLVM repository. Community Testing ----------------- -Once all testing has been completed and appropriate bugs filed, the release -candidate tarballs are put on the website and the LLVM community is notified. +Once all testing is complete and appropriate bugs are filed, the release +candidate tarballs are put on the website, and the LLVM community is notified. We ask that all LLVM developers test the release in any the following ways: @@ -251,7 +251,7 @@ We ask that all LLVM developers test the release in any the following ways: architecture. We also ask that the OS distribution release managers test their packages with -the first candidate of every release, and report any *new* errors in GitHub. +the first candidate of every release and report any *new* errors in GitHub. If the bug can be reproduced with an unpatched upstream version of the release candidate (as opposed to the distribution's own build), the priority should be release blocker. @@ -268,10 +268,10 @@ next stage. Reporting Regressions --------------------- -Every regression that is found during the tests (as per the criteria above), +Every regression found during the tests (as per the criteria above) should be filled in a bug in GitHub and added to the release milestone. 
-If a bug can't be reproduced, or stops being a blocker, it should be removed +If a bug can't be reproduced or stops being a blocker, it should be removed from the Milestone. Debugging can continue, but on trunk. Backport Requests @@ -299,15 +299,15 @@ This section describes how to triage bug reports: to see the list of bugs that are being considered for the release. #. Review each bug and first check if it has been fixed in main. If it has, update - its status to "Needs Pull Request", and create a pull request for the fix - using the /cherry-pick or /branch comments if this has not been done already. + its status to "Needs Pull Request" and create a pull request for the fix + using the ``/cherry-pick`` or ``/branch`` comments if this has not been done already. #. If a bug has been fixed and has a pull request created for backporting it, then update its status to "Needs Review" and notify a knowledgeable reviewer. Usually you will want to notify the person who approved the patch, but you may use your best judgement on who a good reviewer would be. Once you have identified the reviewer(s), assign the issue to them and - mention them (i.e @username) in a comment and ask them if the patch is safe + mention them (i.e., ``@username``) in a comment and ask them if the patch is safe to backport. You should also review the bug yourself to ensure that it meets the requirements for committing to the release branch. @@ -323,11 +323,11 @@ Release Patch Rules Below are the rules regarding patching the release branch: #. Patches applied to the release branch may only be applied by the release - manager, the official release testers or the maintainers with approval from + manager, the official release testers, or the maintainers with approval from the release manager. #. Release managers are encouraged, but not required, to get approval from a - maintainer before approving patches. If there are no reachable maintainers + maintainer before approving patches. 
If there are no reachable maintainers, then release managers can ask approval from patch reviewers or other developers active in that area. @@ -336,7 +336,7 @@ Below are the rules regarding patching the release branch: was created. As with all phases, release managers and maintainers can reject patches that are deemed too invasive. -#. *Before RC2/RC3* Patches should be limited to bug fixes or backend specific +#. *Before RC2/RC3* Patches should be limited to bug fixes or backend-specific improvements that are determined to be very safe. #. *Before Final Major Release* Patches should be limited to critical @@ -349,7 +349,7 @@ Below are the rules regarding patching the release branch: Release Final Tasks ------------------- -The final stages of the release process involves tagging the "final" release +The final stages of the release process involve tagging the "final" release branch, updating documentation that refers to the release, and updating the demo page. @@ -394,11 +394,11 @@ is what to do: #. Update the ``releases/index.html`` with the new release and link to release documentation. -#. After you push the changes to the www-releases repo, someone with admin - access must login to prereleases-origin.llvm.org and manually pull the new - changes into /data/www-releases/. This is where the website is served from. +#. After you push the changes to the ``www-releases`` repo, someone with admin + access must log in to ``prereleases-origin.llvm.org`` and manually pull the new + changes into ``/data/www-releases/``. This is where the website is served from. -#. Finally checkout the llvm-www repo and update the main page +#. Finally, check out the ``llvm-www`` repo and update the main page (``index.html`` and sidebar) to point to the new release and release announcement. @@ -414,5 +414,5 @@ using this command and add it to the post. 
$ git log --format="- %aN: [%s (%h)](https://github.com/llvm/llvm-project/commit/%H)" llvmorg-X.1.N-1..llvmorg-X.1.N -Once the release has been announced add a link to the announcement on the llvm -homepage (from the llvm-www repo) in the "Release Emails" section. +Once the release has been announced, add a link to the announcement on the llvm +homepage (from the ``llvm-www`` repo) in the "Release Emails" section. diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst index 8b6c25c..5b4b53d 100644 --- a/llvm/docs/LangRef.rst +++ b/llvm/docs/LangRef.rst @@ -7517,12 +7517,12 @@ sections that the user does not want removed after linking. '``unpredictable``' Metadata ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -``unpredictable`` metadata may be attached to any branch or switch -instruction. It can be used to express the unpredictability of control -flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter -optimizations related to compare and branch instructions. The metadata -is treated as a boolean value; if it exists, it signals that the branch -or switch that it is attached to is completely unpredictable. +``unpredictable`` metadata may be attached to any branch, select, or switch +instruction. It can be used to express the unpredictability of control flow. +Similar to the ``llvm.expect`` intrinsic, it may be used to alter optimizations +related to compare and branch instructions. The metadata is treated as a +boolean value; if it exists, it signals that the branch, select, or switch that +it is attached to is completely unpredictable. .. _md_dereferenceable: @@ -21062,33 +21062,36 @@ integer element type. Syntax: """"""" -This is an overloaded intrinsic. +This is an overloaded intrinsic. You can use ``llvm.matrix.column.major.load`` +to load any vector type with a stride of any bitwidth up to 64. 
:: - declare vectorty @llvm.matrix.column.major.load.*( + declare <4 x i32> @llvm.matrix.column.major.load.v4i32.i64( ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>) + declare <9 x double> @llvm.matrix.column.major.load.v9f64.i32( + ptrty %Ptr, i32 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>) Overview: """"""""" The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>`` matrix using a stride of ``%Stride`` to compute the start address of the -different columns. The offset is computed using ``%Stride``'s bitwidth. This -allows for convenient loading of sub matrixes. If ``<IsVolatile>`` is true, the -intrinsic is considered a :ref:`volatile memory access <volatile>`. The result -matrix is returned in the result vector. If the ``%Ptr`` argument is known to -be aligned to some boundary, this can be specified as an attribute on the -argument. +different columns. This allows for convenient loading of sub matrixes. +Independent of ``%Stride``'s bitwidth, the offset is computed using the target +data layout's pointer index type. If ``<IsVolatile>`` is true, the intrinsic is +considered a :ref:`volatile memory access <volatile>`. The result matrix is +returned in the result vector. If the ``%Ptr`` argument is known to be aligned +to some boundary, this can be specified as an attribute on the argument. Arguments: """""""""" The first argument ``%Ptr`` is a pointer type to the returned vector type, and corresponds to the start address to load from. The second argument ``%Stride`` -is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used -to compute the column memory addresses. I.e., for a column ``C``, its start -memory addresses is calculated with ``%Ptr + C * %Stride``. The third Argument +is a positive integer for which ``%Stride >= <Rows>``. ``%Stride`` is used to +compute the column memory addresses. I.e., for a column ``C``, its start memory +addresses is calculated with ``%Ptr + C * %Stride``.
The third Argument ``<IsVolatile>`` is a boolean value. The fourth and fifth arguments, ``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns, respectively, and must be positive, constant integers. The returned vector must @@ -21103,20 +21106,26 @@ The :ref:`align <attr_align>` parameter attribute can be provided for the Syntax: """"""" +This is an overloaded intrinsic. You can use ``llvm.matrix.column.major.store`` to store +any vector type with a stride of any bitwidth up to 64. :: - declare void @llvm.matrix.column.major.store.*( - vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>) + declare void @llvm.matrix.column.major.store.v4i32.i64( + <4 x i32> %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, + i32 <Cols>) + declare void @llvm.matrix.column.major.store.v9f64.i32( + <9 x double> %In, ptrty %Ptr, i32 %Stride, i1 <IsVolatile>, i32 + <Rows>, i32 <Cols>) Overview: """"""""" The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x <Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between -columns. The offset is computed using ``%Stride``'s bitwidth. If -``<IsVolatile>`` is true, the intrinsic is considered a -:ref:`volatile memory access <volatile>`. +columns. Independent of ``%Stride``'s bitwidth, the offset is computed using +the target data layout's pointer index type. If ``<IsVolatile>`` is true, the +intrinsic is considered a :ref:`volatile memory access <volatile>`. If the ``%Ptr`` argument is known to be aligned to some boundary, this can be specified as an attribute on the argument. @@ -21127,7 +21136,7 @@ Arguments: The first argument ``%In`` is a vector that corresponds to a ``<Rows> x <Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a pointer to the vector type of ``%In``, and is the start address of the matrix -in memory.
The third argument ``%Stride`` is a positive integer for which ``%Stride >= <Rows>``. ``%Stride`` is used to compute the column memory addresses. I.e., for a column ``C``, its start memory addresses is calculated with ``%Ptr + C * %Stride``. The fourth argument ``<IsVolatile>`` is a boolean diff --git a/llvm/docs/MergeFunctions.rst b/llvm/docs/MergeFunctions.rst index c27f603..d43b9c3 100644 --- a/llvm/docs/MergeFunctions.rst +++ b/llvm/docs/MergeFunctions.rst @@ -9,13 +9,13 @@ Introduction ============ Sometimes code contains equal functions, or functions that do exactly the same thing even though they are non-equal on the IR level (e.g.: multiplication on 2 -and 'shl 1'). It could happen due to several reasons: mainly, the usage of +and 'shl 1'). This can happen for several reasons: mainly, the usage of templates and automatic code generators. Though, sometimes the user itself could write the same thing twice :-) The main purpose of this pass is to recognize such functions and merge them. -This document is the extension to pass comments and describes the pass logic. It +This document is an extension to pass comments and describes the pass logic. It describes the algorithm used to compare functions and explains how we could combine equal functions correctly to keep the module valid. @@ -54,7 +54,7 @@ As a good starting point, the Kaleidoscope tutorial can be used: :doc:`tutorial/index` -It's especially important to understand chapter 3 of tutorial: +It's especially important to understand Chapter 3 of the tutorial: :doc:`tutorial/LangImpl03` @@ -314,7 +314,7 @@ list is immaterial. Our walk starts at the entry block for both functions, then takes each block from each terminator in order. 
As an artifact, this also means that unreachable blocks are ignored.” -So, using this walk we get BBs from *left* and *right* in the same order, and +So, using this walk, we get BBs from *left* and *right* in the same order, and compare them by “``FunctionComparator::compare(const BasicBlock*, const BasicBlock*)``” method. @@ -325,17 +325,17 @@ FunctionComparator::cmpType --------------------------- Consider how type comparison works. -1. Coerce pointer to integer. If left type is a pointer, try to coerce it to the +1. Coerce pointer to integer. If the left type is a pointer, try to coerce it to the integer type. It could be done if its address space is 0, or if address spaces are ignored at all. Do the same thing for the right type. -2. If left and right types are equal, return 0. Otherwise we need to give +2. If the left and right types are equal, return 0. Otherwise, we need to give preference to one of them. So proceed to the next step. -3. If types are of different kind (different type IDs). Return result of type +3. If the types are of different kind (different type IDs). Return result of type IDs comparison, treating them as numbers (use ``cmpNumbers`` operation). -4. If types are vectors or integers, return result of their pointers comparison, +4. If the types are vectors or integers, return result of their pointers comparison, comparing them as numbers. 5. Check whether type ID belongs to the next group (call it equivalent-group): @@ -391,7 +391,7 @@ equal to the corresponding part of *right* place, and (!) both parts use So, now our conclusion depends on *Value* instances comparison. -The main purpose of this method is to determine relation between such values. +The main purpose of this method is to determine the relation between such values. What can we expect from equal functions? 
At the same place, in functions "*FL*" and "*FR*" we expect to see *equal* values, or values *defined* at @@ -453,17 +453,17 @@ maps (one for the left side, another one for the right side): ``map<Value, int> sn_mapL, sn_mapR;`` -The key of the map is the *Value* itself, the *value* – is its order (call it +The key of the map is the *Value* itself; the *value* – is its order (call it *serial number*). To add value *V* we need to perform the next procedure: ``sn_map.insert(std::make_pair(V, sn_map.size()));`` -For the first *Value*, map will return *0*, for the second *Value* map will +For the first *Value*, the map will return *0*, for the second *Value*, the map will return *1*, and so on. -We can then check whether left and right values met at the same time with +We can then check whether the left and right values met at the same time with a simple comparison: ``cmpNumbers(sn_mapL[Left], sn_mapR[Right]);`` @@ -525,7 +525,7 @@ and finish comparison procedure. cmpConstants ------------ -Performs constants comparison as follows: +Performs a constant comparison as follows: 1. Compare constant types using ``cmpType`` method. If the result is -1 or 1, goto step 2, otherwise proceed to step 3. @@ -655,10 +655,10 @@ O(N*N) to O(log(N)). Merging process, mergeTwoFunctions ================================== Once *MergeFunctions* detects that current function (*G*) is equal to one that -were analyzed before (function *F*) it calls ``mergeTwoFunctions(Function*, +was analyzed before (function *F*) it calls ``mergeTwoFunctions(Function*, Function*)``. -Operation affects ``FnTree`` contents with next way: *F* will stay in +Operation affects ``FnTree`` contents in the following way: *F* will stay in ``FnTree``. *G* being equal to *F* will not be added to ``FnTree``. Calls of *G* would be replaced with something else. It changes bodies of callers. 
So, functions that calls *G* would be put into ``Deferred`` set and removed from @@ -692,8 +692,8 @@ ok: we can use alias to *F* instead of *G* or change call instructions itself. HasGlobalAliases, removeUsers ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ First, consider the case when we have global aliases of one function name to -another. Our purpose is make both of them with aliases to the third strong -function. Though if we keep *F* alive and without major changes we can leave it +another. Our purpose is to make both of them with aliases to the third strong +function. However, if we keep *F* alive and without major changes, we can leave it in ``FnTree``. Try to combine these two goals. Do a stub replacement of *F* itself with an alias to *F*. @@ -701,10 +701,10 @@ Do a stub replacement of *F* itself with an alias to *F*. 1. Create stub function *H*, with the same name and attributes like function *F*. It takes maximum alignment of *F* and *G*. -2. Replace all uses of function *F* with uses of function *H*. It is the two -steps procedure instead. First of all, we must take into account, all functions -from whom *F* is called would be changed: since we change the call argument -(from *F* to *H*). If so we must to review these caller functions again after +2. Replace all uses of function *F* with uses of function *H*. It is a +two-step procedure instead. First of all, we must take into account that all functions +that call *F* would be changed because we change the call argument +(from *F* to *H*). If so, we must review these caller functions again after this procedure. We remove callers from ``FnTree``, method with name ``removeUsers(F)`` does that (don't confuse with ``replaceAllUsesWith``): @@ -735,7 +735,7 @@ If “F” could not be overridden, fix it! """"""""""""""""""""""""""""""""""""""" We call ``writeThunkOrAlias(Function *F, Function *G)``. Here we try to replace -*G* with alias to *F* first. The next conditions are essential: +*G* with an alias to *F* first. 
The next conditions are essential: * target should support global aliases, * the address itself of *G* should be not significant, not named and not @@ -775,7 +775,7 @@ with bitcast(F). Deletes G.” In general it does the same as usual when we want to replace callee, except the first point: -1. We generate tail call wrapper around *F*, but with interface that allows use +1. We generate tail call wrapper around *F*, but with an interface that allows using it instead of *G*. 2. “As-usual”: ``removeUsers`` and ``replaceAllUsesWith`` then. diff --git a/llvm/docs/QualGroup.rst b/llvm/docs/QualGroup.rst index b45f569..5c05e4e 100644 --- a/llvm/docs/QualGroup.rst +++ b/llvm/docs/QualGroup.rst @@ -75,6 +75,16 @@ They meet the criteria for inclusion below. Knowing their handles help us keep t - capitan-davide - capitan_davide - capitan-davide + * - Jorge Pinto Sousa + - Critical Techworks + - sousajo-cc + - sousajo-cc + - sousajo-cc + * - José Rui Simões + - Critical Software + - jr-simoes + - jr_simoes + - iznogoud-zz * - Oscar Slotosch - Validas - slotosch @@ -100,6 +110,11 @@ They meet the criteria for inclusion below. Knowing their handles help us keep t - YoungJunLee - YoungJunLee - IamYJLee + * - Zaky Hermawan + - No Affiliation + - ZakyHermawan + - quarkz99 + - zakyHermawan Organizations are limited to three representatives within the group to maintain diversity. diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md index 79d93d0..9cdd983 100644 --- a/llvm/docs/ReleaseNotes.md +++ b/llvm/docs/ReleaseNotes.md @@ -134,9 +134,14 @@ Changes to the WebAssembly Backend Changes to the Windows Target ----------------------------- +* `-fpseudo-probe-for-profiling` is now supported for COFF. + Changes to the X86 Backend -------------------------- +* `-mcpu=wildcatlake` is now supported. +* `-mcpu=novalake` is now supported. 
+ Changes to the OCaml bindings ----------------------------- @@ -147,6 +152,7 @@ Changes to the C API -------------------- * Add `LLVMGetOrInsertFunction` to get or insert a function, replacing the combination of `LLVMGetNamedFunction` and `LLVMAddFunction`. +* Allow `LLVMGetVolatile` to work with any kind of Instruction. Changes to the CodeGen infrastructure ------------------------------------- @@ -160,6 +166,8 @@ Changes to the Debug Info Changes to the LLVM tools --------------------------------- +* `llvm-profgen` now supports decoding pseudo probe for COFF binaries. + * `llvm-readelf` now dumps all hex format values in lower-case mode. * Some code paths for supporting Python 2.7 in `llvm-lit` have been removed. * Support for `%T` in lit has been removed. @@ -169,6 +177,12 @@ Changes to LLDB * LLDB can now set breakpoints, show backtraces, and display variables when debugging Wasm with supported runtimes (WAMR and V8). +* LLDB no longer stops processes by default when receiving SIGWINCH signals + (window resize events) on Linux. This is the default on other Unix platforms. + You can re-enable it using `process handle --notify=true --stop=true SIGWINCH`. +* The `show-progress` setting, which became a NOOP with the introduction of the + statusline, now defaults to off and controls using OSC escape codes to show a + native progress bar in supporting terminals like Ghostty and ConEmu. Changes to BOLT --------------------------------- diff --git a/llvm/docs/SPIRVUsage.rst b/llvm/docs/SPIRVUsage.rst index d2d6646..85eeabf 100644 --- a/llvm/docs/SPIRVUsage.rst +++ b/llvm/docs/SPIRVUsage.rst @@ -235,6 +235,8 @@ Below is a list of supported SPIR-V extensions, sorted alphabetically by their e - Adds execution modes and decorations to control floating-point computations in both kernels and shaders. It can be used on whole modules and individual instructions. 
* - ``SPV_INTEL_predicated_io`` - Adds predicated load and store instructions that conditionally read from or write to memory based on a boolean predicate. + * - ``SPV_KHR_maximal_reconvergence`` + - Adds execution mode and capability to enable maximal reconvergence. SPIR-V representation in LLVM IR ================================ diff --git a/llvm/docs/TableGen/BackEnds.rst b/llvm/docs/TableGen/BackEnds.rst index 14232bc..7f57137 100644 --- a/llvm/docs/TableGen/BackEnds.rst +++ b/llvm/docs/TableGen/BackEnds.rst @@ -48,7 +48,7 @@ the TableGen files, the back-ends and their users. For instance, a global contract is that each back-end produces macro-guarded sections. Based on whether the file is included by a header or a source file, or even in which context of each file the include is being used, you have -todefine a macro just before including it, to get the right output: +to define a macro just before including it, to get the right output: .. code-block:: c++ @@ -80,8 +80,8 @@ in the TableGen files. CodeEmitter ----------- -**Purpose**: CodeEmitterGen uses the descriptions of instructions and their fields to -construct an automated code emitter: a function that, given a MachineInstr, +**Purpose**: ``CodeEmitterGen`` uses the descriptions of instructions and their fields to +construct an automated code emitter: a function that, given a ``MachineInstr``, returns the (currently, 32-bit unsigned) value of the instruction. **Output**: C++ code, implementing the target's CodeEmitter @@ -130,7 +130,7 @@ AsmMatcher ---------- **Purpose**: Emits a target specifier matcher for -converting parsed assembly operands in the MCInst structures. It also +converting parsed assembly operands in the ``MCInst`` structures. It also emits a matcher for custom operand parsing. Extensive documentation is written on the ``AsmMatcherEmitter.cpp`` file. @@ -167,7 +167,7 @@ CallingConv conventions supported by this target. 
**Output**: Implement static functions to deal with calling conventions -chained by matching styles, returning false on no match. +chained by matching styles, returning ``false`` on no match. **Usage**: Used in ISelLowering and FastIsel as function pointers to implementation returned by a CC selection function. @@ -200,7 +200,7 @@ FastISel **Purpose**: This tablegen backend emits code for use by the "fast" instruction selection algorithm. See the comments at the top of -lib/CodeGen/SelectionDAG/FastISel.cpp for background. This file +``lib/CodeGen/SelectionDAG/FastISel.cpp`` for background. This file scans through the target's tablegen instruction-info files and extracts instructions with obvious-looking patterns, and it emits code to look up these instructions by type and operator. @@ -270,23 +270,23 @@ This file is included as part of ``Attr.h``. ClangAttrParserStringSwitches ----------------------------- -**Purpose**: Creates AttrParserStringSwitches.inc, which contains -StringSwitch::Case statements for parser-related string switches. Each switch +**Purpose**: Creates ``AttrParserStringSwitches.inc``, which contains +``StringSwitch::Case`` statements for parser-related string switches. Each switch is given its own macro (such as ``CLANG_ATTR_ARG_CONTEXT_LIST``, or ``CLANG_ATTR_IDENTIFIER_ARG_LIST``), which is expected to be defined before -including AttrParserStringSwitches.inc, and undefined after. +including ``AttrParserStringSwitches.inc``, and undefined after. ClangAttrImpl ------------- -**Purpose**: Creates AttrImpl.inc, which contains semantic attribute class +**Purpose**: Creates ``AttrImpl.inc``, which contains semantic attribute class definitions for any attribute in ``Attr.td`` that has not set ``ASTNode = 0``. This file is included as part of ``AttrImpl.cpp``. 
ClangAttrList ------------- -**Purpose**: Creates AttrList.inc, which is used when a list of semantic +**Purpose**: Creates ``AttrList.inc``, which is used when a list of semantic attribute identifiers is required. For instance, ``AttrKinds.h`` includes this file to generate the list of ``attr::Kind`` enumeration values. This list is separated out into multiple categories: attributes, inheritable attributes, and @@ -297,25 +297,25 @@ functionality required for ``dyn_cast`` and similar APIs. ClangAttrPCHRead ---------------- -**Purpose**: Creates AttrPCHRead.inc, which is used to deserialize attributes +**Purpose**: Creates ``AttrPCHRead.inc``, which is used to deserialize attributes in the ``ASTReader::ReadAttributes`` function. ClangAttrPCHWrite ----------------- -**Purpose**: Creates AttrPCHWrite.inc, which is used to serialize attributes in +**Purpose**: Creates ``AttrPCHWrite.inc``, which is used to serialize attributes in the ``ASTWriter::WriteAttributes`` function. ClangAttrSpellings --------------------- -**Purpose**: Creates AttrSpellings.inc, which is used to implement the +**Purpose**: Creates ``AttrSpellings.inc``, which is used to implement the ``__has_attribute`` feature test macro. ClangAttrSpellingListIndex -------------------------- -**Purpose**: Creates AttrSpellingListIndex.inc, which is used to map parsed +**Purpose**: Creates ``AttrSpellingListIndex.inc``, which is used to map parsed attribute spellings (including which syntax or scope was used) to an attribute spelling list index. These spelling list index values are internal implementation details exposed via @@ -324,26 +324,26 @@ implementation details exposed via ClangAttrVisitor ------------------- -**Purpose**: Creates AttrVisitor.inc, which is used when implementing +**Purpose**: Creates ``AttrVisitor.inc``, which is used when implementing recursive AST visitors. 
ClangAttrTemplateInstantiate ---------------------------- -**Purpose**: Creates AttrTemplateInstantiate.inc, which implements the +**Purpose**: Creates ``AttrTemplateInstantiate.inc``, which implements the ``instantiateTemplateAttribute`` function, used when instantiating a template that requires an attribute to be cloned. ClangAttrParsedAttrList ----------------------- -**Purpose**: Creates AttrParsedAttrList.inc, which is used to generate the +**Purpose**: Creates ``AttrParsedAttrList.inc``, which is used to generate the ``AttributeList::Kind`` parsed attribute enumeration. ClangAttrParsedAttrImpl ----------------------- -**Purpose**: Creates AttrParsedAttrImpl.inc, which is used by +**Purpose**: Creates ``AttrParsedAttrImpl.inc``, which is used by ``AttributeList.cpp`` to implement several functions on the ``AttributeList`` class. This functionality is implemented via the ``AttrInfoMap ParsedAttrInfo`` array, which contains one element per parsed attribute object. @@ -351,14 +351,14 @@ array, which contains one element per parsed attribute object. ClangAttrParsedAttrKinds ------------------------ -**Purpose**: Creates AttrParsedAttrKinds.inc, which is used to implement the +**Purpose**: Creates ``AttrParsedAttrKinds.inc``, which is used to implement the ``AttributeList::getKind`` function, mapping a string (and syntax) to a parsed attribute ``AttributeList::Kind`` enumeration. ClangAttrDump ------------- -**Purpose**: Creates AttrDump.inc, which dumps information about an attribute. +**Purpose**: Creates ``AttrDump.inc``, which dumps information about an attribute. It is used to implement ``ASTDumper::dumpAttr``. ClangDiagsDefs @@ -424,7 +424,7 @@ Generate list of commands that are used in documentation comments. ArmNeon ------- -Generate arm_neon.h for clang. +Generate ``arm_neon.h`` for clang. ArmNeonSema ----------- @@ -473,7 +473,7 @@ to a built-in backend. **Output**: -The root of the output file is a JSON object (i.e. 
dictionary), +The root of the output file is a JSON object (i.e., dictionary), containing the following fixed keys: * ``!tablegen_json_version``: a numeric version field that will @@ -520,7 +520,7 @@ conventions described below. Some TableGen data types are translated directly into the corresponding JSON type: -* A completely undefined value (e.g. for a variable declared without +* A completely undefined value (e.g., for a variable declared without initializer in some superclass of this record, and never initialized by the record itself or any other superclass) is emitted as the JSON ``null`` value. @@ -964,7 +964,7 @@ Here is the modified lookup function. The new lookup function will return an iterator range with first pointer to the first result and the last pointer to the last matching result from the table. -However, please note that the support for emitting modified definition exists +However, please note that the support for emitting a modified definition exists for ``PrimaryKeyName`` only. The ``PrimaryKeyEarlyOut`` field, when set to 1, modifies the lookup |