16 files changed, 835 insertions, 419 deletions
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 5343d66..ef2a98f 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1771,6 +1771,10 @@ The AMDGPU backend supports the following LLVM IR attributes.
                                                       using dedicated instructions, but may not send the DEALLOC_VGPRS
                                                       message. If a shader has this attribute, then all its callees must
                                                       match its value.
+                                                      An amd_cs_chain CC function with this enabled has an extra symbol
+                                                      prefixed with "_dvgpr$" with the value of the function symbol,
+                                                      offset by one less than the number of dynamic VGPR blocks required
+                                                      by the function encoded in bits 5..3.
 
      ================================================ ==========================================================
 
@@ -5598,6 +5602,8 @@ The fields used by CP for code objects before V3 also match those specified in
                                                        roundup(lds-size / (128 * 4))
                                                      GFX950
                                                        roundup(lds-size / (320 * 4))
+                                                     GFX125*
+                                                       roundup(lds-size / (256 * 4))
 
      24      1 bit   ENABLE_EXCEPTION_IEEE_754_FP    Wavefront starts execution
                      _INVALID_OPERATION              with specified exceptions
diff --git a/llvm/docs/CMake.rst b/llvm/docs/CMake.rst
index 365365c..30b71bf 100644
--- a/llvm/docs/CMake.rst
+++ b/llvm/docs/CMake.rst
@@ -9,7 +9,7 @@ Introduction
 ============
 
 `CMake <http://www.cmake.org/>`_ is a cross-platform build-generator tool. CMake
-does not build the project, it generates the files needed by your build tool
+does not build the project; it generates the files needed by your build tool
 (GNU make, Visual Studio, etc.) for building LLVM.
 
 If **you are a new contributor**, please start with the :doc:`GettingStarted`
@@ -23,7 +23,7 @@ and then go back to the `Quick start`_ section once you know what you are doing.
 you already have experience with CMake, this is the recommended starting point.
 
 This page is geared towards users of the LLVM CMake build. If you're looking for
-information about modifying the LLVM CMake build system you may want to see the
+information about modifying the LLVM CMake build system, you may want to see the
 :doc:`CMakePrimer` page. It has a basic overview of the CMake language.
 
 .. _Quick start:
@@ -37,7 +37,7 @@ We use here the command-line, non-interactive CMake interface.
    CMake. Version 3.20.0 is the minimum required.
 
 #. Open a shell. Your development tools must be reachable from this shell
-   through the PATH environment variable.
+   through the ``PATH`` environment variable.
 
 #. Create a build directory. Building LLVM in the source
    directory is not supported. cd to this directory:
@@ -70,7 +70,7 @@ We use here the command-line, non-interactive CMake interface.
    components are built; see the `Frequently Used LLVM-related
    variables`_ below.
 
-#. After CMake has finished running, proceed to use IDE project files, or start
+#. After CMake has finished running, use IDE project files, or start
    the build from the build directory:
 
    .. code-block:: console
@@ -106,8 +106,7 @@ We use here the command-line, non-interactive CMake interface.
 Basic CMake usage
 =================
 
-This section explains basic aspects of CMake
-which you may need in your day-to-day usage.
+This section explains basic aspects of CMake for daily use.
 
 CMake comes with extensive documentation, in the form of html files, and as
 online help accessible via the ``cmake`` executable itself. Execute ``cmake
@@ -115,11 +114,11 @@ online help accessible via the ``cmake`` executable itself. Execute ``cmake
 
 CMake allows you to specify a build tool (e.g., GNU make, Visual Studio,
 or Xcode). If not specified on the command line, CMake tries to guess which
-build tool to use, based on your environment. Once it has identified your
+build tool to use based on your environment. Once it has identified your
 build tool, CMake uses the corresponding *Generator* to create files for your
 build tool (e.g., Makefiles or Visual Studio or Xcode project files). You can
 explicitly specify the generator with the command line option ``-G "Name of the
-generator"``. To see a list of the available generators on your system, execute
+generator"``. To see a list of the available generators on your system, execute:
 
 .. code-block:: console
 
@@ -127,7 +126,7 @@ generator"``. To see a list of the available generators on your system, execute
 
 This will list the generator names at the end of the help text.
 
-Generators' names are case-sensitive, and may contain spaces. For this reason,
+Generators' names are case-sensitive and may contain spaces. For this reason,
 you should enter them exactly as they are listed in the ``cmake --help``
 output, in quotes. For example, to generate project files specifically for
 Visual Studio 12, you can execute:
@@ -136,7 +135,7 @@ Visual Studio 12, you can execute:
 
   $ cmake -G "Visual Studio 12" path/to/llvm/source/root
 
-For a given development platform there can be more than one adequate
+A given development platform can have more than one adequate
 generator. If you use Visual Studio, "NMake Makefiles" is a generator you can use
 for building with NMake. By default, CMake chooses the most specific generator
 supported by your development environment. If you want an alternative generator,
@@ -205,17 +204,17 @@ used variables that control features of LLVM and enabled subprojects.
   **MinSizeRel**              For Size      No         No         When disk space matters
   =========================== ============= ========== ========== ==========================
 
-  * Optimizations make LLVM/Clang run faster, but can be an impediment for
+  * Optimizations make LLVM/Clang run faster but can be an impediment for
     step-by-step debugging.
   * Builds with debug information can use a lot of RAM and disk space and is
     usually slower to run. You can improve RAM usage by using ``lld``, see
     the :ref:`LLVM_USE_LINKER <llvm_use_linker>` option.
   * Assertions are internal checks to help you find bugs. They typically slow
-    down LLVM and Clang when enabled, but can be useful during development.
+    down LLVM and Clang when enabled but can be useful during development.
     You can manually set :ref:`LLVM_ENABLE_ASSERTIONS <llvm_enable_assertions>`
     to override the default from `CMAKE_BUILD_TYPE`.
 
-  If you are using an IDE such as Visual Studio or Xcode, you should use
+  If you are using an IDE such as Visual Studio or Xcode, use
   the IDE settings to set the build type.
 
   Note: on Windows (building with MSVC or clang-cl), CMake's **RelWithDebInfo**
@@ -244,11 +243,11 @@ LLVM variables that are frequently used to control that. The full
 description is in `LLVM-related variables`_ below.
 
 **LLVM_ENABLE_PROJECTS**:STRING
-  Control which projects are enabled. For example you may want to work on clang
+  Control which projects are enabled. For example, you may want to work on clang
   or lldb by specifying ``-DLLVM_ENABLE_PROJECTS="clang;lldb"``.
 
 **LLVM_ENABLE_RUNTIMES**:STRING
-  Control which runtimes are enabled. For example you may want to work on
+  Control which runtimes are enabled. For example, you may want to work on
   libc++ or libc++abi by specifying ``-DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi"``.
 
 **LLVM_LIBDIR_SUFFIX**:STRING
@@ -264,13 +263,13 @@ description is in `LLVM-related variables`_ below.
   32GB machine, specify ``-G Ninja -DLLVM_PARALLEL_LINK_JOBS=2``.
 
 **LLVM_TARGETS_TO_BUILD**:STRING
-  Control which targets are enabled. For example you may only need to enable
+  Control which targets are enabled. For example, you may only need to enable
   your native target with, for example, ``-DLLVM_TARGETS_TO_BUILD=X86``.
 
 .. _llvm_use_linker:
 
 **LLVM_USE_LINKER**:STRING
-  Override the system's default linker. For instance use ``lld`` with
+  Override the system's default linker. For instance, use ``lld`` with
   ``-DLLVM_USE_LINKER=lld``.
 
 Rarely-used CMake variables
@@ -282,7 +281,7 @@ manual, or execute ``cmake --help-variable VARIABLE_NAME``.
 
 **CMAKE_CXX_STANDARD**:STRING
   Sets the C++ standard to conform to when building LLVM.  Possible values are
-  17 and 20.  LLVM Requires C++17 or higher.  This defaults to 17.
+  17 and 20.  LLVM requires C++17 or higher.  This defaults to 17.
 
 **CMAKE_INSTALL_BINDIR**:PATH
   The path to install executables, relative to the *CMAKE_INSTALL_PREFIX*.
@@ -306,7 +305,7 @@ LLVM-related variables
 -----------------------
 
 These variables provide fine control over the build of LLVM and
-enabled sub-projects. Nearly all of these variable names begin with
+its enabled sub-projects. Nearly all of these variable names begin with
 ``LLVM_``.
 
 .. _LLVM-related variables BUILD_SHARED_LIBS:
@@ -317,7 +316,7 @@ enabled sub-projects. Nearly all of these variable names begin with
   Windows, shared libraries may be used when building with MinGW, including
   mingw-w64, but not when building with the Microsoft toolchain.
 
-  .. note:: BUILD_SHARED_LIBS is only recommended for use by LLVM developers.
+  .. note:: ``BUILD_SHARED_LIBS`` is only recommended for use by LLVM developers.
             If you want to build LLVM as a shared library, you should use the
             ``LLVM_BUILD_LLVM_DYLIB`` option.
 
@@ -332,7 +331,7 @@ enabled sub-projects. Nearly all of these variable names begin with
 
 **LLVM_ADDITIONAL_BUILD_TYPES**:LIST
   Adding a semicolon separated list of additional build types to this flag
-  allows for them to be specified as values in CMAKE_BUILD_TYPE without
+  allows for them to be specified as values in ``CMAKE_BUILD_TYPE`` without
   encountering a fatal error during the configuration process.
 
 **LLVM_APPEND_VC_REV**:BOOL
@@ -359,12 +358,12 @@ enabled sub-projects. Nearly all of these variable names begin with
   Adds benchmarks to the list of default targets. Defaults to OFF.
 
 **LLVM_BUILD_DOCS**:BOOL
-  Adds all *enabled* documentation targets (i.e. Doxgyen and Sphinx targets) as
+  Adds all *enabled* documentation targets (i.e. Doxygen and Sphinx targets) as
   dependencies of the default build targets.  This results in all of the (enabled)
   documentation targets being as part of a normal build.  If the ``install``
-  target is run then this also enables all built documentation targets to be
+  target is run, then this also enables all built documentation targets to be
   installed. Defaults to OFF.  To enable a particular documentation target, see
-  LLVM_ENABLE_SPHINX and LLVM_ENABLE_DOXYGEN.
+  ``LLVM_ENABLE_SPHINX`` and ``LLVM_ENABLE_DOXYGEN``.
 
 **LLVM_BUILD_EXAMPLES**:BOOL
   Build LLVM examples. Defaults to OFF. Targets for building each example are
@@ -375,7 +374,7 @@ enabled sub-projects. Nearly all of these variable names begin with
   If enabled, `source-based code coverage
   <https://clang.llvm.org/docs/SourceBasedCodeCoverage.html>`_ instrumentation
   is enabled while building llvm. If CMake can locate the code coverage
-  scripts and the llvm-cov and llvm-profdata tools that pair to your compiler,
+  scripts and the llvm-cov and llvm-profdata tools that pair with your compiler,
   the build will also generate the `generate-coverage-report` target to generate
   the code coverage report for LLVM, and the `clear-profile-data` utility target
   to delete captured profile data. See documentation for
@@ -385,10 +384,10 @@ enabled sub-projects. Nearly all of these variable names begin with
 **LLVM_BUILD_LLVM_DYLIB**:BOOL
   If enabled, the target for building the libLLVM shared library is added.
   This library contains all of LLVM's components in a single shared library.
-  Defaults to OFF. This cannot be used in conjunction with BUILD_SHARED_LIBS.
-  Tools will only be linked to the libLLVM shared library if LLVM_LINK_LLVM_DYLIB
+  Defaults to OFF. This cannot be used in conjunction with ``BUILD_SHARED_LIBS``.
+  Tools will only be linked to the libLLVM shared library if ``LLVM_LINK_LLVM_DYLIB``
   is also ON.
-  The components in the library can be customised by setting LLVM_DYLIB_COMPONENTS
+  The components in the library can be customised by setting ``LLVM_DYLIB_COMPONENTS``
   to a list of the desired components.
   This option is not available on Windows.
 
@@ -410,8 +409,8 @@ enabled sub-projects. Nearly all of these variable names begin with
   If enabled and the ``ccache`` program is available, then LLVM will be
   built using ``ccache`` to speed up rebuilds of LLVM and its components.
   Defaults to OFF.  The size and location of the cache maintained
-  by ``ccache`` can be adjusted via the LLVM_CCACHE_MAXSIZE and LLVM_CCACHE_DIR
-  options, which are passed to the CCACHE_MAXSIZE and CCACHE_DIR environment
+  by ``ccache`` can be adjusted via the ``LLVM_CCACHE_MAXSIZE`` and ``LLVM_CCACHE_DIR``
+  options, which are passed to the ``CCACHE_MAXSIZE`` and ``CCACHE_DIR`` environment
   variables, respectively.
 
 **LLVM_CODE_COVERAGE_TARGETS**:STRING
@@ -425,9 +424,9 @@ enabled sub-projects. Nearly all of these variable names begin with
   coverage reports will include all sources identified by the tooling.
 
 **LLVM_CREATE_XCODE_TOOLCHAIN**:BOOL
-  macOS Only: If enabled CMake will generate a target named
+  macOS Only: If enabled, CMake will generate a target named
   'install-xcode-toolchain'. This target will create a directory at
-  $CMAKE_INSTALL_PREFIX/Toolchains containing an xctoolchain directory which can
+  ``$CMAKE_INSTALL_PREFIX/Toolchains`` containing an xctoolchain directory which can
   be used to override the default system tools.
 
 **LLVM_DEFAULT_TARGET_TRIPLE**:STRING
@@ -519,8 +518,8 @@ enabled sub-projects. Nearly all of these variable names begin with
   Indicates whether the LLVM Interpreter will be linked with the Foreign Function
   Interface library (libffi) in order to enable calling external functions.
   If the library or its headers are installed in a custom
-  location, you can also set the variables FFI_INCLUDE_DIR and
-  FFI_LIBRARY_DIR to the directories where ffi.h and libffi.so can be found,
+  location, you can also set the variables ``FFI_INCLUDE_DIR`` and
+  ``FFI_LIBRARY_DIR`` to the directories where ``ffi.h`` and ``libffi.so`` can be found,
   respectively. Defaults to OFF.
 
 **LLVM_ENABLE_HTTPLIB**:BOOL
@@ -536,7 +535,7 @@ enabled sub-projects. Nearly all of these variable names begin with
   configured manually to explicitly control the generation of those targets.
 
 **LLVM_ENABLE_LIBCXX**:BOOL
-  If the host compiler and linker supports the stdlib flag, -stdlib=libc++ is
+  If the host compiler and linker support the stdlib flag, ``-stdlib=libc++`` is
   passed to invocations of both so that the project is built using libc++
   instead of stdlibc++. Defaults to OFF.
 
@@ -643,7 +642,7 @@ enabled sub-projects. Nearly all of these variable names begin with
 
 **LLVM_ENABLE_Z3_SOLVER**:BOOL
   If enabled, the Z3 constraint solver is activated for the Clang static analyzer.
-  A recent version of the z3 library needs to be available on the system.
+  A recent version of the z3 library must be available on the system.
 
 **LLVM_ENABLE_ZLIB**:STRING
   Used to decide if LLVM tools should support compression/decompression with
@@ -672,7 +671,7 @@ enabled sub-projects. Nearly all of these variable names begin with
   These variables specify the path to the source directory for the external
   LLVM projects Clang, lld, and Polly, respectively, relative to the top-level
   source directory.  If the in-tree subdirectory for an external project
-  exists (e.g., llvm/tools/clang for Clang), then the corresponding variable
+  exists (e.g., ``llvm/tools/clang`` for Clang), then the corresponding variable
   will not be used.  If the variable for an external project does not point
   to a valid path, then that project will not be built.
 
@@ -722,17 +721,17 @@ enabled sub-projects. Nearly all of these variable names begin with
 
 **LLVM_INSTALL_OCAMLDOC_HTML_DIR**:STRING
   The path to install OCamldoc-generated HTML documentation to. This path can
-  either be absolute or relative to the CMAKE_INSTALL_PREFIX. Defaults to
+  either be absolute or relative to the ``CMAKE_INSTALL_PREFIX``. Defaults to
   ``${CMAKE_INSTALL_DOCDIR}/llvm/ocaml-html``.
 
 **LLVM_INSTALL_SPHINX_HTML_DIR**:STRING
   The path to install Sphinx-generated HTML documentation to. This path can
-  either be absolute or relative to the CMAKE_INSTALL_PREFIX. Defaults to
+  either be absolute or relative to the ``CMAKE_INSTALL_PREFIX``. Defaults to
   ``${CMAKE_INSTALL_DOCDIR}/llvm/html``.
 
 **LLVM_INSTALL_UTILS**:BOOL
   If enabled, utility binaries like ``FileCheck`` and ``not`` will be installed
-  to CMAKE_INSTALL_PREFIX.
+  to ``CMAKE_INSTALL_PREFIX``.
 
 **LLVM_INSTALL_DOXYGEN_HTML_DIR**:STRING
   The path to install Doxygen-generated HTML documentation to. This path can
@@ -753,7 +752,7 @@ enabled sub-projects. Nearly all of these variable names begin with
     $ D:\llvm-project> cmake ... -DLLVM_INTEGRATED_CRT_ALLOC=D:\git\rpmalloc
 
   This option needs to be used along with the static CRT, ie. if building the
-  Release target, add -DCMAKE_MSVC_RUNTIME_LIBRARY=MultiThreaded.
+  Release target, add ``-DCMAKE_MSVC_RUNTIME_LIBRARY=MultiThreaded``.
   Note that rpmalloc is also supported natively in-tree, see option below.
 
 **LLVM_ENABLE_RPMALLOC**:BOOL
@@ -764,7 +763,7 @@ enabled sub-projects. Nearly all of these variable names begin with
 
 **LLVM_LINK_LLVM_DYLIB**:BOOL
   If enabled, tools will be linked with the libLLVM shared library. Defaults
-  to OFF. Setting LLVM_LINK_LLVM_DYLIB to ON also sets LLVM_BUILD_LLVM_DYLIB
+  to OFF. Setting ``LLVM_LINK_LLVM_DYLIB`` to ON also sets ``LLVM_BUILD_LLVM_DYLIB``
   to ON.
   This option is not available on Windows.
 
@@ -779,8 +778,8 @@ enabled sub-projects. Nearly all of these variable names begin with
 **LLVM_LIT_TOOLS_DIR**:PATH
   The path to GnuWin32 tools for tests. Valid on Windows host.  Defaults to
   the empty string, in which case lit will look for tools needed for tests
-  (e.g. ``grep``, ``sort``, etc.) in your %PATH%. If GnuWin32 is not in your
-  %PATH%, then you can set this variable to the GnuWin32 directory so that
+  (e.g. ``grep``, ``sort``, etc.) in your ``%PATH%``. If GnuWin32 is not in your
+  ``%PATH%``, then you can set this variable to the GnuWin32 directory so that
   lit can find tools needed for tests in that directory.
 
 **LLVM_NATIVE_TOOL_DIR**:STRING
@@ -798,9 +797,9 @@ enabled sub-projects. Nearly all of these variable names begin with
   set to non-standard values.
 
 **LLVM_OPTIMIZED_TABLEGEN**:BOOL
-  If enabled and building a debug or asserts build the CMake build system will
+  If enabled and building a debug or asserts build, the CMake build system will
   generate a Release build tree to build a fully optimized tablegen for use
-  during the build. Enabling this option can significantly speed up build times
+  during the build. Enabling this option can significantly speed up build times,
   especially when building LLVM in Debug configurations.
 
 **LLVM_PARALLEL_{COMPILE,LINK,TABLEGEN}_JOBS**:STRING
@@ -809,7 +808,7 @@ enabled sub-projects. Nearly all of these variable names begin with
   determined by the number of logical CPUs.
 
 **LLVM_PROFDATA_FILE**:PATH
-  Path to a profdata file to pass into clang's -fprofile-instr-use flag. This
+  Path to a profdata file to pass into clang's ``-fprofile-instr-use`` flag. This
   can only be specified if you're building with clang.
 
 **LLVM_RAM_PER_{COMPILE,LINK,TABLEGEN}_JOB**:STRING
@@ -833,7 +832,7 @@ enabled sub-projects. Nearly all of these variable names begin with
 
 **LLVM_STATIC_LINK_CXX_STDLIB**:BOOL
   Statically link to the C++ standard library if possible. This uses the flag
-  "-static-libstdc++", but a Clang host compiler will statically link to libc++
+  ``-static-libstdc++``, but a Clang host compiler will statically link to libc++
   if used in conjunction with the **LLVM_ENABLE_LIBCXX** flag. Defaults to OFF.
 
 **LLVM_TABLEGEN**:STRING
@@ -851,8 +850,8 @@ enabled sub-projects. Nearly all of these variable names begin with
   Semicolon-separated list of targets to build, or *all* for building all
   targets. Case-sensitive. Defaults to *all*. Example:
   ``-DLLVM_TARGETS_TO_BUILD="X86;PowerPC"``.
-  The full list, as of March 2023, is:
-  ``AArch64;AMDGPU;ARM;AVR;BPF;Hexagon;Lanai;LoongArch;Mips;MSP430;NVPTX;PowerPC;RISCV;Sparc;SystemZ;VE;WebAssembly;X86;XCore``
+  The full list, as of August 2025, is:
+  ``AArch64;AMDGPU;ARM;AVR;BPF;Hexagon;Lanai;LoongArch;Mips;MSP430;NVPTX;PowerPC;RISCV;Sparc;SPIRV;SystemZ;VE;WebAssembly;X86;XCore``
 
   You can also specify ``host`` or ``Native`` to automatically detect and
   include the target corresponding to the host machine's architecture, or
@@ -870,16 +869,16 @@ enabled sub-projects. Nearly all of these variable names begin with
   the default set of UBSan flags.
 
 **LLVM_UNREACHABLE_OPTIMIZE**:BOOL
-  This flag controls the behavior of `llvm_unreachable()` in release build
+  This flag controls the behavior of ``llvm_unreachable()`` in release build
   (when assertions are disabled in general). When ON (default) then
-  `llvm_unreachable()` is considered "undefined behavior" and optimized as
+  ``llvm_unreachable()`` is considered "undefined behavior" and optimized as
   such. When OFF it is instead replaced with a guaranteed "trap".
 
 **LLVM_USE_INTEL_JITEVENTS**:BOOL
   Enable building support for Intel JIT Events API. Defaults to OFF.
 
 **LLVM_USE_LINKER**:STRING
-  Add ``-fuse-ld={name}`` to the link invocation. The possible value depend on
+  Add ``-fuse-ld={name}`` to the link invocation. The possible values depend on
   your compiler, for clang the value can be an absolute path to your custom
   linker, otherwise clang will prefix the name with ``ld.`` and apply its usual
   search. For example to link LLVM with the Gold linker, cmake can be invoked
@@ -929,7 +928,7 @@ enabled sub-projects. Nearly all of these variable names begin with
   to ON.
 
 **SPHINX_WARNINGS_AS_ERRORS**:BOOL
-  If enabled then sphinx documentation warnings will be treated as
+  If enabled, then sphinx documentation warnings will be treated as
   errors. Defaults to ON.
 
 Advanced variables
@@ -955,12 +954,12 @@ things to go wrong.  They are also unstable across LLVM versions.
 CMake Caches
 ============
 
-Recently LLVM and Clang have been adding some more complicated build system
+Recently, LLVM and Clang have been adding some more complicated build system
 features. Utilizing these new features often involves a complicated chain of
 CMake variables passed on the command line. Clang provides a collection of CMake
 cache scripts to make these features more approachable.
 
-CMake cache files are utilized using CMake's -C flag:
+CMake cache files are utilized using CMake's ``-C`` flag:
 
 .. code-block:: console
 
@@ -974,15 +973,15 @@ A few notes about CMake Caches:
 
 - Order of command line arguments is important
 
-  - -D arguments specified before -C are set before the cache is processed and
+  - ``-D`` arguments specified before ``-C`` are set before the cache is processed and
     can be read inside the cache file
-  - -D arguments specified after -C are set after the cache is processed and
+  - ``-D`` arguments specified after ``-C`` are set after the cache is processed and
     are unset inside the cache file
 
-- All -D arguments will override cache file settings
+- All ``-D`` arguments will override cache file settings
 - CMAKE_TOOLCHAIN_FILE is evaluated after both the cache file and the command
   line arguments
-- It is recommended that all -D options should be specified *before* -C
+- It is recommended that all ``-D`` options be specified *before* ``-C``
 
 For more information about some of the advanced build configurations supported
 via Cache files see :doc:`AdvancedBuilds`.
@@ -1055,7 +1054,7 @@ and uses them to build a simple application ``simple-tool``.
 
 The ``find_package(...)`` directive when used in CONFIG mode (as in the above
 example) will look for the ``LLVMConfig.cmake`` file in various locations (see
-cmake manual for details).  It creates a ``LLVM_DIR`` cache entry to save the
+cmake manual for details).  It creates an ``LLVM_DIR`` cache entry to save the
 directory where ``LLVMConfig.cmake`` is found or allows the user to specify the
 directory (e.g. by passing ``-DLLVM_DIR=/usr/lib/cmake/llvm`` to
 the ``cmake`` command or by setting it directly in ``ccmake`` or ``cmake-gui``).
@@ -1079,11 +1078,11 @@ or you wish to build directly against the LLVM build tree you can use
 ``LLVM_DIR`` as previously mentioned.
 
 The ``LLVMConfig.cmake`` file sets various useful variables. Notable variables
-include
+include:
 
 ``LLVM_CMAKE_DIR``
   The path to the LLVM CMake directory (i.e. the directory containing
-  LLVMConfig.cmake).
+  ``LLVMConfig.cmake``).
 
 ``LLVM_DEFINITIONS``
   A list of preprocessor defines that should be used when building against LLVM.
@@ -1123,7 +1122,7 @@ and will be removed in a future version of LLVM.
 Developing LLVM passes out of source
 ------------------------------------
 
-It is possible to develop LLVM passes out of LLVM's source tree (i.e. against an
+You can develop LLVM passes out of LLVM's source tree (i.e. against an
 installed or built LLVM). An example of a project layout is provided below.
 
 .. code-block:: none
diff --git a/llvm/docs/CommandGuide/lit.rst b/llvm/docs/CommandGuide/lit.rst
index eb90e95..15c249d 100644
--- a/llvm/docs/CommandGuide/lit.rst
+++ b/llvm/docs/CommandGuide/lit.rst
@@ -399,6 +399,11 @@ ADDITIONAL OPTIONS
  Show all features used in the test suite (in ``XFAIL``, ``UNSUPPORTED`` and
  ``REQUIRES``) and exit.
 
+.. option:: --update-tests
+
+ Pass failing tests to functions in the ``lit_config.test_updaters`` list to
+ check whether any of them know how to update the test to make it pass.
+
 EXIT STATUS
 -----------
 
diff --git a/llvm/docs/ContentAddressableStorage.md b/llvm/docs/ContentAddressableStorage.md
new file mode 100644
index 0000000..cd8d6cb
--- /dev/null
+++ b/llvm/docs/ContentAddressableStorage.md
@@ -0,0 +1,121 @@
+# Content Addressable Storage
+
+## Introduction to CAS
+
+Content Addressable Storage, or `CAS`, is a storage system that assigns
+unique addresses to the data stored. It is very useful for data deduplication
+and creating unique identifiers.
+
+Unlike other kinds of storage systems, like file systems, CAS is immutable. It
+is more reliable to model a computation by representing the inputs and outputs
+of the computation using objects stored in CAS.
+
+The basic unit of the CAS library is a CASObject, which contains:
+
+* Data: arbitrary data
+* References: references to other CASObjects
+
+It can be conceptually modeled as something like:
+
+```
+struct CASObject {
+  ArrayRef<char> Data;
+  ArrayRef<CASObject*> Refs;
+};
+```
+
+With this abstraction, it is possible to compose `CASObject`s into a DAG that is
+capable of representing complicated data structures, while still allowing data
+deduplication. Note that two DAGs can be compared by just comparing the
+CASObject hashes of their root nodes.
+
+
+## LLVM CAS Library User Guide
+
+The CAS-like storage provided in LLVM is `llvm::cas::ObjectStore`.
+To reference a CASObject, there are a few different abstractions provided
+with different trade-offs:
+
+### ObjectRef
+
+`ObjectRef` is a lightweight reference to a CASObject stored in the CAS.
+This is the most commonly used abstraction and it is cheap to copy/pass
+along. It has the following properties:
+
+* `ObjectRef` is only meaningful within the `ObjectStore` that created the ref.
+`ObjectRef`s created by different `ObjectStore`s cannot be cross-referenced or
+compared.
+* `ObjectRef` doesn't guarantee the existence of the CASObject it points to. An
+explicit load is required before accessing the data stored in CASObject.
+This load can also fail, for reasons like (but not limited to): object does
+not exist, corrupted CAS storage, operation timeout, etc.
+* If two `ObjectRef`s are equal, it is guaranteed that the objects they point to
+are identical (if they exist). If they are not equal, the underlying objects are
+guaranteed to be different.
+
+### ObjectProxy
+
+`ObjectProxy` represents a loaded CASObject. With an `ObjectProxy`, the
+underlying stored data and references can be accessed without the need
+of error handling. The class APIs also provide convenient methods to
+access underlying data. The lifetime of the underlying data is equal to
+the lifetime of the instance of `ObjectStore` unless explicitly copied.
+
+### CASID
+
+`CASID` is the hash identifier for CASObjects. It owns the underlying
+storage for the hash value, so it can be expensive to copy and compare depending
+on the hash algorithm. `CASID` is generally only useful in rare situations
+like printing the raw hash value or exchanging hash values between different
+CAS instances with the same hashing schema.
+
+### ObjectStore
+
+`ObjectStore` is the CAS-like object storage. It provides APIs to save
+and load CASObjects, for example:
+
+```
+ObjectRef A, B, C;
+Expected<ObjectRef> Stored = ObjectStore.store("data", {A, B});
+Expected<ObjectProxy> Loaded = ObjectStore.getProxy(C);
+```
+
+It also provides APIs to convert between `ObjectRef`, `ObjectProxy` and
+`CASID`.
+
+
+
+## CAS Library Implementation Guide
+
+The LLVM ObjectStore API was designed so that it is easy to add
+customized CAS implementations that are interchangeable with the builtin
+ones.
+
+To add your own implementation, you just need to add a subclass to
+`llvm::cas::ObjectStore` and implement all its pure virtual methods.
+To be interchangeable with LLVM ObjectStore, the new CAS implementation
+needs to conform to the following contracts:
+
+* Different CASObjects stored in the ObjectStore need to have a different hash
+and result in a different `ObjectRef`. Similarly, the same CASObject should have
+the same hash and the same `ObjectRef`. Note: two different CASObjects with
+identical data but different references are considered different objects.
+* `ObjectRef`s are only comparable within the same `ObjectStore` instance, and
+can be used to determine the equality of the underlying CASObjects.
+* The loaded objects from the ObjectStore need to have a lifetime at least as
+long as the ObjectStore itself so it is always legal to access the loaded data
+without holding on to the `ObjectProxy` until the `ObjectStore` is destroyed.
+
+
+If not specified, the behavior can be implementation defined. For example,
+`ObjectRef` can be used to point to a loaded CASObject so
+`ObjectStore` never fails to load. It is also legal to use a stricter model
+than required. For example, the underlying value inside `ObjectRef` can be
+the unique identities of the objects across multiple `ObjectStore` instances,
+but comparing such `ObjectRef`s from different `ObjectStore`s is still illegal.
+
+For CAS library implementers, there is also an `ObjectHandle` class that
+is an internal representation of a loaded CASObject reference.
+`ObjectProxy` is just a pair of `ObjectHandle` and `ObjectStore`, and
+just like `ObjectRef`, `ObjectHandle` is only useful when paired with
+the `ObjectStore` that knows about the loaded CASObject.
diff --git a/llvm/docs/GlobalISel/Legalizer.rst b/llvm/docs/GlobalISel/Legalizer.rst
index 0e256ac..74e83ae 100644
--- a/llvm/docs/GlobalISel/Legalizer.rst
+++ b/llvm/docs/GlobalISel/Legalizer.rst
@@ -15,7 +15,7 @@ A legal instruction is defined as:
 * operating on **vregs that can be loaded and stored** -- if necessary, the
   target can select a ``G_LOAD``/``G_STORE`` of each gvreg operand.
 
-As opposed to SelectionDAG, there are no legalization phases.  In particular,
+Unlike SelectionDAG, there are no legalization phases.  In particular,
 'type' and 'operation' legalization are not separate.
 
 Legalization is iterative, and all state is contained in GMIR.  To maintain the
@@ -44,7 +44,7 @@ have a ``G_FOO`` instruction of the form::
   %1:_(s32) = G_CONSTANT i32 1
   %2:_(s32) = G_FOO %0:_(s32), %1:_(s32)
 
-it's impossible to say that G_FOO is legal iff %1 is a ``G_CONSTANT`` with
+it's impossible to say that ``G_FOO`` is legal iff %1 is a ``G_CONSTANT`` with
 value ``1``. However, the following::
 
   %2:_(s32) = G_FOO %0:_(s32), i32 1
@@ -93,8 +93,8 @@ legality contains:
 
 .. rubric:: Footnotes
 
-.. [#legalizer-legacy-footnote] An API is broadly similar to
-   SelectionDAG/TargetLowering is available but is not recommended as a more
+.. [#legalizer-legacy-footnote] An API that is broadly similar to
+   SelectionDAG/TargetLowering is available, but is not recommended as a more
    powerful API is available.
 
 Rule Processing and Declaring Rules
@@ -108,10 +108,10 @@ legalized as a result of the rules. If the ruleset is exhausted without
 satisfying any rule, then it is considered unsupported.
 
 When it doesn't declare the instruction legal, each pass over the rules may
-request that one type changes to another type. Sometimes this can cause multiple
+request that one type be changed to another type. Sometimes this can cause multiple
 types to change but we avoid this as much as possible as making multiple changes
 can make it difficult to avoid infinite loops where, for example, narrowing one
-type causes another to be too small and widening that type causes the first one
+type causes another to be too small, and widening that type causes the first one
 to be too big.
 
 In general, it's advisable to declare instructions legal as close to the top of
@@ -130,7 +130,7 @@ and the instruction::
 
   %2:_(s7) = G_ADD %0:_(s7), %1:_(s7)
 
-this doesn't meet the predicate for the :ref:`.legalFor() <legalfor>` as ``s7``
+This doesn't meet the predicate for the :ref:`.legalFor() <legalfor>` as ``s7``
 is not one of the listed types so it falls through to the
 :ref:`.clampScalar() <clampscalar>`. It does meet the predicate for this rule
 as the type is smaller than the ``s32`` and this rule instructs the legalizer
@@ -148,7 +148,7 @@ processing by the legalizer.
 Rule Actions
 """"""""""""
 
-There are various rule factories that append rules to a ruleset but they have a
+There are various rule factories that append rules to a ruleset, but they have a
 few actions in common:
 
 .. _legalfor:
@@ -202,7 +202,7 @@ few actions in common:
 Rule Predicates
 """""""""""""""
 
-The rule factories also have predicates in common:
+The rule factories also have the following predicates in common:
 
 * ``legal()``, ``lower()``, etc. are always satisfied.
 
@@ -269,81 +269,81 @@ Consumer Type Set
   The set of types which is the union of all possible types consumed by at
   least one legal instruction.
 
-Both sets are often identical but there's no guarantee of that. For example,
+Both sets are often identical, but there's no guarantee of that. For example,
 it's not uncommon to be unable to consume s64 but still be able to produce it
 for a few specific instructions.
 
 Minimum Rules For Scalars
 """""""""""""""""""""""""
 
-* G_ANYEXT must be legal for all inputs from the producer type set and all larger
+* ``G_ANYEXT`` must be legal for all inputs from the producer type set and all larger
   outputs from the consumer type set.
-* G_TRUNC must be legal for all inputs from the producer type set and all
+* ``G_TRUNC`` must be legal for all inputs from the producer type set and all
   smaller outputs from the consumer type set.
 
-G_ANYEXT, and G_TRUNC have mandatory legality since the GMIR requires a means to
+``G_ANYEXT`` and ``G_TRUNC`` have mandatory legality since the GMIR requires a means to
 connect operations with different type sizes. They are usually trivial to support
-since G_ANYEXT doesn't define the value of the additional bits and G_TRUNC is
-discarding bits. The other conversions can be lowered into G_ANYEXT/G_TRUNC
+since ``G_ANYEXT`` doesn't define the value of the additional bits and ``G_TRUNC`` is
+discarding bits. The other conversions can be lowered into ``G_ANYEXT``/``G_TRUNC``
 with some additional operations that are subject to further legalization. For
-example, G_SEXT can lower to::
+example, ``G_SEXT`` can lower to::
 
   %1 = G_ANYEXT %0
   %2 = G_CONSTANT ...
   %3 = G_SHL %1, %2
   %4 = G_ASHR %3, %2
 
-and the G_CONSTANT/G_SHL/G_ASHR can further lower to other operations or target
-instructions. Similarly, G_FPEXT has no legality requirement since it can lower
-to a G_ANYEXT followed by a target instruction.
+and the ``G_CONSTANT``/``G_SHL``/``G_ASHR`` can further lower to other operations or target
+instructions. Similarly, ``G_FPEXT`` has no legality requirement since it can lower
+to a ``G_ANYEXT`` followed by a target instruction.
 
-G_MERGE_VALUES and G_UNMERGE_VALUES do not have legality requirements since the
-former can lower to G_ANYEXT and some other legalizable instructions, while the
-latter can lower to some legalizable instructions followed by G_TRUNC.
+``G_MERGE_VALUES`` and ``G_UNMERGE_VALUES`` do not have legality requirements since the
+former can lower to ``G_ANYEXT`` and some other legalizable instructions, while the
+latter can lower to some legalizable instructions followed by ``G_TRUNC``.
 
 Minimum Legality For Vectors
 """"""""""""""""""""""""""""
 
 Within the vector types, there aren't any defined conversions in LLVM IR as
 vectors are often converted by reinterpreting the bits or by decomposing the
-vector and reconstituting it as a different type. As such, G_BITCAST is the
+vector and reconstituting it as a different type. As such, ``G_BITCAST`` is the
 only operation to account for. We generally don't require that it's legal
-because it can usually be lowered to COPY (or to nothing using
-replaceAllUses()). However, there are situations where G_BITCAST is non-trivial
+because it can usually be lowered to ``COPY`` (or to nothing using
+``replaceAllUses()``). However, there are situations where ``G_BITCAST`` is non-trivial
 (e.g. little-endian vectors of big-endian data such as on big-endian MIPS MSA and
-big-endian ARM NEON, see `_i_bitcast`). To account for this G_BITCAST must be
+big-endian ARM NEON, see `_i_bitcast`). To account for this, ``G_BITCAST`` must be
 legal for all type combinations that change the bit pattern in the value.
 
-There are no legality requirements for G_BUILD_VECTOR, or G_BUILD_VECTOR_TRUNC
+There are no legality requirements for ``G_BUILD_VECTOR`` or ``G_BUILD_VECTOR_TRUNC``
 since these can be handled by:
 * Declaring them legal.
 * Scalarizing them.
-* Lowering them to G_TRUNC+G_ANYEXT and some legalizable instructions.
+* Lowering them to ``G_TRUNC``+``G_ANYEXT`` and some legalizable instructions.
 * Lowering them to target instructions which are legal by definition.
 
-The same reasoning also allows G_UNMERGE_VALUES to lack legality requirements
+The same reasoning also allows ``G_UNMERGE_VALUES`` to lack legality requirements
 for vector inputs.
 
 Minimum Legality for Pointers
 """""""""""""""""""""""""""""
 
-There are no minimum rules for pointers since G_INTTOPTR and G_PTRTOINT can
-be selected to a COPY from register class to another by the legalizer.
+There are no minimum rules for pointers since ``G_INTTOPTR`` and ``G_PTRTOINT`` can
+be selected to a ``COPY`` from one register class to another by the legalizer.
 
 Minimum Legality For Operations
 """""""""""""""""""""""""""""""
 
-The rules for G_ANYEXT, G_MERGE_VALUES, G_BITCAST, G_BUILD_VECTOR,
-G_BUILD_VECTOR_TRUNC, G_CONCAT_VECTORS, G_UNMERGE_VALUES, G_PTRTOINT, and
-G_INTTOPTR have already been noted above. In addition to those, the following
+The rules for ``G_ANYEXT``, ``G_MERGE_VALUES``, ``G_BITCAST``, ``G_BUILD_VECTOR``,
+``G_BUILD_VECTOR_TRUNC``, ``G_CONCAT_VECTORS``, ``G_UNMERGE_VALUES``, ``G_PTRTOINT``, and
+``G_INTTOPTR`` have already been noted above. In addition to those, the following
 operations have requirements:
 
-* For every type that can be produced by any instruction, G_IMPLICIT_DEF must be
-  legal.
-* G_PHI must be legal for all types in the producer and consumer typesets. This
+* ``G_IMPLICIT_DEF`` must be legal for every type that can be produced
+  by any instruction.
+* ``G_PHI`` must be legal for all types in the producer and consumer typesets. This
   is usually trivial as it requires no code to be selected.
-* At least one G_FRAME_INDEX must be legal
-* At least one G_BLOCK_ADDR must be legal
+* At least one ``G_FRAME_INDEX`` must be legal.
+* At least one ``G_BLOCK_ADDR`` must be legal.
 
-There are many other operations you'd expect to have legality requirements but
+There are many other operations you'd expect to have legality requirements, but
 they can be lowered to target instructions which are legal by definition.
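
The ``G_SEXT`` lowering described in the Legalizer hunk above (``G_ANYEXT``, then ``G_SHL`` and ``G_ASHR`` by the same constant) can be checked with plain integer arithmetic. The sketch below models an s8 to s32 extension; `sextViaShifts` and its `garbage` parameter are hypothetical names used only for illustration, with `garbage` standing in for the high bits that ``G_ANYEXT`` leaves undefined.

```cpp
#include <cstdint>

// Hypothetical helper modelling G_ANYEXT + G_SHL + G_ASHR for s8 -> s32.
// The result must not depend on `garbage`, because the left shift discards
// those bits before the arithmetic right shift replicates the sign bit.
// Note: the unsigned-to-signed cast and the right shift of a negative value
// are only fully specified from C++20 on; mainstream compilers behave the
// same way in earlier standards.
int32_t sextViaShifts(uint8_t value, uint32_t garbage) {
  uint32_t anyext = (garbage << 8) | value;  // G_ANYEXT: high 24 bits arbitrary
  const unsigned shift = 32 - 8;             // G_CONSTANT: dst bits - src bits
  uint32_t shl = anyext << shift;            // G_SHL: source sign bit -> bit 31
  return static_cast<int32_t>(shl) >> shift; // G_ASHR: replicate the sign bit
}
```

Whatever value is passed for `garbage`, the shifted-out bits never reach the result, which is why ``G_ANYEXT`` is sufficient as the first step.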
diff --git a/llvm/docs/HowToCrossCompileBuiltinsOnArm.rst b/llvm/docs/HowToCrossCompileBuiltinsOnArm.rst
index 31ead45..d7759ad 100644
--- a/llvm/docs/HowToCrossCompileBuiltinsOnArm.rst
+++ b/llvm/docs/HowToCrossCompileBuiltinsOnArm.rst
@@ -25,12 +25,16 @@ using as many of the LLVM tools as we can, but it is possible to use GNU
 equivalents.
 
 You will need:
- * A build of LLVM for the llvm-tools and ``llvm-config``.
+ * A build of LLVM for the llvm-tools and LLVM CMake files.
  * A clang executable with support for the ``ARM`` target.
- * compiler-rt sources.
+ * ``compiler-rt`` sources.
  * The ``qemu-arm`` user mode emulator.
  * An ``arm-linux-gnueabihf`` sysroot.
 
+.. note::
+  An existing sysroot is required because some of the builtins include C library
+  headers and a sysroot is the easiest way to get those.
+
 In this example we will be using ``ninja`` as the build tool.
 
 See https://compiler-rt.llvm.org/ for information about the dependencies
@@ -52,78 +56,94 @@ toolchain from https://developer.arm.com/open-source/gnu-toolchain/gnu-a/downloa
 Building compiler-rt builtins for Arm
 =====================================
 
-We will be doing a standalone build of compiler-rt using the following cmake
-options::
+We will be doing a standalone build of compiler-rt. The command is shown below.
+Shell variables are used to simplify some of the options::
+
+  LLVM_TOOLCHAIN=<path-to-llvm-install>/
+  TARGET_TRIPLE=arm-none-linux-gnueabihf
+  GCC_TOOLCHAIN=<path-to-gcc-toolchain>
+  SYSROOT=${GCC_TOOLCHAIN}/${TARGET_TRIPLE}/libc
+  COMPILE_FLAGS="-march=armv7-a"
 
-  cmake path/to/compiler-rt \
+  cmake ../llvm-project/compiler-rt \
     -G Ninja \
-    -DCMAKE_AR=/path/to/llvm-ar \
-    -DCMAKE_ASM_COMPILER_TARGET="arm-linux-gnueabihf" \
-    -DCMAKE_ASM_FLAGS="build-c-flags" \
-    -DCMAKE_C_COMPILER=/path/to/clang \
-    -DCMAKE_C_COMPILER_TARGET="arm-linux-gnueabihf" \
-    -DCMAKE_C_FLAGS="build-c-flags" \
+    -DCMAKE_AR=${LLVM_TOOLCHAIN}/bin/llvm-ar \
+    -DCMAKE_NM=${LLVM_TOOLCHAIN}/bin/llvm-nm \
+    -DCMAKE_RANLIB=${LLVM_TOOLCHAIN}/bin/llvm-ranlib \
+    -DLLVM_CMAKE_DIR="${LLVM_TOOLCHAIN}/lib/cmake/llvm" \
+    -DCMAKE_SYSROOT="${SYSROOT}" \
+    -DCMAKE_ASM_COMPILER_TARGET="${TARGET_TRIPLE}" \
+    -DCMAKE_ASM_FLAGS="${COMPILE_FLAGS}" \
+    -DCMAKE_C_COMPILER_TARGET="${TARGET_TRIPLE}" \
+    -DCMAKE_C_COMPILER_EXTERNAL_TOOLCHAIN=${GCC_TOOLCHAIN} \
+    -DCMAKE_C_COMPILER=${LLVM_TOOLCHAIN}/bin/clang \
+    -DCMAKE_C_FLAGS="${COMPILE_FLAGS}" \
+    -DCMAKE_CXX_COMPILER_TARGET="${TARGET_TRIPLE}" \
+    -DCMAKE_CXX_COMPILER_EXTERNAL_TOOLCHAIN=${GCC_TOOLCHAIN} \
+    -DCMAKE_CXX_COMPILER=${LLVM_TOOLCHAIN}/bin/clang \
+    -DCMAKE_CXX_FLAGS="${COMPILE_FLAGS}" \
     -DCMAKE_EXE_LINKER_FLAGS="-fuse-ld=lld" \
-    -DCMAKE_NM=/path/to/llvm-nm \
-    -DCMAKE_RANLIB=/path/to/llvm-ranlib \
     -DCOMPILER_RT_BUILD_BUILTINS=ON \
     -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \
     -DCOMPILER_RT_BUILD_MEMPROF=OFF \
     -DCOMPILER_RT_BUILD_PROFILE=OFF \
+    -DCOMPILER_RT_BUILD_CTX_PROFILE=OFF \
     -DCOMPILER_RT_BUILD_SANITIZERS=OFF \
     -DCOMPILER_RT_BUILD_XRAY=OFF \
+    -DCOMPILER_RT_BUILD_ORC=OFF \
+    -DCOMPILER_RT_BUILD_CRT=OFF \
     -DCOMPILER_RT_DEFAULT_TARGET_ONLY=ON \
-    -DLLVM_CONFIG_PATH=/path/to/llvm-config
+    -DCOMPILER_RT_EMULATOR="qemu-arm -L ${SYSROOT}" \
+    -DCOMPILER_RT_INCLUDE_TESTS=ON \
+    -DCOMPILER_RT_TEST_COMPILER=${LLVM_TOOLCHAIN}/bin/clang \
+    -DCOMPILER_RT_TEST_COMPILER_CFLAGS="--target=${TARGET_TRIPLE} ${COMPILE_FLAGS} --gcc-toolchain=${GCC_TOOLCHAIN} --sysroot=${SYSROOT} -fuse-ld=lld"
 
-The ``build-c-flags`` need to be sufficient to pass the C-make compiler check,
-compile compiler-rt, and if you are running the tests, compile and link the
-tests. When cross-compiling with clang we will need to pass sufficient
-information to generate code for the Arm architecture we are targeting.
+.. note::
+  The command above also enables tests. Enabling tests is not required; see the
+  testing section for more details.
 
-We will need to select:
- * The Arm target and Armv7-A architecture with ``--target=arm-linux-gnueabihf -march=armv7a``.
- * Whether to generate Arm (the default) or Thumb instructions (``-mthumb``).
+``CMAKE_<LANGUAGE>_<OPTION>`` options are set so that the correct ``--target``,
+``--sysroot``, ``--gcc-toolchain`` and ``-march`` options will be given to the
+compilers.
 
-When using a GCC ``arm-linux-gnueabihf`` toolchain the following flags are
-needed to pick up the includes and libraries:
+The combination of these settings needs to be enough to pass CMake's compiler
+checks, compile compiler-rt and build the test cases.
 
- * ``--gcc-toolchain=/path/to/dir/toolchain``
- * ``--sysroot=/path/to/toolchain/arm-linux-gnueabihf/libc``
+The flags need to select:
+ * The Arm target (``--target=arm-none-linux-gnueabihf``)
+ * The Arm architecture level (``-march=armv7-a``)
+ * Whether to generate Arm (``-marm``, the default) or Thumb (``-mthumb``) instructions.
 
-In this example we will be adding all of the command line options to both
-``CMAKE_C_FLAGS`` and ``CMAKE_ASM_FLAGS``. There are cmake flags to pass some of
-these options individually which can be used to simplify the ``build-c-flags``::
+It is possible to pass all these flags to CMake using ``CMAKE_<LANGUAGE>_FLAGS``,
+but the command above uses standard CMake options instead. If you need to
+add flags that CMake cannot generate automatically, add them to
+``CMAKE_<LANGUAGE>_FLAGS``.
 
- -DCMAKE_C_COMPILER_TARGET="arm-linux-gnueabihf"
- -DCMAKE_ASM_COMPILER_TARGET="arm-linux-gnueabihf"
- -DCMAKE_C_COMPILER_EXTERNAL_TOOLCHAIN=/path/to/dir/toolchain
- -DCMAKE_SYSROOT=/path/to/dir/toolchain/arm-linux-gnueabihf/libc
+When CMake has finished, build with Ninja::
 
-Once cmake has completed the builtins can be built with ``ninja builtins``
+  ninja builtins
 
 Testing compiler-rt builtins using qemu-arm
 ===========================================
 
-To test the builtins library we need to add a few more cmake flags to enable
-testing and set up the compiler and flags for test case. We must also tell
-cmake that we wish to run the tests on ``qemu-arm``::
+The following options are required to enable tests::
+
+ -DCOMPILER_RT_EMULATOR="qemu-arm -L ${SYSROOT}" \
+ -DCOMPILER_RT_INCLUDE_TESTS=ON \
+ -DCOMPILER_RT_TEST_COMPILER=${LLVM_TOOLCHAIN}/bin/clang \
+ -DCOMPILER_RT_TEST_COMPILER_CFLAGS="--target=${TARGET_TRIPLE} -march=armv7-a --gcc-toolchain=${GCC_TOOLCHAIN} --sysroot=${SYSROOT} -fuse-ld=lld"
 
- -DCOMPILER_RT_EMULATOR="qemu-arm -L /path/to/armhf/sysroot"
- -DCOMPILER_RT_INCLUDE_TESTS=ON
- -DCOMPILER_RT_TEST_COMPILER="/path/to/clang"
- -DCOMPILER_RT_TEST_COMPILER_CFLAGS="test-c-flags"
+This tells compiler-rt that we want to run tests on ``qemu-arm``. If you do not
+want to run tests, remove these options from the CMake command.
 
-The ``/path/to/armhf/sysroot`` should be the same as the one passed to
-``--sysroot`` in the ``build-c-flags``.
+Note that ``COMPILER_RT_TEST_COMPILER_CFLAGS`` contains the equivalent of the
+options CMake generated for us with the first command. We must pass them
+manually here because standard options like ``CMAKE_C_COMPILER_EXTERNAL_TOOLCHAIN``
+do not apply here.
 
-The ``test-c-flags`` need to include the target, architecture, gcc-toolchain,
-sysroot and Arm/Thumb state. The additional cmake defines such as
-``CMAKE_C_COMPILER_EXTERNAL_TOOLCHAIN`` do not apply when building the tests. If
-you have put all of these in ``build-c-flags`` then these can be repeated. If you
-wish to use lld to link the tests then add ``-fuse-ld=lld``.
+When CMake has finished, run the tests::
 
-Once cmake has completed the tests can be built and run using
-``ninja check-builtins``
+  ninja check-builtins
 
 Troubleshooting
 ===============
@@ -133,9 +153,10 @@ The cmake try compile stage fails
 At an early stage cmake will attempt to compile and link a simple C program to
 test if the toolchain is working.
 
-This stage can often fail at link time if the ``--sysroot=`` and
+This stage can often fail at link time if the ``--sysroot=``, ``--target`` or
 ``--gcc-toolchain=`` options are not passed to the compiler. Check the
-``CMAKE_C_FLAGS`` and ``CMAKE_C_COMPILER_TARGET`` flags.
+``CMAKE_<LANGUAGE>_FLAGS`` and ``CMAKE_<LANGUAGE>_COMPILER_TARGET`` flags along
+with any of the specific CMake sysroot and toolchain options.
 
 It can be useful to build a simple example outside of cmake with your toolchain
 to make sure it is working. For example::
@@ -179,10 +200,10 @@ The flags used to build the tests are not the same as those used to build the
 builtins. The c flags are provided by ``COMPILER_RT_TEST_COMPILE_CFLAGS`` and
 the ``CMAKE_C_COMPILER_TARGET``, ``CMAKE_ASM_COMPILER_TARGET``,
 ``CMAKE_C_COMPILER_EXTERNAL_TOOLCHAIN`` and ``CMAKE_SYSROOT`` flags are not
-applied.
+applied to tests.
 
 Make sure that ``COMPILER_RT_TEST_COMPILE_CFLAGS`` contains all the necessary
-information.
+flags.
 
 
 Modifications for other Targets
@@ -206,13 +227,13 @@ You will need to use an ``arm-linux-gnueabi`` GNU toolchain for soft-float.
 AArch64 Target
 --------------
 The instructions for Arm can be used for AArch64 by substituting AArch64
-equivalents for the sysroot, emulator and target.
+equivalents for the sysroot, emulator and target::
 
-* ``-DCMAKE_C_COMPILER_TARGET=aarch64-linux-gnu``
-* ``-DCOMPILER_RT_EMULATOR="qemu-aarch64 -L /path/to/aarch64/sysroot``
+ -DCMAKE_C_COMPILER_TARGET=aarch64-linux-gnu
+ -DCOMPILER_RT_EMULATOR="qemu-aarch64 -L /path/to/aarch64/sysroot"
 
-The CMAKE_C_FLAGS and COMPILER_RT_TEST_COMPILER_CFLAGS may also need:
-``"--sysroot=/path/to/aarch64/sysroot --gcc-toolchain=/path/to/gcc-toolchain"``
+You will also have to update any use of the target triple in compiler flags,
+for instance in ``CMAKE_C_FLAGS`` and ``COMPILER_RT_TEST_COMPILER_CFLAGS``.
 
 Armv6-m, Armv7-m and Armv7E-M targets
 -------------------------------------
@@ -221,7 +242,7 @@ but more difficult. The main problems are:
 
 * There is not a ``qemu-arm`` user-mode emulator for bare-metal systems.
   ``qemu-system-arm`` can be used but this is significantly more difficult
-  to setup.
+  to set up. This document does not explain how to do this.
 * The targets to compile compiler-rt have the suffix ``-none-eabi``. This uses
   the BareMetal driver in clang and by default will not find the libraries
   needed to pass the cmake compiler check.
@@ -235,31 +256,68 @@ into a binary and execute the tests correctly but it will not catch if the
 builtins use instructions that are supported on Armv7-A but not Armv6-M,
 Armv7-M and Armv7E-M.
 
-To get the cmake compile test to pass you will need to pass the libraries
-needed to successfully link the cmake test via ``CMAKE_CFLAGS``::
-
- -DCMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY \
- -DCOMPILER_RT_OS_DIR="baremetal" \
- -DCOMPILER_RT_BUILD_BUILTINS=ON \
- -DCOMPILER_RT_BUILD_SANITIZERS=OFF \
- -DCOMPILER_RT_BUILD_XRAY=OFF \
- -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \
- -DCOMPILER_RT_BUILD_PROFILE=OFF \
- -DCMAKE_C_COMPILER=${host_install_dir}/bin/clang \
- -DCMAKE_C_COMPILER_TARGET="your *-none-eabi target" \
- -DCMAKE_ASM_COMPILER_TARGET="your *-none-eabi target" \
- -DCMAKE_AR=/path/to/llvm-ar \
- -DCMAKE_NM=/path/to/llvm-nm \
- -DCMAKE_RANLIB=/path/to/llvm-ranlib \
- -DCOMPILER_RT_BAREMETAL_BUILD=ON \
- -DCOMPILER_RT_DEFAULT_TARGET_ONLY=ON \
- -DLLVM_CONFIG_PATH=/path/to/llvm-config \
- -DCMAKE_C_FLAGS="build-c-flags" \
- -DCMAKE_ASM_FLAGS="build-c-flags" \
- -DCOMPILER_RT_EMULATOR="qemu-arm -L /path/to/armv7-A/sysroot" \
- -DCOMPILER_RT_INCLUDE_TESTS=ON \
- -DCOMPILER_RT_TEST_COMPILER="/path/to/clang" \
- -DCOMPILER_RT_TEST_COMPILER_CFLAGS="test-c-flags"
+This requires a second ``arm-none-eabi`` toolchain for building the builtins.
+Using a bare-metal toolchain ensures that the target and C library details are
+specific to bare-metal instead of using Linux settings. This means that some
+tests may behave differently compared to real hardware, but at least the content
+of the builtins library is correct.
+
+Below is an example that builds the builtins for Armv7-M, but runs the tests
+as Armv7-A. It is presented in full, but is very similar to the earlier
+command for Armv7-A build and test::
+
+  LLVM_TOOLCHAIN=<path to llvm install>/
+
+  # For the builtins.
+  TARGET_TRIPLE=arm-none-eabi
+  GCC_TOOLCHAIN=<path to arm-none-eabi toolchain>/
+  SYSROOT=${GCC_TOOLCHAIN}/${TARGET_TRIPLE}/libc
+  COMPILE_FLAGS="-march=armv7-m -mfpu=vfpv2"
+
+  # For the test cases.
+  A_PROFILE_TARGET_TRIPLE=arm-none-linux-gnueabihf
+  A_PROFILE_GCC_TOOLCHAIN=<path to arm-none-linux-gnueabihf toolchain>/
+  A_PROFILE_SYSROOT=${A_PROFILE_GCC_TOOLCHAIN}/${A_PROFILE_TARGET_TRIPLE}/libc
+
+  cmake ../llvm-project/compiler-rt \
+    -G Ninja \
+    -DCMAKE_AR=${LLVM_TOOLCHAIN}/bin/llvm-ar \
+    -DCMAKE_NM=${LLVM_TOOLCHAIN}/bin/llvm-nm \
+    -DCMAKE_RANLIB=${LLVM_TOOLCHAIN}/bin/llvm-ranlib \
+    -DLLVM_CMAKE_DIR="${LLVM_TOOLCHAIN}/lib/cmake/llvm" \
+    -DCMAKE_SYSROOT="${SYSROOT}" \
+    -DCMAKE_ASM_COMPILER_TARGET="${TARGET_TRIPLE}" \
+    -DCMAKE_ASM_FLAGS="${COMPILE_FLAGS}" \
+    -DCMAKE_C_COMPILER_TARGET="${TARGET_TRIPLE}" \
+    -DCMAKE_C_COMPILER_EXTERNAL_TOOLCHAIN=${GCC_TOOLCHAIN} \
+    -DCMAKE_C_COMPILER=${LLVM_TOOLCHAIN}/bin/clang \
+    -DCMAKE_C_FLAGS="${COMPILE_FLAGS}" \
+    -DCMAKE_CXX_COMPILER_TARGET="${TARGET_TRIPLE}" \
+    -DCMAKE_CXX_COMPILER_EXTERNAL_TOOLCHAIN=${GCC_TOOLCHAIN} \
+    -DCMAKE_CXX_COMPILER=${LLVM_TOOLCHAIN}/bin/clang \
+    -DCMAKE_CXX_FLAGS="${COMPILE_FLAGS}" \
+    -DCMAKE_EXE_LINKER_FLAGS="-fuse-ld=lld" \
+    -DCOMPILER_RT_BUILD_BUILTINS=ON \
+    -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \
+    -DCOMPILER_RT_BUILD_MEMPROF=OFF \
+    -DCOMPILER_RT_BUILD_PROFILE=OFF \
+    -DCOMPILER_RT_BUILD_CTX_PROFILE=OFF \
+    -DCOMPILER_RT_BUILD_SANITIZERS=OFF \
+    -DCOMPILER_RT_BUILD_XRAY=OFF \
+    -DCOMPILER_RT_BUILD_ORC=OFF \
+    -DCOMPILER_RT_BUILD_CRT=OFF \
+    -DCOMPILER_RT_DEFAULT_TARGET_ONLY=ON \
+    -DCOMPILER_RT_EMULATOR="qemu-arm -L ${A_PROFILE_SYSROOT}" \
+    -DCOMPILER_RT_INCLUDE_TESTS=ON \
+    -DCOMPILER_RT_TEST_COMPILER=${LLVM_TOOLCHAIN}/bin/clang \
+    -DCOMPILER_RT_TEST_COMPILER_CFLAGS="--target=${A_PROFILE_TARGET_TRIPLE} -march=armv7-a --gcc-toolchain=${A_PROFILE_GCC_TOOLCHAIN} --sysroot=${A_PROFILE_SYSROOT} -fuse-ld=lld" \
+    -DCMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY \
+    -DCOMPILER_RT_OS_DIR="baremetal" \
+    -DCOMPILER_RT_BAREMETAL_BUILD=ON
+
+.. note::
+  The sysroot used for compiling the tests is ``arm-linux-gnueabihf``, not
+  ``arm-none-eabi`` which is used when compiling the builtins.
 
 The Armv6-M builtins will use the soft-float ABI. When compiling the tests for
 Armv7-A we must include ``"-mthumb -mfloat-abi=soft -mfpu=none"`` in the
@@ -270,19 +328,3 @@ mismatches between the M-profile objects from compiler-rt and the A-profile
 objects from the test. The lld linker does not check the profile
 BuildAttribute so it can be used to link the tests by adding ``-fuse-ld=lld`` to the
 ``COMPILER_RT_TEST_COMPILER_CFLAGS``.
-
-Alternative using a cmake cache
--------------------------------
-If you wish to build, but not test compiler-rt for Armv6-M, Armv7-M or Armv7E-M
-the easiest way is to use the ``BaremetalARM.cmake`` recipe in ``clang/cmake/caches``.
-
-You will need a bare metal sysroot such as that provided by the GNU ARM Embedded
-toolchain.
-
-The libraries can be built with the cmake options::
-
- -DBAREMETAL_ARMV6M_SYSROOT=/path/to/bare/metal/toolchain/arm-none-eabi \
- -DBAREMETAL_ARMV7M_SYSROOT=/path/to/bare/metal/toolchain/arm-none-eabi \
- -DBAREMETAL_ARMV7EM_SYSROOT=/path/to/bare/metal/toolchain/arm-none-eabi \
- -C /path/to/llvm/source/tools/clang/cmake/caches/BaremetalARM.cmake \
- /path/to/llvm
diff --git a/llvm/docs/HowToReleaseLLVM.rst b/llvm/docs/HowToReleaseLLVM.rst
index dd4bb08..f3792e3 100644
--- a/llvm/docs/HowToReleaseLLVM.rst
+++ b/llvm/docs/HowToReleaseLLVM.rst
@@ -101,8 +101,8 @@ release process to begin.  Specifically, it involves:
 
 * Tagging release candidates for the release team to begin testing.
 
-Create Release Branch
-^^^^^^^^^^^^^^^^^^^^^
+Create Release Branch and Update LLVM Version
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 Branch the Git trunk using the following procedure:
 
@@ -114,14 +114,16 @@ Branch the Git trunk using the following procedure:
 #. Verify that the current git trunk is in decent shape by
    examining nightly tester and buildbot results.
 
-#. Bump the version in trunk to N.0.0git and tag the commit with llvmorg-N-init.
+#. Bump the version in trunk to N.0.0git with the script in
+   ``llvm/utils/release/bump-version.py``, and tag the commit with llvmorg-N-init.
    If ``X`` is the version to be released, then ``N`` is ``X + 1``.
 
 ::
 
   $ git tag -sa llvmorg-N-init
 
-#. Clear the release notes in trunk.
+4. Clear the release notes in trunk with the script in
+   ``llvm/utils/release/clear-release-notes.py``.
 
 #. Create the release branch from the last known good revision from before the
    version bump.  The branch's name is release/X.x where ``X`` is the major version
@@ -133,12 +135,6 @@ Branch the Git trunk using the following procedure:
 #. All tags and branches need to be created in both the llvm/llvm-project and
    llvm/llvm-test-suite repos.
 
-Update LLVM Version
-^^^^^^^^^^^^^^^^^^^
-
-After creating the LLVM release branch, update the release branches'
-version with the script in ``llvm/utils/release/bump-version.py``.
-
 Tagging the LLVM Release Candidates
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3a3a74f..a71eefd 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -392,7 +392,7 @@ added in the future:
     sequence in place of a call site. This convention forces the call
     arguments into registers but allows them to be dynamically
     allocated. This can currently only be used with calls to
-    llvm.experimental.patchpoint because only this intrinsic records
+    ``llvm.experimental.patchpoint`` because only this intrinsic records
     the location of its arguments in a side table. See :doc:`StackMaps`.
 "``preserve_mostcc``" - The `PreserveMost` calling convention
     This calling convention attempts to make the code in the caller as
@@ -413,7 +413,7 @@ added in the future:
     - On AArch64 the callee preserves all general purpose registers, except
       X0-X8 and X16-X18. Not allowed with ``nest``.
 
-    - On RISC-V the callee preserve x5-x31 except x6, x7 and x28 registers.
+    - On RISC-V the callee preserves x5-x31 except x6, x7 and x28 registers.
 
     The idea behind this convention is to support calls to runtime functions
     that have a hot path and a cold path. The hot path is usually a small piece
@@ -575,7 +575,7 @@ DLL storage classes:
     and the function or variable name. On XCOFF targets, ``dllexport`` indicates
     that the symbol will be made visible to other modules using "exported"
     visibility and thus placed by the linker in the loader section symbol table.
-    Since this storage class exists for defining a dll interface, the compiler,
+    Since this storage class exists for defining a DLL interface, the compiler,
     assembler and linker know it is externally referenced and must refrain from
     deleting the symbol.
 
@@ -610,7 +610,7 @@ model is not supported, or if a better choice of model can be made.
 A model can also be specified in an alias, but then it only governs how
 the alias is accessed. It will not have any effect on the aliasee.
 
-For platforms without linker support of ELF TLS model, the -femulated-tls
+For platforms without linker support of ELF TLS model, the ``-femulated-tls``
 flag can be used to generate GCC-compatible emulated TLS code.
 
 .. _runtime_preemption_model:
@@ -1887,7 +1887,7 @@ Attribute Groups
 
 Attribute groups are groups of attributes that are referenced by objects within
 the IR. They are important for keeping ``.ll`` files readable, because a lot of
-functions will use the same set of attributes. In the degenerative case of a
+functions will use the same set of attributes. In the degenerate case of a
 ``.ll`` file that corresponds to a single ``.c`` file, the single attribute
 group will capture the important command line flags used to build that file.
 
@@ -1946,8 +1946,8 @@ For example:
     ``::operator::delete``. Matching malloc/realloc/free calls within a family
     can be optimized, but mismatched ones will be left alone.
 ``allockind("KIND")``
-    Describes the behavior of an allocation function. The KIND string contains comma
-    separated entries from the following options:
+    Describes the behavior of an allocation function. The KIND string contains
+    comma-separated entries from the following options:
 
     * "alloc": the function returns a new block of memory or null.
     * "realloc": the function returns a new block of memory or null. If the
@@ -2047,7 +2047,7 @@ For example:
     even if this attribute says the frame pointer can be eliminated.
     The allowed string values are:
 
-     * ``"none"`` (default) - the frame pointer can be eliminated, and it's
+     * ``"none"`` (default) - the frame pointer can be eliminated, and its
        register can be used for other purposes.
      * ``"reserved"`` - the frame pointer register must either be updated to
        point to a valid frame record for the current function, or not be
@@ -2201,7 +2201,7 @@ For example:
 
     A ``nofree`` function is explicitly allowed to free memory which it
     allocated or (if not ``nosync``) arrange for another thread to free
-    memory on it's behalf.  As a result, perhaps surprisingly, a ``nofree``
+    memory on its behalf.  As a result, perhaps surprisingly, a ``nofree``
     function can return a pointer to a previously deallocated
     :ref:`allocated object<allocatedobjects>`.
 ``noimplicitfloat``
@@ -2232,14 +2232,14 @@ For example:
     may make calls to the function faster, at the cost of extra program
     startup time if the function is not called during program startup.
 ``noprofile``
-    This function attribute prevents instrumentation based profiling, used for
+    This function attribute prevents instrumentation-based profiling, used for
     coverage or profile based optimization, from being added to a function. It
     also blocks inlining if the caller and callee have different values of this
     attribute.
 ``skipprofile``
-    This function attribute prevents instrumentation based profiling, used for
+    This function attribute prevents instrumentation-based profiling, used for
     coverage or profile based optimization, from being added to a function. This
-    attribute does not restrict inlining, so instrumented instruction could end
+    attribute does not restrict inlining, so instrumented instructions could end
     up in this function.
 ``noredzone``
     This attribute indicates that the code generator should not use a
@@ -2339,7 +2339,7 @@ For example:
 
      * ``"prologue-short-redirect"`` - This style of patchable
        function is intended to support patching a function prologue to
-       redirect control away from the function in a thread safe
+       redirect control away from the function in a thread-safe
        manner.  It guarantees that the first instruction of the
        function will be large enough to accommodate a short jump
        instruction, and will be sufficiently aligned to allow being
@@ -2584,7 +2584,7 @@ For example:
 ``uwtable[(sync|async)]``
     This attribute indicates that the ABI being targeted requires that
     an unwind table entry be produced for this function even if we can
-    show that no exceptions passes by it. This is normally the case for
+    show that no exceptions pass by it. This is normally the case for
     the ELF x86-64 abi, but it can be disabled for some compilation
     units. The optional parameter describes what kind of unwind tables
     to generate: ``sync`` for normal unwind tables, ``async`` for asynchronous
@@ -2599,7 +2599,7 @@ For example:
 ``shadowcallstack``
     This attribute indicates that the ShadowCallStack checks are enabled for
     the function. The instrumentation checks that the return address for the
-    function has not changed between the function prolog and epilog. It is
+    function has not changed between the function prologue and epilogue. It is
     currently x86_64-specific.
 
 .. _langref_mustprogress:
@@ -2807,7 +2807,7 @@ operand bundle tag.  These operand bundles represent an alternate
 "safe" continuation for the call site they're attached to, and can be
 used by a suitable runtime to deoptimize the compiled frame at the
 specified call site.  There can be at most one ``"deopt"`` operand
-bundle attached to a call site.  Exact details of deoptimization is
+bundle attached to a call site.  Exact details of deoptimization are
 out of scope for the language reference, but it usually involves
 rewriting a compiled frame into a set of interpreted frames.
 
@@ -2896,7 +2896,7 @@ generated code.  For more details, see :ref:`GC Transitions
 
 The bundle contains an arbitrary list of Values which need to be passed
 to GC transition code. They will be lowered and passed as operands to
-the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
+the appropriate ``GC_TRANSITION`` nodes in the selection DAG. It is assumed
 that these arguments must be available before and after (but not
 necessarily during) the execution of the callee.
 
@@ -3334,7 +3334,7 @@ by the minus sign character ('-'). The canonical forms are:
 
 This information is passed along to the backend so that it generates
 code for the proper architecture. It's possible to override this on the
-command line with the ``-mtriple`` command line option.
+command line with the ``-mtriple`` command-line option.
 
 
 .. _allocatedobjects:
@@ -3641,8 +3641,8 @@ to support the somewhat common pattern in C of intentionally storing to an
 invalid pointer to crash the program. In the future, it might make sense to
 allow frontends to control this behavior.
 
-IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
-or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
+IR-level volatile loads and stores cannot safely be optimized into ``llvm.memcpy``
+or ``llvm.memmove`` intrinsics even when those intrinsics are flagged volatile.
 Likewise, the backend should never split or merge target-legal volatile
 load/store instructions. Similarly, IR-level volatile loads and stores cannot
 change from integer to floating-point or vice versa.
@@ -4289,7 +4289,7 @@ X86_amx Type
 :Overview:
 
 The x86_amx type represents a value held in an AMX tile register on an x86
-machine. The operations allowed on it are quite limited. Only few intrinsics
+machine. The operations allowed on it are quite limited. Only a few intrinsics
 are allowed: stride load and store, zero and dot product. No instruction is
 allowed for this type. There are no arguments, arrays, pointers, vectors
 or constants of this type.
@@ -5058,14 +5058,14 @@ Addresses of Basic Blocks
 The '``blockaddress``' constant computes the address of the specified
 basic block in the specified function.
 
-It always has an ``ptr addrspace(P)`` type, where ``P`` is the address space
+It always has a ``ptr addrspace(P)`` type, where ``P`` is the address space
 of the function containing ``%block`` (usually ``addrspace(0)``).
 
 Taking the address of the entry block is illegal.
 
 This value only has defined behavior when used as an operand to the
 ':ref:`indirectbr <i_indirectbr>`' or for comparisons against null. Pointer
-equality tests between labels addresses results in undefined behavior ---
+equality tests between label addresses result in undefined behavior ---
 though, again, comparison against null is ok, and no label is equal to the null
 pointer. This may be passed around as an opaque pointer sized value as long as
 the bits are not inspected. This allows ``ptrtoint`` and arithmetic to be
@@ -5098,7 +5098,7 @@ The target function may not have ``extern_weak`` linkage.
   to the function.
 - ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
   function. Many targets support relocations that resolve at link time to either
-  a function or a stub for it, depending on if the function is defined within the
+  a function or a stub for it, depending on whether the function is defined within the
   linkage unit; LLVM will use this when available. (This is commonly called a
   "PLT stub".) On other targets, the stub may need to be emitted explicitly.
 
@@ -5175,6 +5175,8 @@ The following is the syntax for constant expressions:
     Perform the :ref:`trunc operation <i_trunc>` on constants.
 ``ptrtoint (CST to TYPE)``
     Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
+``ptrtoaddr (CST to TYPE)``
+    Perform the :ref:`ptrtoaddr operation <i_ptrtoaddr>` on constants.
 ``inttoptr (CST to TYPE)``
     Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
     This one is *really* dangerous!
@@ -5318,7 +5320,7 @@ the '``unwind``' keyword, the behavior is undefined.
 
 If multiple keywords appear, the '``sideeffect``' keyword must come
 first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
-third and the '``unwind``' keyword last.
+third, and the '``unwind``' keyword last.
 
 Inline Asm Constraint String
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -5481,7 +5483,7 @@ followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
 The one and two letter constraint codes are typically chosen to be the same as
 GCC's constraint codes.
 
-A single constraint may include one or more than constraint code in it, leaving
+A single constraint may include one or more constraint codes in it, leaving
 it up to LLVM to choose which one to use. This is included mainly for
 compatibility with the translation of GCC inline asm coming from clang.
 
@@ -5580,9 +5582,13 @@ AArch64:
 AMDGPU:
 
 - ``r``: A 32 or 64-bit integer register.
-- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
-- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
-- ``[0-9]a``: The 32-bit AGPR register, number 0-9.
+- ``s``: SGPR register or tuple.
+- ``v``: VGPR register or tuple.
+- ``a``: AGPR register or tuple. Only valid on gfx908+.
+- ``VA``: VGPR or AGPR register or tuple. Only valid on gfx90a+.
+- ``v[0-9]``: The 32-bit VGPR register, number 0-9.
+- ``s[0-9]``: The 32-bit SGPR register, number 0-9.
+- ``a[0-9]``: The 32-bit AGPR register, number 0-9.
 - ``I``: An integer inline constant in the range from -16 to 64.
 - ``J``: A 16-bit signed integer constant.
 - ``A``: An integer or a floating-point inline constant.
@@ -6022,7 +6028,7 @@ Inline Asm Metadata
 The call instructions that wrap inline asm nodes may have a
 "``!srcloc``" MDNode attached to it that contains a list of constant
 integers. If present, the code generator will use the integer as the
-location cookie value when report errors through the ``LLVMContext``
+location cookie value when reporting errors through the ``LLVMContext``
 error reporting mechanisms. This allows a front-end to correlate backend
 errors that occur with inline asm back to the source code that produced
 it. For example:
@@ -6203,7 +6209,7 @@ Unlike instructions, global objects (functions and global variables) may have
 multiple metadata attachments with the same identifier.
 
 A transformation is required to drop any metadata attachment that it
-does not know or know it can't preserve. Currently there is an
+does not recognize or cannot preserve. Currently there is an
 exception for metadata attachment to globals for ``!func_sanitize``,
 ``!type``, ``!absolute_symbol`` and ``!associated`` which can't be
 unconditionally dropped unless the global is itself deleted.
@@ -6442,19 +6448,19 @@ descriptors <DISubrange>` or :ref:`subrange descriptors
 <DISubrangeType>`, each representing the range of subscripts at that
 level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates
 that an array type is a native packed vector. The optional
-``dataLocation`` is a DIExpression that describes how to get from an
+``dataLocation`` is a ``DIExpression`` that describes how to get from an
 object's address to the actual raw data, if they aren't
 equivalent. This is only supported for array types, particularly to
 describe Fortran arrays, which have an array descriptor in addition to
-the array data. Alternatively it can also be DIVariable which has the
+the array data. Alternatively, it can also be a ``DIVariable`` which has the
 address of the actual raw data. The Fortran language supports pointer
 arrays which can be attached to actual arrays, this attachment between
 pointer and pointee is called association.  The optional
-``associated`` is a DIExpression that describes whether the pointer
+``associated`` is a ``DIExpression`` that describes whether the pointer
 array is currently associated.  The optional ``allocated`` is a
-DIExpression that describes whether the allocatable array is currently
-allocated.  The optional ``rank`` is a DIExpression that describes the
-rank (number of dimensions) of fortran assumed rank array (rank is
+``DIExpression`` that describes whether the allocatable array is currently
+allocated.  The optional ``rank`` is a ``DIExpression`` that describes the
+rank (number of dimensions) of a Fortran assumed rank array (rank is
 known at runtime).  The optional ``bitStride`` is an unsigned constant
 that describes the number of bits occupied by an element of the array;
 this is only needed if it differs from the element type's natural
@@ -6757,7 +6763,7 @@ expression language. They are used in :ref:`debug records <debugrecords>`
 referenced LLVM variable relates to the source language variable. Debug
 expressions are interpreted left-to-right: start by pushing the value/address
 operand of the record onto a stack, then repeatedly push and evaluate
-opcodes from the DIExpression until the final variable description is produced.
+opcodes from the ``DIExpression`` until the final variable description is produced.
 
 The current supported opcode vocabulary is limited:
 
@@ -6770,7 +6776,7 @@ The current supported opcode vocabulary is limited:
 - ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
 - ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
   here, respectively) of the variable fragment from the working expression. Note
-  that contrary to DW_OP_bit_piece, the offset is describing the location
+  that contrary to ``DW_OP_bit_piece``, the offset is describing the location
   within the described source variable.
 - ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
   (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
@@ -6838,15 +6844,15 @@ The current supported opcode vocabulary is limited:
   expression over two registers.
 - ``DW_OP_push_object_address`` pushes the address of the object which can then
   serve as a descriptor in subsequent calculation. This opcode can be used to
-  calculate bounds of fortran allocatable array which has array descriptors.
+  calculate bounds of a Fortran allocatable array which has array descriptors.
 - ``DW_OP_over`` duplicates the entry currently second in the stack at the top
-  of the stack. This opcode can be used to calculate bounds of fortran assumed
+  of the stack. This opcode can be used to calculate bounds of a Fortran assumed
   rank array which has rank known at run time and current dimension number is
   implicitly first element of the stack.
 - ``DW_OP_LLVM_implicit_pointer`` It specifies the dereferenced value. It can
   be used to represent pointer variables which are optimized out but the value
   it points to is known. This operator is required as it is different than DWARF
-  operator DW_OP_implicit_pointer in representation and specification (number
+  operator ``DW_OP_implicit_pointer`` in representation and specification (number
   and types of operands) and later can not be used as multiple level.
 
 .. code-block:: text
@@ -6883,22 +6889,22 @@ in registers or in memory (see ``DW_OP_stack_value``).
 
 A ``#dbg_declare`` record describes an indirect value (the address) of a
 source variable. The first operand of the record must be an address of some
-kind. A DIExpression operand to the record refines this address to produce a
+kind. A ``DIExpression`` operand to the record refines this address to produce a
 concrete location for the source variable.
 
 A ``#dbg_value`` record describes the direct value of a source variable.
 The first operand of the record may be a direct or indirect value. A
-DIExpression operand to the record refines the first operand to produce a
+``DIExpression`` operand to the record refines the first operand to produce a
 direct value. For example, if the first operand is an indirect value, it may be
-necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
+necessary to insert ``DW_OP_deref`` into the ``DIExpression`` in order to produce a
 valid debug record.
 
 .. note::
 
-   A DIExpression is interpreted in the same way regardless of which kind of
+   A ``DIExpression`` is interpreted in the same way regardless of which kind of
    debug record it's attached to.
 
-   DIExpressions are always printed and parsed inline; they can never be
+   ``DIExpression`` nodes are always printed and parsed inline; they can never be
    referenced by an ID (e.g. ``!1``).
 
 .. code-block:: text
@@ -6938,7 +6944,7 @@ DIArgList
 ``DIArgList`` nodes hold a list of constant or SSA value references. These are
 used in :ref:`debug records <debugrecords>` in combination with a
 ``DIExpression`` that uses the
-``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
+``DW_OP_LLVM_arg`` operator. Because a ``DIArgList`` may refer to local values
 within a function, it must only be used as a function argument, must always be
 inlined, and cannot appear in named metadata.
 
@@ -6956,7 +6962,7 @@ These flags encode various properties of DINodes.
 
 The `ExportSymbols` flag marks a class, struct or union whose members
 may be referenced as if they were defined in the containing class or
-union. This flag is used to decide whether the DW_AT_export_symbols can
+union. This flag is used to decide whether the ``DW_AT_export_symbols`` can
 be used for the structure type.
 
 DIObjCProperty
@@ -7441,7 +7447,7 @@ For example, in the code below, the call instruction may only target the
 
 ``callback`` metadata may be attached to a function declaration, or definition.
 (Call sites are excluded only due to the lack of a use case.) For ease of
-exposition, we'll refer to the function annotated w/ metadata as a broker
+exposition, we'll refer to the function annotated with metadata as a broker
 function. The metadata describes how the arguments of a call to the broker are
 in turn passed to the callback function specified by the metadata. Thus, the
 ``callback`` metadata provides a partial description of a call site inside the
@@ -7533,7 +7539,7 @@ sections that the user does not want removed after linking.
 
 ``unpredictable`` metadata may be attached to any branch or switch
 instruction. It can be used to express the unpredictability of control
-flow. Similar to the llvm.expect intrinsic, it may be used to alter
+flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
 optimizations related to compare and branch instructions. The metadata
 is treated as a boolean value; if it exists, it signals that the branch
 or switch that it is attached to is completely unpredictable.
@@ -7610,7 +7616,7 @@ loop is transformed to a different loop before an explicitly requested
 other transformations impossible. Mandatory loop canonicalizations such
 as loop rotation are still applied.
 
-It is recommended to use this metadata in addition to any llvm.loop.*
+It is recommended to use this metadata in addition to any ``llvm.loop.*``
 transformation directive. Also, any loop should have at most one
 directive applied to it (and a sequence of transformations built using
 followup-attributes). Otherwise, which transformation will be applied
@@ -7956,7 +7962,7 @@ the non-distributed fallback version will have. See
 '``llvm.loop.distribute.followup_all``' Metadata
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-The attributes in this metadata is added to all followup loops of the
+The attributes in this metadata are added to all followup loops of the
 loop distribution pass. See
 :ref:`Transformation Metadata <transformation-metadata>` for details.
 
@@ -7971,7 +7977,7 @@ performed on this loop. The metadata has a single operand which is the string
 
    !0 = !{!"llvm.licm.disable"}
 
-Note that although it operates per loop it isn't given the llvm.loop prefix
+Note that although it operates per loop it isn't given the ``llvm.loop`` prefix
 as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.
 
 '``llvm.access.group``' Metadata
@@ -8035,8 +8041,8 @@ undefined.
 Note that if not all memory access instructions belong to an access
 group referred to by ``llvm.loop.parallel_accesses``, then the loop must
 not be considered trivially parallel. Additional
-memory dependence analysis is required to make that determination. As a fail
-safe mechanism, this causes loops that were originally parallel to be considered
+memory dependence analysis is required to make that determination. As a
+fail-safe mechanism, this causes loops that were originally parallel to be considered
 sequential (if optimization passes that are unaware of the parallel semantics
 insert new memory instructions into the loop body).
 
@@ -8168,8 +8174,8 @@ Examples:
 
    !0 = !{}
 
-The invariant.group metadata must be dropped when replacing one pointer by
-another based on aliasing information. This is because invariant.group is tied
+The ``invariant.group`` metadata must be dropped when replacing one pointer by
+another based on aliasing information. This is because ``invariant.group`` is tied
 to the SSA value of the pointer operand.
 
 .. code-block:: llvm
@@ -8205,7 +8211,7 @@ compatibility, globals carrying this metadata should:
 - Be in ``@llvm.compiler.used``.
 - If the referenced global variable is in a comdat, be in the same comdat.
 
-``!associated`` can not express many-to-one relationship. A global variable with
+``!associated`` cannot express a many-to-one relationship. A global variable with
 the metadata should generally not be referenced by a function: the function may
 be inlined into other functions, leading to more references to the metadata.
 Ideally we would want to keep metadata alive as long as any inline location is
@@ -8266,12 +8272,12 @@ VP
 
 VP (value profile) metadata can be attached to instructions that have
 value profile information. Currently this is indirect calls (where it
-records the hottest callees) and calls to memory intrinsics such as memcpy,
+records the hottest callees) and calls to memory intrinsics, such as memcpy,
 memmove, and memset (where it records the hottest byte lengths).
 
-Each VP metadata node contains "VP" string, then a uint32_t value for the value
-profiling kind, a uint64_t value for the total number of times the instruction
-is executed, followed by uint64_t value and execution count pairs.
+Each VP metadata node contains the "VP" string, then a ``uint32_t`` value for the value
+profiling kind, a ``uint64_t`` value for the total number of times the instruction
+is executed, followed by ``uint64_t`` value and execution count pairs.
 The value profiling kind is 0 for indirect call targets and 1 for memory
 operations. For indirect call targets, each profile value is a hash
 of the callee function name, and for memory operations each value is the
@@ -8470,8 +8476,8 @@ Example:
 
 This is intended for use on targets with a notion of generic address
 spaces, which at runtime resolve to different physical memory
-spaces. The interpretation of the address space values is target
-specific. The behavior is undefined if the runtime memory address does
+spaces. The interpretation of the address space values is target specific.
+The behavior is undefined if the runtime memory address does
 resolve to an object defined in one of the indicated address spaces.
 
 
@@ -8482,7 +8488,7 @@ Information about the module as a whole is difficult to convey to LLVM's
 subsystems. The LLVM IR isn't sufficient to transmit this information.
 The ``llvm.module.flags`` named metadata exists in order to facilitate
 this. These flags are in the form of key / value pairs --- much like a
-dictionary --- making it easy for any subsystem who cares about a flag to
+dictionary --- making it easy for any subsystem that cares about a flag to
 look it up.
 
 The ``llvm.module.flags`` metadata contains a list of metadata triplets.
@@ -8742,7 +8748,7 @@ Automatic Linker Flags Named Metadata
 
 Some targets support embedding of flags to the linker inside individual object
 files. Typically this is used in conjunction with language extensions which
-allow source files to contain linker command line options, and have these
+allow source files to contain linker command-line options, and have these
 automatically be transmitted to the linker via object files.
 
 These flags are encoded in the IR using named metadata with the name
@@ -11733,7 +11739,7 @@ size of the '<value>' type. Note that this default alignment assumption is
 different from the alignment used for the load/store instructions when align
 isn't specified.
 
-A ``atomicrmw`` instruction can also take an optional
+An ``atomicrmw`` instruction can also take an optional
 ":ref:`syncscope <syncscope>`" argument.
 
 Semantics:
@@ -12504,7 +12510,7 @@ Semantics:
 """"""""""
 
 The '``ptrtoint``' instruction converts ``value`` to integer type
-``ty2`` by interpreting the all pointer representation bits as an integer
+``ty2`` by interpreting all the pointer representation bits as an integer
 (equivalent to a ``bitcast``) and either truncating or zero extending that value
 to the size of the integer type.
 If ``value`` is smaller than ``ty2`` then a zero extension is done. If
@@ -12523,6 +12529,59 @@ Example:
       %Y = ptrtoint ptr %P to i64                        ; yields zero extension on 32-bit architecture
       %Z = ptrtoint <4 x ptr> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture
 
+.. _i_ptrtoaddr:
+
+'``ptrtoaddr .. to``' Instruction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+::
+
+      <result> = ptrtoaddr <ty> <value> to <ty2>             ; yields ty2
+
+Overview:
+"""""""""
+
+The '``ptrtoaddr``' instruction converts the pointer or a vector of
+pointers ``value`` to the underlying integer address (or vector of addresses) of
+type ``ty2``. This is different from :ref:`ptrtoint <i_ptrtoint>` in that it
+only operates on the index bits of the pointer and ignores all other bits, and
+does not capture the provenance of the pointer.
+
+Arguments:
+""""""""""
+
+The '``ptrtoaddr``' instruction takes a ``value`` to cast, which must be
+a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
+type to cast it to ``ty2``, which must be the :ref:`integer <t_integer>`
+type (or vector of integers) matching the pointer index width of the address
+space of ``ty``.
+
+Semantics:
+""""""""""
+
+The '``ptrtoaddr``' instruction converts ``value`` to integer type ``ty2`` by
+interpreting the lowest index-width pointer representation bits as an integer.
+If the address size and the pointer representation size are the same and
+``value`` and ``ty2`` are the same size, then nothing is done (*no-op cast*)
+other than a type change.
+
+The ``ptrtoaddr`` instruction always :ref:`captures the address but not the provenance <pointercapture>`
+of the pointer argument.
+
+Example:
+""""""""
+This example assumes pointers in address space 1 are 64 bits in size with an
+address width of 32 bits (``p1:64:64:64:32`` :ref:`datalayout string<langref_datalayout>`).
+
+.. code-block:: llvm
+
+      %X = ptrtoaddr ptr addrspace(1) %P to i32              ; extracts low 32 bits of pointer
+      %Y = ptrtoaddr <4 x ptr addrspace(1)> %P to <4 x i32>  ; yields vector of low 32 bits for each pointer
+
+
 .. _i_inttoptr:
 
 '``inttoptr .. to``' Instruction
@@ -13483,7 +13542,7 @@ ensures that each ``catchpad`` has exactly one predecessor block, and it always
 terminates in a ``catchswitch``.
 
 The ``args`` correspond to whatever information the personality routine
-requires to know if this is an appropriate handler for the exception. Control
+requires to determine if this is an appropriate handler for the exception. Control
 will transfer to the ``catchpad`` if this is the first appropriate handler for
 the exception.
 
@@ -13827,7 +13886,7 @@ Semantics:
 The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
 available in C. In a target-dependent way, it copies the source
 ``va_list`` element into the destination ``va_list`` element. This
-intrinsic is necessary because the `` llvm.va_start`` intrinsic may be
+intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
 arbitrarily complex and require, for example, memory allocation.
 
 Accurate Garbage Collection Intrinsics
@@ -14018,7 +14077,7 @@ types of the 'call parameters' arguments.
 
 The '#call args' operand is the number of arguments to the actual
 call.  It must exactly match the number of arguments passed in the
-'call parameters' variable length section.
+'call parameters' variable-length section.
 
 The 'flags' operand is used to specify extra information about the
 statepoint. This is currently only used to mark certain statepoints
@@ -14139,7 +14198,7 @@ so constructed.
 
 The third argument is an index which specify the (potentially) derived pointer
 being relocated.  It is legal for this index to be the same as the second
-argument if-and-only-if a base pointer is being relocated.
+argument if and only if a base pointer is being relocated.
 
 Semantics:
 """"""""""
@@ -14835,7 +14894,7 @@ Overview:
 """""""""
 
 The '``llvm.instrprof.increment``' intrinsic can be emitted by a
-frontend for use with instrumentation based profiling. These will be
+frontend for use with instrumentation-based profiling. These will be
 lowered by the ``-instrprof`` pass to generate execution counts of a
 program at runtime.
 
@@ -15038,7 +15097,7 @@ Overview:
 """""""""
 
 The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
-frontend for use with instrumentation based profiling. This will be
+frontend for use with instrumentation-based profiling. This will be
 lowered by the ``-instrprof`` pass to find out the target values,
 instrumented expressions take in a program at runtime.
 
@@ -15685,7 +15744,7 @@ external functions.
 Syntax:
 """""""
 
-This is an overloaded intrinsic. You can use llvm.memmove on any integer
+This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
 bit width and for different address space. Not all targets support all
 bit widths however.
 
@@ -15746,7 +15805,7 @@ otherwise the behavior is undefined.
 Syntax:
 """""""
 
-This is an overloaded intrinsic. You can use llvm.memset on any integer
+This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
 bit width and for different address spaces. However, not all targets
 support all bit widths.
 
@@ -17935,7 +17994,7 @@ operate on a per-element basis and the element order is not affected.
 Syntax:
 """""""
 
-This is an overloaded intrinsic. You can use llvm.ctpop on any integer
+This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
 bit width, or on any vector with integer elements. Not all targets
 support all bit widths or vector types, however.
 
@@ -18455,7 +18514,7 @@ Overview:
 """""""""
 
 The '``llvm.umul.with.overflow``' family of intrinsic functions perform
-a unsigned multiplication of the two arguments, and indicate whether an
+an unsigned multiplication of the two arguments, and indicate whether an
 overflow occurred during the unsigned multiplication.
 
 Arguments:
@@ -20622,7 +20681,7 @@ Semantics:
 
 The '``llvm.experimental.vector.histogram.*``' intrinsics are used to perform
 updates on potentially overlapping values in memory. The intrinsics represent
-the follow sequence of operations:
+the following sequence of operations:
 
 1. Gather load from the ``ptrs`` operand, with element type matching that of
    the ``inc`` operand.
@@ -26355,7 +26414,7 @@ This is an overloaded intrinsic.
 Overview:
 """""""""
 
-Predicated llvm.is.fpclass :ref:`llvm.is.fpclass <llvm.is.fpclass>`
+Predicated ``llvm.is.fpclass`` :ref:`llvm.is.fpclass <llvm.is.fpclass>`
 
 Arguments:
 """"""""""
@@ -26370,7 +26429,7 @@ operation.
 Semantics:
 """"""""""
 
-The '``llvm.vp.is.fpclass``' intrinsic performs llvm.is.fpclass (:ref:`llvm.is.fpclass <llvm.is.fpclass>`).
+The '``llvm.vp.is.fpclass``' intrinsic performs ``llvm.is.fpclass`` (:ref:`llvm.is.fpclass <llvm.is.fpclass>`).
 
 
 Examples:
@@ -26730,7 +26789,7 @@ Syntax:
 
 ::
 
-      declare void @llvm.lifetime.start(i64 <size>, ptr captures(none) <ptr>)
+      declare void @llvm.lifetime.start(ptr captures(none) <ptr>)
 
 Overview:
 """""""""
@@ -26741,11 +26800,8 @@ object's lifetime.
 Arguments:
 """"""""""
 
-The first argument is a constant integer, which is ignored and will be removed
-in the future.
-
-The second argument is either a pointer to an ``alloca`` instruction or
-a ``poison`` value.
+The argument is either a pointer to an ``alloca`` instruction or a ``poison``
+value.
 
 Semantics:
 """"""""""
@@ -26774,7 +26830,7 @@ Syntax:
 
 ::
 
-      declare void @llvm.lifetime.end(i64 <size>, ptr captures(none) <ptr>)
+      declare void @llvm.lifetime.end(ptr captures(none) <ptr>)
 
 Overview:
 """""""""
@@ -26785,11 +26841,8 @@ The '``llvm.lifetime.end``' intrinsic specifies the end of a
 Arguments:
 """"""""""
 
-The first argument is a constant integer, which is ignored and will be removed
-in the future.
-
-The second argument is either a pointer to an ``alloca`` instruction or
-a ``poison`` value.
+The argument is either a pointer to an ``alloca`` instruction or a ``poison``
+value.
 
 Semantics:
 """"""""""
@@ -28440,7 +28493,7 @@ environment.  The rounding mode argument is only intended as information
 to the compiler.
 
 If the runtime floating-point environment is using the default rounding mode
-then the results will be the same as the llvm.lrint intrinsic.
+then the results will be the same as the ``llvm.lrint`` intrinsic.
 
 
 '``llvm.experimental.constrained.llrint``' Intrinsic
@@ -28488,7 +28541,7 @@ environment.  The rounding mode argument is only intended as information
 to the compiler.
 
 If the runtime floating-point environment is using the default rounding mode
-then the results will be the same as the llvm.llrint intrinsic.
+then the results will be the same as the ``llvm.llrint`` intrinsic.
 
 
 '``llvm.experimental.constrained.nearbyint``' Intrinsic
@@ -28949,7 +29002,7 @@ was only valid within a single iteration.
 
 .. code-block:: llvm
 
-  ; This examples shows two possible positions for noalias.decl and how they impact the semantics:
+  ; This example shows two possible positions for noalias.decl and how they impact the semantics:
   ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
   ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
   declare void @decl_in_loop(ptr %a.base, ptr %b.base) {
@@ -30404,7 +30457,7 @@ has externally observable side effects.
 Syntax:
 """""""
 
-This is an overloaded intrinsic. You can use llvm.is.constant with any argument type.
+This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any argument type.
 
 ::
 
diff --git a/llvm/docs/MergeFunctions.rst b/llvm/docs/MergeFunctions.rst
index 02344bc..c27f603 100644
--- a/llvm/docs/MergeFunctions.rst
+++ b/llvm/docs/MergeFunctions.rst
@@ -7,7 +7,7 @@ MergeFunctions pass, how it works
 
 Introduction
 ============
-Sometimes code contains equal functions, or functions that does exactly the same
+Sometimes code contains equal functions, or functions that do exactly the same
 thing even though they are non-equal on the IR level (e.g.: multiplication on 2
 and 'shl 1'). It could happen due to several reasons: mainly, the usage of
 templates and automatic code generators. Though, sometimes the user itself could
@@ -16,7 +16,7 @@ write the same thing twice :-)
 The main purpose of this pass is to recognize such functions and merge them.
 
 This document is the extension to pass comments and describes the pass logic. It
-describes the algorithm that is used in order to compare functions and
+describes the algorithm used to compare functions and
 explains how we could combine equal functions correctly to keep the module
 valid.
 
@@ -58,7 +58,7 @@ It's especially important to understand chapter 3 of tutorial:
 
 :doc:`tutorial/LangImpl03`
 
-The reader should also know how passes work in LLVM. They could use this
+The reader should also know how passes work in LLVM. They can use this
 article as a reference and start point here:
 
 :doc:`WritingAnLLVMPass`
@@ -68,7 +68,7 @@ debugging and bug-fixing.
 
 Narrative structure
 -------------------
-The article consists of three parts. The first part explains pass functionality
+This article consists of three parts. The first part explains pass functionality
 on the top-level. The second part describes the comparison procedure itself.
 The third part describes the merging process.
 
@@ -130,7 +130,7 @@ access lookup? The answer is: "yes".
 
 Random-access
 """""""""""""
-How it could this be done? Just convert each function to a number, and gather
+How can this be done? Just convert each function to a number, and gather
 all of them in a special hash-table. Functions with equal hashes are equal.
 Good hashing means, that every function part must be taken into account. That
 means we have to convert every function part into some number, and then add it
@@ -190,17 +190,17 @@ The algorithm is pretty simple:
 
 1. Put all module's functions into the *worklist*.
 
-2. Scan *worklist*'s functions twice: first enumerate only strong functions and
+2. Scan *worklist*'s functions twice: first, enumerate only strong functions and
 then only weak ones:
 
    2.1. Loop body: take a function from *worklist*  (call it *FCur*) and try to
    insert it into *FnTree*: check whether *FCur* is equal to one of functions
    in *FnTree*. If there *is* an equal function in *FnTree*
-   (call it *FExists*): merge function *FCur* with *FExists*. Otherwise add
+   (call it *FExists*): merge function *FCur* with *FExists*. Otherwise, add
    the function from the *worklist* to *FnTree*.
 
 3. Once the *worklist* scanning and merging operations are complete, check the
-*Deferred* list. If it is not empty: refill the *worklist* contents with
+*Deferred* list. If it is not empty, refill the *worklist* contents with
 *Deferred* list and redo step 2, if the *Deferred* list is empty, then exit
 from method.
 
@@ -249,14 +249,14 @@ Below, we will use the following operations:
 
 The rest of the article is based on *MergeFunctions.cpp* source code
 (found in *<llvm_dir>/lib/Transforms/IPO/MergeFunctions.cpp*). We would like
-to ask reader to keep this file open, so we could use it as a reference
+to ask the reader to keep this file open, so we can use it as a reference
 for further explanations.
 
 Now, we're ready to proceed to the next chapter and see how it works.
 
 Functions comparison
 ====================
-At first, let's define how exactly we compare complex objects.
+First, let's define exactly how we compare complex objects.
 
 Complex object comparison (function, basic-block, etc) is mostly based on its
 sub-object comparison results. It is similar to the next "tree" objects
@@ -307,7 +307,7 @@ to those we met later in function body (value we met first would be *less*).
 This is done by “``FunctionComparator::cmpValues(const Value*, const Value*)``”
 method (will be described a bit later).
 
-4. Function body comparison. As it written in method comments:
+4. Function body comparison. As written in the method comments:
 
 “We do a CFG-ordered walk since the actual ordering of the blocks in the linked
 list is immaterial. Our walk starts at the entry block for both functions, then
@@ -477,7 +477,7 @@ Of course, we can combine insertion and comparison:
     = sn_mapR.insert(std::make_pair(Right, sn_mapR.size()));
   return cmpNumbers(LeftRes.first->second, RightRes.first->second);
 
-Let's look, how whole method could be implemented.
+Let's look at how the whole method could be implemented.
 
 1. We have to start with the bad news. Consider function self and
 cross-referencing cases:
@@ -519,7 +519,7 @@ the result of numbers comparison:
    if (LeftRes.first->second < RightRes.first->second) return -1;
    return 1;
 
-Now when *cmpValues* returns 0, we can proceed the comparison procedure.
+Now, when *cmpValues* returns 0, we can proceed with the comparison procedure.
 Otherwise, if we get (-1 or 1), we need to pass this result to the top level,
 and finish comparison procedure.
 
@@ -549,7 +549,7 @@ losslessly bitcasted to each other. The further explanation is modification of
    2.1.3.1. If types are vectors, compare their bitwidth using the
    *cmpNumbers*. If result is not 0, return it.
 
-   2.1.3.2. Different types, but not a vectors:
+   2.1.3.2. Different types, but not vectors:
 
    * if both of them are pointers, good for us, we can proceed to step 3.
    * if one of types is pointer, return result of *isPointer* flags
@@ -654,7 +654,7 @@ O(N*N) to O(log(N)).
 
 Merging process, mergeTwoFunctions
 ==================================
-Once *MergeFunctions* detected that current function (*G*) is equal to one that
+Once *MergeFunctions* detects that the current function (*G*) is equal to one that
 were analyzed before (function *F*) it calls ``mergeTwoFunctions(Function*,
 Function*)``.
 
@@ -664,7 +664,7 @@ Operation affects ``FnTree`` contents with next way: *F* will stay in
 functions that calls *G* would be put into ``Deferred`` set and removed from
 ``FnTree``, and analyzed again.
 
-The approach is next:
+The approach is as follows:
 
 1. Most wished case: when we can use alias and both of *F* and *G* are weak. We
 make both of them with aliases to the third strong function *H*. Actually *H*
@@ -691,12 +691,12 @@ ok: we can use alias to *F* instead of *G* or change call instructions itself.
 
 HasGlobalAliases, removeUsers
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-First consider the case when we have global aliases of one function name to
+First, consider the case when we have global aliases of one function name to
 another. Our purpose is  make both of them with aliases to the third strong
 function. Though if we keep *F* alive and without major changes we can leave it
 in ``FnTree``. Try to combine these two goals.
 
-Do stub replacement of *F* itself with an alias to *F*.
+Do a stub replacement of *F* itself with an alias to *F*.
 
 1. Create stub function *H*, with the same name and attributes like function
 *F*. It takes maximum alignment of *F* and *G*.
@@ -725,7 +725,7 @@ also have alias to *F*.
 
 No global aliases, replaceDirectCallers
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-If global aliases are not supported. We call ``replaceDirectCallers``. Just
+If global aliases are not supported, we call ``replaceDirectCallers``. Just
 go through all calls of *G* and replace it with calls of *F*. If you look into
 the method you will see that it scans all uses of *G* too, and if use is callee
 (if user is call instruction and *G* is used as what to be called), we replace
diff --git a/llvm/docs/NVPTXUsage.rst b/llvm/docs/NVPTXUsage.rst
index d28eb68..2dc8f9f 100644
--- a/llvm/docs/NVPTXUsage.rst
+++ b/llvm/docs/NVPTXUsage.rst
@@ -971,6 +971,10 @@ Syntax:
   declare void  @llvm.nvvm.prefetch.L1(ptr %ptr)
   declare void  @llvm.nvvm.prefetch.L2(ptr %ptr)
   
+  declare void  @llvm.nvvm.prefetch.tensormap.p0(ptr %ptr)
+  declare void  @llvm.nvvm.prefetch.tensormap.p4(ptr addrspace(4) %const_ptr)
+  declare void  @llvm.nvvm.prefetch.tensormap.p101(ptr addrspace(101) %param_ptr)
+
   declare void  @llvm.nvvm.prefetch.global.L2.evict.normal(ptr addrspace(1) %global_ptr)
   declare void  @llvm.nvvm.prefetch.global.L2.evict.last(ptr addrspace(1) %global_ptr)
 
@@ -983,7 +987,10 @@ The '``@llvm.nvvm.prefetch.*``' and '``@llvm.nvvm.prefetchu.*``' intrinsic
 correspond to the '``prefetch.*``;' and '``prefetchu.*``' family of PTX instructions. 
 The '``prefetch.*``' instructions bring the cache line containing the
 specified address in global or local memory address space into the 
-specified cache level (L1 or L2). The '`prefetchu.*``' instruction brings the cache line 
+specified cache level (L1 or L2). If the '``.tensormap``' qualifier is specified, then the
+prefetch instruction brings the cache line containing the specified address in the
+'``.const``' or '``.param``' memory state space for subsequent use by the '``cp.async.bulk.tensor``'
+instruction. The '``prefetchu.*``' instruction brings the cache line
 containing the specified generic address into the specified uniform cache level.
 If no address space is specified, it is assumed to be generic address. The intrinsic 
 uses and eviction priority which can be accessed by the '``.level::eviction_priority``' modifier.
diff --git a/llvm/docs/RISCVUsage.rst b/llvm/docs/RISCVUsage.rst
index a29e06c..f9f3e39 100644
--- a/llvm/docs/RISCVUsage.rst
+++ b/llvm/docs/RISCVUsage.rst
@@ -531,6 +531,10 @@ The current vendor extensions supported are:
 ``XAndesVDot``
   LLVM implements `version 5.0.0 of the Andes Vector Dot Product Extension specification <https://github.com/andestech/andes-v5-isa/releases/download/ast-v5_4_0-release/AndeStar_V5_ISA_Spec_UM165-v1.5.08-20250317.pdf>`__ by Andes Technology. All instructions are prefixed with `nds.` as described in the specification.
 
+``XSMTVDot``
+  SpacemiT defines the `Intrinsic Matrix Extension (IME) specification <https://github.com/space-mit/riscv-ime-extension-spec/releases/tag/v0429>`__.
+  LLVM implements the hardware-adapted subset for SpacemiT X60, defined in the `feature document <https://developer.spacemit.com/documentation?token=BWbGwbx7liGW21kq9lucSA6Vnpb#2.1>`__ by SpacemiT. All instructions are prefixed with `smt.` as described in the implementation guide. Note that the implemented subset is `version 1.0.0 of the SpacemiT Vector Dot Product Extension specification`, which is a strict subset of the full IME specification so that it correctly reflects the capabilities of the SpacemiT X60 hardware.
+
 Experimental C Intrinsics
 =========================
 
diff --git a/llvm/docs/Reference.rst b/llvm/docs/Reference.rst
index 35a6f59..7d0fdd7 100644
--- a/llvm/docs/Reference.rst
+++ b/llvm/docs/Reference.rst
@@ -17,6 +17,7 @@ LLVM and API reference documentation.
    CalleeTypeMetadata
    CIBestPractices
    CommandGuide/index
+   ContentAddressableStorage
    ConvergenceAndUniformity
    ConvergentOperations
    Coroutines
@@ -244,3 +245,6 @@ Additional Topics
 :doc:`MLGO`
    Facilities for ML-Guided Optimization, such as collecting IR corpora from a
    build, interfacing with ML models, an exposing features for training.
+
+:doc:`ContentAddressableStorage`
+   A reference guide for using LLVM's CAS library.
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index 0c49fc8..3b90c964 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -1,6 +1,9 @@
 <!-- This document is written in Markdown and uses extra directives provided by
 MyST (https://myst-parser.readthedocs.io/en/latest/). -->
 
+<!-- If you want to modify sections/contents permanently, you should modify both
+ReleaseNotes.md and ReleaseNotesTemplate.txt. -->
+
 LLVM {{env.config.release}} Release Notes
 =========================================
 
@@ -56,6 +59,10 @@ Makes programs 10x faster by doing Special New Thing.
 Changes to the LLVM IR
 ----------------------
 
+* The `ptrtoaddr` instruction was introduced. This instruction returns the
+  address component of a pointer type variable but, unlike `ptrtoint`, does not
+  capture provenance ([#125687](https://github.com/llvm/llvm-project/pull/125687)).
+
 Changes to LLVM infrastructure
 ------------------------------
 
@@ -73,6 +80,7 @@ Changes to Vectorizers
 
 * Added initial support for copyable elements in SLP, which models copyable
   elements as add <element>, 0, i.e. uses identity constants for missing lanes.
+* The SLP vectorizer supports initial recognition of FMA/FMAD patterns.
 
 Changes to the AArch64 Backend
 ------------------------------
@@ -104,6 +112,14 @@ Changes to the PowerPC Backend
 Changes to the RISC-V Backend
 -----------------------------
 
+* The loop vectorizer now performs tail folding by default on RISC-V, which
+  removes the need for a scalar epilogue loop. To restore the previous behaviour,
+  use `-prefer-predicate-over-epilogue=scalar-epilogue`.
+* `llvm-objdump` now has basic support for switching between disassembling code
+  and data using mapping symbols such as `$x` and `$d`. Switching architectures
+  using `$x` with an architecture string suffix is not yet supported.
+* Ssctr and Smctr extensions are no longer experimental.
+
 Changes to the WebAssembly Backend
 ----------------------------------
 
diff --git a/llvm/docs/ReleaseNotesTemplate.txt b/llvm/docs/ReleaseNotesTemplate.txt
new file mode 100644
index 0000000..ea1e490
--- /dev/null
+++ b/llvm/docs/ReleaseNotesTemplate.txt
@@ -0,0 +1,163 @@
+<!-- This document is written in Markdown and uses extra directives provided by
+MyST (https://myst-parser.readthedocs.io/en/latest/). -->
+
+<!-- If you want to modify sections/contents permanently, you should modify both
+ReleaseNotes.md and ReleaseNotesTemplate.txt. -->
+
+LLVM {{env.config.release}} Release Notes
+=========================================
+
+```{contents}
+```
+
+````{only} PreRelease
+```{warning} These are in-progress notes for the upcoming LLVM {{env.config.release}}
+             release. Release notes for previous releases can be found on
+             [the Download Page](https://releases.llvm.org/download.html).
+```
+````
+
+Introduction
+============
+
+This document contains the release notes for the LLVM Compiler Infrastructure,
+release {{env.config.release}}.  Here we describe the status of LLVM, including
+major improvements from the previous release, improvements in various subprojects
+of LLVM, and some of the current users of the code.  All LLVM releases may be
+downloaded from the [LLVM releases web site](https://llvm.org/releases/).
+
+For more information about LLVM, including information about the latest
+release, please check out the [main LLVM web site](https://llvm.org/).  If you
+have questions or comments, the [Discourse forums](https://discourse.llvm.org)
+are a good place to ask them.
+
+Note that if you are reading this file from a Git checkout or the main
+LLVM web page, this document applies to the *next* release, not the current
+one.  To see the release notes for a specific release, please see the
+[releases page](https://llvm.org/releases/).
+
+Non-comprehensive list of changes in this release
+=================================================
+
+<!-- For small 1-3 sentence descriptions, just add an entry at the end of
+this list. If your description won't fit comfortably in one bullet
+point (e.g. maybe you would like to give an example of the
+functionality, or simply have a lot to talk about), see the comment below
+for adding a new subsection. -->
+
+* ...
+
+<!-- If you would like to document a larger change, then you can add a
+subsection about it right here. You can copy the following boilerplate:
+
+Special New Feature
+-------------------
+
+Makes programs 10x faster by doing Special New Thing.
+-->
+
+Changes to the LLVM IR
+----------------------
+
+Changes to LLVM infrastructure
+------------------------------
+
+Changes to building LLVM
+------------------------
+
+Changes to TableGen
+-------------------
+
+Changes to Interprocedural Optimizations
+----------------------------------------
+
+Changes to Vectorizers
+----------------------
+
+Changes to the AArch64 Backend
+------------------------------
+
+Changes to the AMDGPU Backend
+-----------------------------
+
+Changes to the ARM Backend
+--------------------------
+
+Changes to the AVR Backend
+--------------------------
+
+Changes to the DirectX Backend
+------------------------------
+
+Changes to the Hexagon Backend
+------------------------------
+
+Changes to the LoongArch Backend
+--------------------------------
+
+Changes to the MIPS Backend
+---------------------------
+
+Changes to the PowerPC Backend
+------------------------------
+
+Changes to the RISC-V Backend
+-----------------------------
+
+Changes to the WebAssembly Backend
+----------------------------------
+
+Changes to the Windows Target
+-----------------------------
+
+Changes to the X86 Backend
+--------------------------
+
+Changes to the OCaml bindings
+-----------------------------
+
+Changes to the Python bindings
+------------------------------
+
+Changes to the C API
+--------------------
+
+Changes to the CodeGen infrastructure
+-------------------------------------
+
+Changes to the Metadata Info
+----------------------------
+
+Changes to the Debug Info
+-------------------------
+
+Changes to the LLVM tools
+-------------------------
+
+Changes to LLDB
+---------------
+
+Changes to BOLT
+---------------
+
+Changes to Sanitizers
+---------------------
+
+Other Changes
+-------------
+
+External Open Source Projects Using LLVM {{env.config.release}}
+===============================================================
+
+Additional Information
+======================
+
+A wide variety of additional information is available on the
+[LLVM web page](https://llvm.org/), in particular in the
+[documentation](https://llvm.org/docs/) section.  The web page also contains
+versions of the API documentation which are up-to-date with the Git version of
+the source code.  You can access versions of these documents specific to this
+release by going into the `llvm/docs/` directory in the LLVM tree.
+
+If you have any questions or comments about LLVM, please feel free to contact
+us via the [Discourse forums](https://discourse.llvm.org).
diff --git a/llvm/docs/SourceLevelDebugging.rst b/llvm/docs/SourceLevelDebugging.rst
index dfc8c53e..ea27ee5b 100644
--- a/llvm/docs/SourceLevelDebugging.rst
+++ b/llvm/docs/SourceLevelDebugging.rst
@@ -34,7 +34,7 @@ important ones are:
   the source-level-language.
 
 * Source-level languages are often **widely** different from one another.
-  LLVM should not put any restrictions of the flavor of the source-language,
+  LLVM should not put any restrictions on the flavor of the source-language,
   and the debugging information should work with any language.
 
 * With code generator support, it should be possible to use an LLVM compiler
@@ -74,10 +74,10 @@ from and inspired by DWARF, but it is feasible to translate into other target
 debug info formats such as STABS.
 
 SamplePGO (also known as `AutoFDO <https://gcc.gnu.org/wiki/AutoFDO>`_)
-is a variant of profile guided optimizations which uses hardware sampling based
+is a variant of profile-guided optimizations which uses hardware sampling based
 profilers to collect branch frequency data with low overhead in production
 environments. It relies on debug information to associate profile information
-to LLVM IR which is then used to guide optimization heuristics. Maintaining
+with LLVM IR which is then used to guide optimization heuristics. Maintaining
 deterministic and distinct source locations is necessary to maximize the
 accuracy of mapping hardware sample counts to LLVM IR. For example, DWARF
 `discriminators <https://wiki.dwarfstd.org/Path_Discriminators.md>`_ allow
@@ -334,7 +334,7 @@ performs the assignment, and the destination address.
 The first three arguments are the same as for a ``#dbg_value``. The fourth
 argument is a ``DIAssignID`` used to reference a store. The fifth is the
 destination of the store, the sixth is a `complex
-expression <LangRef.html#diexpression>`_ that modfies it, and the seventh is a
+expression <LangRef.html#diexpression>`_ that modifies it, and the seventh is a
 `source location <LangRef.html#dilocation>`_.
 
 See :doc:`AssignmentTracking` for more info.
@@ -512,7 +512,7 @@ Here ``!13`` is metadata providing `location information
 information parameter to the records indicates that the variable ``X`` is
 declared at line number 2 at a function level scope in function ``foo``.
 
-Now lets take another example.
+Now, let's take another example.
 
 .. code-block:: llvm
 
@@ -532,14 +532,14 @@ Here ``!18`` indicates that ``Z`` is declared at line number 5 and column
 number 11 inside of lexical scope ``!17``.  The lexical scope itself resides
 inside of subprogram ``!4`` described above.
 
-The scope information attached with each instruction provides a straightforward
+The scope information attached to each instruction provides a straightforward
 way to find instructions covered by a scope.
 
 Object lifetime in optimized code
 =================================
 
 In the example above, every variable assignment uniquely corresponds to a
-memory store to the variable's position on the stack. However in heavily
+memory store to the variable's position on the stack. However, in heavily
 optimized code LLVM promotes most variables into SSA values, which can
 eventually be placed in physical registers or memory locations. To track SSA
 values through compilation, when objects are promoted to SSA values a
@@ -628,7 +628,7 @@ perhaps, be optimized into the following code:
   }
 
 What ``#dbg_value`` records should be placed to represent the original variable
-locations in this code? Unfortunately the second, third and fourth
+locations in this code? Unfortunately, the second, third, and fourth
 #dbg_values for ``!1`` in the source function have had their operands
 (%tval, %fval, %merge) optimized out. Assuming we cannot recover them, we
 might consider this placement of #dbg_values:
@@ -696,7 +696,7 @@ How variable location metadata is transformed during CodeGen
 LLVM preserves debug information throughout mid-level and backend passes,
 ultimately producing a mapping between source-level information and
 instruction ranges. This
-is relatively straightforwards for line number information, as mapping
+is relatively straightforward for line number information, as mapping
 instructions to line numbers is a simple association. For variable locations
 however the story is more complex. As each ``#dbg_value`` record
 represents a source-level assignment of a value to a source variable, the
@@ -710,7 +710,7 @@ location fidelity are:
 2. Register allocation
 3. Block layout
 
-each of which are discussed below. In addition, instruction scheduling can
+each of which is discussed below. In addition, instruction scheduling can
 significantly change the ordering of the program, and occurs in a number of
 different passes.
 
@@ -782,13 +782,13 @@ And has the following operands:
    location operands, which may take any of the same values as the first
    operand of the ``DBG_VALUE`` instruction above. These variable location
    operands are inserted into the final DWARF Expression in positions indicated
-   by the DW_OP_LLVM_arg operator in the `DIExpression
+   by the ``DW_OP_LLVM_arg`` operator in the `DIExpression
    <LangRef.html#diexpression>`_.
 
 The position at which the DBG_VALUEs are inserted should correspond to the
 positions of their matching ``#dbg_value`` records in the IR block.  As
 with optimization, LLVM aims to preserve the order in which variable
-assignments occurred in the source program. However SelectionDAG performs some
+assignments occurred in the source program. However, SelectionDAG performs some
 instruction scheduling, which can reorder assignments (discussed below).
 Function parameter locations are moved to the beginning of the function if
 they're not already, to ensure they're immediately available on function entry.
@@ -855,19 +855,19 @@ If one compiles this IR with ``llc -o - -start-after=codegen-prepare -stop-after
     $eax = COPY %8, debug-location !5
     RET 0, $eax, debug-location !5
 
-Observe first that there is a DBG_VALUE instruction for every ``#dbg_value``
+Observe first that there is a ``DBG_VALUE`` instruction for every ``#dbg_value``
 record in the source IR, ensuring no source level assignments go missing.
 Then consider the different ways in which variable locations have been recorded:
 
 * For the first #dbg_value an immediate operand is used to record a zero value.
-* The #dbg_value of the PHI instruction leads to a DBG_VALUE of virtual register
+* The #dbg_value of the PHI instruction leads to a ``DBG_VALUE`` of virtual register
   ``%0``.
 * The first GEP has its effect folded into the first load instruction
   (as a 4-byte offset), but the variable location is salvaged by folding
-  the GEPs effect into the DIExpression.
+  the GEP's effect into the ``DIExpression``.
 * The second GEP is also folded into the corresponding load. However, it is
   insufficiently simple to be salvaged, and is emitted as a ``$noreg``
-  DBG_VALUE, indicating that the variable takes on an undefined location.
+  ``DBG_VALUE``, indicating that the variable takes on an undefined location.
 * The final #dbg_value has its Value placed in virtual register ``%1``.
 
 Instruction Scheduling
@@ -880,7 +880,7 @@ case the instruction sequence could be completely reversed. In such
 circumstances LLVM follows the principle applied to optimizations, that it is
 better for the debugger not to display any state than a misleading state.
 Thus, whenever instructions are advanced in order of execution, any
-corresponding DBG_VALUE is kept in its original position, and if an instruction
+corresponding ``DBG_VALUE`` is kept in its original position, and if an instruction
 is delayed then the variable is given an undefined location for the duration
 of the delay. To illustrate, consider this pseudo-MIR:
 
@@ -893,7 +893,7 @@ of the delay. To illustrate, consider this pseudo-MIR:
   %7:gr32 = SUB32rr %6, %5, implicit-def dead $eflags
   DBG_VALUE %7, $noreg, !5, !6
 
-Imagine that the SUB32rr were moved forward to give us the following MIR:
+Imagine that the ``SUB32rr`` were moved forward to give us the following MIR:
 
 .. code-block:: text
 
@@ -905,13 +905,13 @@ Imagine that the SUB32rr were moved forward to give us the following MIR:
   DBG_VALUE %7, $noreg, !5, !6
 
 In this circumstance LLVM would leave the MIR as shown above. Were we to move
-the DBG_VALUE of virtual register %7 upwards with the SUB32rr, we would re-order
+the ``DBG_VALUE`` of virtual register %7 upwards with the ``SUB32rr``, we would re-order
 assignments and introduce a new state of the program. Whereas with the solution
 above, the debugger will see one fewer combination of variable values, because
 ``!3`` and ``!5`` will change value at the same time. This is preferred over
 misrepresenting the original program.
 
-In comparison, if one sunk the MOV32rm, LLVM would produce the following:
+In comparison, if one sank the ``MOV32rm``, LLVM would produce the following:
 
 .. code-block:: text
 
@@ -924,10 +924,10 @@ In comparison, if one sunk the MOV32rm, LLVM would produce the following:
   DBG_VALUE %1, $noreg, !1, !2
 
 Here, to avoid presenting a state in which the first assignment to ``!1``
-disappears, the DBG_VALUE at the top of the block assigns the variable the
+disappears, the ``DBG_VALUE`` at the top of the block assigns the variable the
 undefined location, until its value is available at the end of the block where
-an additional DBG_VALUE is added. Were any other DBG_VALUE for ``!1`` to occur
-in the instructions that the MOV32rm was sunk past, the DBG_VALUE for ``%1``
+an additional ``DBG_VALUE`` is added. Were any other ``DBG_VALUE`` for ``!1`` to occur
+in the instructions that the ``MOV32rm`` was sunk past, the ``DBG_VALUE`` for ``%1``
 would be dropped and the debugger would never observe it in the variable. This
 accurately reflects that the value is not available during the corresponding
 portion of the original program.
@@ -937,13 +937,13 @@ Variable locations during Register Allocation
 
 To avoid debug instructions interfering with the register allocator, the
 LiveDebugVariables pass extracts variable locations from a MIR function and
-deletes the corresponding DBG_VALUE instructions. Some localized copy
+deletes the corresponding ``DBG_VALUE`` instructions. Some localized copy
 propagation is performed within blocks. After register allocation, the
-VirtRegRewriter pass re-inserts DBG_VALUE instructions in their original
+VirtRegRewriter pass re-inserts ``DBG_VALUE`` instructions in their original
 positions, translating virtual register references into their physical
 machine locations. To avoid encoding incorrect variable locations, in this
-pass any DBG_VALUE of a virtual register that is not live, is replaced by
-the undefined location. The LiveDebugVariables may insert redundant DBG_VALUEs
+pass any ``DBG_VALUE`` of a virtual register that is not live is replaced by
+the undefined location. The LiveDebugVariables pass may insert redundant ``DBG_VALUE`` instructions
 because of virtual register rewriting. These will be subsequently removed by
 the RemoveRedundantDebugValues pass.
 
@@ -956,11 +956,11 @@ LiveDebugValues pass runs to achieve two aims:
 * To propagate the location of variables through copies and register spills,
 * For every block, to record every valid variable location in that block.
 
-After this pass the DBG_VALUE instruction changes meaning: rather than
+After this pass the ``DBG_VALUE`` instruction changes meaning: rather than
 corresponding to a source-level assignment where the variable may change value,
 it asserts the location of a variable in a block, and loses effect outside the
 block. Propagating variable locations through copies and spills is
-straightforwards: determining the variable location in every basic block
+straightforward, but determining the variable location in every basic block
 requires the consideration of control flow. Consider the following IR, which
 presents several difficulties:
 
@@ -1021,9 +1021,9 @@ predecessors then that location is propagated into the successor. If the
 predecessor locations disagree, the location becomes undefined.
 
 Once LiveDebugValues has run, every block should have all valid variable
-locations described by DBG_VALUE instructions within the block. Very little
+locations described by ``DBG_VALUE`` instructions within the block. Very little
 effort is then required by supporting classes (such as
-DbgEntityHistoryCalculator) to build a map of each instruction to every
+``DbgEntityHistoryCalculator``) to build a map of each instruction to every
 valid variable location, without the need to consider control flow. From
 the example above, it is otherwise difficult to determine that the location
 of variable ``!30`` should flow "up" into block ``%bb1``, but that the location
@@ -1057,7 +1057,7 @@ helper functions in ``lib/IR/DIBuilder.cpp``.
 C/C++ source file information
 -----------------------------
 
-``llvm::Instruction`` provides easy access to metadata attached with an
+``llvm::Instruction`` provides easy access to metadata attached to an
 instruction.  One can extract line number information encoded in LLVM IR using
 ``Instruction::getDebugLoc()`` and ``DILocation::getLine()``.
 
@@ -1081,7 +1081,7 @@ added by the front-end but doesn't correspond to source code written by the user
   }
 
 At the end of the scope the MyObject's destructor is called but it isn't written
-explicitly. This information is useful to avoid to have counters on brackets when
+explicitly. This information is useful to avoid having counters on brackets when
 making code coverage.
 
 C/C++ global variable information
@@ -1147,11 +1147,11 @@ a C/C++ front-end would generate the following descriptors:
   !8 = !{!"clang version 4.0.0"}
 
 
-The align value in DIGlobalVariable description specifies variable alignment in
-case it was forced by C11 _Alignas(), C++11 alignas() keywords or compiler
-attribute __attribute__((aligned ())). In other case (when this field is missing)
+The align value in the ``DIGlobalVariable`` description specifies the variable's
+alignment when it was forced by the C11 ``_Alignas()`` or C++11 ``alignas()`` keywords or
+by the compiler attribute ``__attribute__((aligned ()))``. Otherwise (when this field is missing),
 alignment is considered default. This is used when producing DWARF output
-for DW_AT_alignment value.
+for the ``DW_AT_alignment`` value.
 
 C/C++ function information
 --------------------------
@@ -1200,7 +1200,7 @@ Given a class declaration with copy constructor declared as deleted:
      foo(const foo&) = deleted;
   };
 
-A C++ frontend would generate following:
+A C++ frontend would generate the following:
 
 .. code-block:: text
 
@@ -1247,7 +1247,7 @@ and this will materialize an additional DWARF attribute as:
      ...
      DW_AT_elemental [DW_FORM_flag_present]  (true)
 
-There are a few DWARF tags defined to represent Fortran specific constructs i.e DW_TAG_string_type for representing Fortran character(n). In LLVM this is represented as DIStringType.
+There are a few DWARF tags defined to represent Fortran-specific constructs, e.g. ``DW_TAG_string_type`` for representing Fortran ``character(n)``. In LLVM, this is represented as ``DIStringType``.
 
 .. code-block:: fortran
 
@@ -1260,7 +1260,7 @@ a Fortran front-end would generate the following descriptors:
   !DILocalVariable(name: "string", arg: 1, scope: !10, file: !3, line: 4, type: !15)
   !DIStringType(name: "character(*)!2", stringLength: !16, stringLengthExpression: !DIExpression(), size: 32)
 
-A fortran deferred-length character can also contain the information of raw storage of the characters in addition to the length of the string. This information is encoded in the  stringLocationExpression field. Based on this information, DW_AT_data_location attribute is emitted in a DW_TAG_string_type debug info.
+A Fortran deferred-length character can also carry information about the raw storage of the characters in addition to the length of the string. This information is encoded in the ``stringLocationExpression`` field. Based on this information, a ``DW_AT_data_location`` attribute is emitted in the ``DW_TAG_string_type`` debug info entry.
 
   !DIStringType(name: "character(*)!2", stringLengthExpression: !DIExpression(), stringLocationExpression: !DIExpression(DW_OP_push_object_address, DW_OP_deref), size: 32)
 
@@ -1300,28 +1300,28 @@ calls. This descriptor results in the following DWARF tag:
 Debugging information format
 ============================
 
-Debugging Information Extension for Objective C Properties
+Debugging Information Extension for Objective-C Properties
 ----------------------------------------------------------
 
 Introduction
 ^^^^^^^^^^^^
 
-Objective C provides a simpler way to declare and define accessor methods using
+Objective-C provides a simpler way to declare and define accessor methods using
 declared properties.  The language provides features to declare a property and
 to let compiler synthesize accessor methods.
 
-The debugger lets developer inspect Objective C interfaces and their instance
+The debugger lets developers inspect Objective-C interfaces and their instance
 variables and class variables.  However, the debugger does not know anything
-about the properties defined in Objective C interfaces.  The debugger consumes
+about the properties defined in Objective-C interfaces.  The debugger consumes
 information generated by compiler in DWARF format.  The format does not support
-encoding of Objective C properties.  This proposal describes DWARF extensions to
-encode Objective C properties, which the debugger can use to let developers
-inspect Objective C properties.
+encoding of Objective-C properties.  This proposal describes DWARF extensions to
+encode Objective-C properties, which the debugger can use to let developers
+inspect Objective-C properties.
 
 Proposal
 ^^^^^^^^
 
-Objective C properties exist separately from class members.  A property can be
+Objective-C properties exist separately from class members.  A property can be
 defined only by "setter" and "getter" selectors, and be calculated anew on each
 access.  Or a property can just be a direct access to some declared ivar.
 Finally it can have an ivar "automatically synthesized" for it by the compiler,
@@ -1397,7 +1397,7 @@ don't need to know this convention, since we are given the name of the ivar
 directly.
 
 Also, it is common practice in ObjC to have different property declarations in
-the @interface and @implementation - e.g. to provide a read-only property in
+the ``@interface`` and ``@implementation`` - e.g. to provide a read-only property in
 the interface, and a read-write interface in the implementation.  In that case,
 the compiler should emit whichever property declaration will be in force in the
 current translation unit.
@@ -1624,24 +1624,24 @@ The BUCKETS are an array of offsets to DATA for each hash:
 
 So for ``bucket[3]`` in the example above, we have an offset into the table
 0x000034f0 which points to a chain of entries for the bucket.  Each bucket must
-contain a next pointer, full 32 bit hash value, the string itself, and the data
+contain a next pointer, full 32-bit hash value, the string itself, and the data
 for the current string value.
 
 .. code-block:: none
 
               .------------.
   0x000034f0: | 0x00003500 | next pointer
-              | 0x12345678 | 32 bit hash
+              | 0x12345678 | 32-bit hash
               | "erase"    | string value
               | data[n]    | HashData for this bucket
               |------------|
   0x00003500: | 0x00003550 | next pointer
-              | 0x29273623 | 32 bit hash
+              | 0x29273623 | 32-bit hash
               | "dump"     | string value
               | data[n]    | HashData for this bucket
               |------------|
   0x00003550: | 0x00000000 | next pointer
-              | 0x82638293 | 32 bit hash
+              | 0x82638293 | 32-bit hash
               | "main"     | string value
               | data[n]    | HashData for this bucket
               `------------'
@@ -1650,17 +1650,17 @@ The problem with this layout for debuggers is that we need to optimize for the
 negative lookup case where the symbol we're searching for is not present.  So
 if we were to lookup "``printf``" in the table above, we would make a 32-bit
 hash for "``printf``", it might match ``bucket[3]``.  We would need to go to
-the offset 0x000034f0 and start looking to see if our 32 bit hash matches.  To
+the offset 0x000034f0 and start looking to see if our 32-bit hash matches.  To
 do so, we need to read the next pointer, then read the hash, compare it, and
 skip to the next bucket.  Each time we are skipping many bytes in memory and
-touching new pages just to do the compare on the full 32 bit hash.  All of
+touching new pages just to do the compare on the full 32-bit hash.  All of
 these accesses then tell us that we didn't have a match.
 
 Name Hash Tables
 """"""""""""""""
 
-To solve the issues mentioned above we have structured the hash tables a bit
-differently: a header, buckets, an array of all unique 32 bit hash values,
+To solve the issues mentioned above, we have structured the hash tables a bit
+differently: a header, buckets, an array of all unique 32-bit hash values,
 followed by an array of hash value data offsets, one for each hash value, then
 the data for all hash values:
 
@@ -1679,11 +1679,11 @@ the data for all hash values:
   `-------------'
 
 The ``BUCKETS`` in the name tables are an index into the ``HASHES`` array.  By
-making all of the full 32 bit hash values contiguous in memory, we allow
+making all of the full 32-bit hash values contiguous in memory, we allow
 ourselves to efficiently check for a match while touching as little memory as
-possible.  Most often checking the 32 bit hash values is as far as the lookup
+possible.  Most often checking the 32-bit hash values is as far as the lookup
 goes.  If it does match, it usually is a match with no collisions.  So for a
-table with "``n_buckets``" buckets, and "``n_hashes``" unique 32 bit hash
+table with "``n_buckets``" buckets, and "``n_hashes``" unique 32-bit hash
 values, we can clarify the contents of the ``BUCKETS``, ``HASHES`` and
 ``OFFSETS`` as:
 
@@ -1698,16 +1698,16 @@ values, we can clarify the contents of the ``BUCKETS``, ``HASHES`` and
   |  HEADER.header_data_len | uint32_t
   |  HEADER_DATA            | HeaderData
   |-------------------------|
-  |  BUCKETS                | uint32_t[n_buckets] // 32 bit hash indexes
+  |  BUCKETS                | uint32_t[n_buckets] // 32-bit hash indexes
   |-------------------------|
-  |  HASHES                 | uint32_t[n_hashes] // 32 bit hash values
+  |  HASHES                 | uint32_t[n_hashes] // 32-bit hash values
   |-------------------------|
-  |  OFFSETS                | uint32_t[n_hashes] // 32 bit offsets to hash value data
+  |  OFFSETS                | uint32_t[n_hashes] // 32-bit offsets to hash value data
   |-------------------------|
   |  ALL HASH DATA          |
   `-------------------------'
 
-So taking the exact same data from the standard hash example above we end up
+So taking the exact same data from the standard hash example above, we end up
 with:
 
 .. code-block:: none
@@ -1761,7 +1761,7 @@ with:
               |            |
               |------------|
   0x000034f0: | 0x00001203 | .debug_str ("erase")
-              | 0x00000004 | A 32 bit array count - number of HashData with name "erase"
+              | 0x00000004 | A 32-bit array count - number of HashData with name "erase"
               | 0x........ | HashData[0]
               | 0x........ | HashData[1]
               | 0x........ | HashData[2]
@@ -1769,18 +1769,18 @@ with:
               | 0x00000000 | String offset into .debug_str (terminate data for hash)
               |------------|
   0x00003500: | 0x00001203 | String offset into .debug_str ("collision")
-              | 0x00000002 | A 32 bit array count - number of HashData with name "collision"
+              | 0x00000002 | A 32-bit array count - number of HashData with name "collision"
               | 0x........ | HashData[0]
               | 0x........ | HashData[1]
               | 0x00001203 | String offset into .debug_str ("dump")
-              | 0x00000003 | A 32 bit array count - number of HashData with name "dump"
+              | 0x00000003 | A 32-bit array count - number of HashData with name "dump"
               | 0x........ | HashData[0]
               | 0x........ | HashData[1]
               | 0x........ | HashData[2]
               | 0x00000000 | String offset into .debug_str (terminate data for hash)
               |------------|
   0x00003550: | 0x00001203 | String offset into .debug_str ("main")
-              | 0x00000009 | A 32 bit array count - number of HashData with name "main"
+              | 0x00000009 | A 32-bit array count - number of HashData with name "main"
               | 0x........ | HashData[0]
               | 0x........ | HashData[1]
               | 0x........ | HashData[2]
@@ -1795,13 +1795,13 @@ with:
 
 So we still have all of the same data, we just organize it more efficiently for
 debugger lookup.  If we repeat the same "``printf``" lookup from above, we
-would hash "``printf``" and find it matches ``BUCKETS[3]`` by taking the 32 bit
+would hash "``printf``" and find it matches ``BUCKETS[3]`` by taking the 32-bit
 hash value and modulo it by ``n_buckets``.  ``BUCKETS[3]`` contains "6" which
 is the index into the ``HASHES`` table.  We would then compare any consecutive
-32 bit hashes values in the ``HASHES`` array as long as the hashes would be in
+32-bit hash values in the ``HASHES`` array as long as the hashes would be in
 ``BUCKETS[3]``.  We do this by verifying that each subsequent hash value modulo
 ``n_buckets`` is still 3.  In the case of a failed lookup we would access the
-memory for ``BUCKETS[3]``, and then compare a few consecutive 32 bit hashes
+memory for ``BUCKETS[3]``, and then compare a few consecutive 32-bit hashes
 before we know that we have no match.  We don't end up marching through
 multiple words of memory and we really keep the number of processor data cache
 lines being accessed as small as possible.
@@ -1842,10 +1842,10 @@ header is:
     HeaderData header_data;     // Implementation specific header data
   };
 
-The header starts with a 32 bit "``magic``" value which must be ``'HASH'``
+The header starts with a 32-bit "``magic``" value which must be ``'HASH'``
 encoded as an ASCII integer.  This allows the detection of the start of the
 hash table and also allows the table's byte order to be determined so the table
-can be correctly extracted.  The "``magic``" value is followed by a 16 bit
+can be correctly extracted.  The "``magic``" value is followed by a 16-bit
 ``version`` number which allows the table to be revised and modified in the
 future.  The current version number is 1. ``hash_function`` is a ``uint16_t``
 enumeration that specifies which hash function was used to produce this table.
@@ -1858,8 +1858,8 @@ The current values for the hash function enumerations include:
     eHashFunctionDJB = 0u, // Daniel J Bernstein hash function
   };
 
-``bucket_count`` is a 32 bit unsigned integer that represents how many buckets
-are in the ``BUCKETS`` array.  ``hashes_count`` is the number of unique 32 bit
+``bucket_count`` is a 32-bit unsigned integer that represents how many buckets
+are in the ``BUCKETS`` array.  ``hashes_count`` is the number of unique 32-bit
 hash values that are in the ``HASHES`` array, and is the same number of offsets
 are contained in the ``OFFSETS`` array.  ``header_data_len`` specifies the size
 in bytes of the ``HeaderData`` that is filled in by specialized versions of
@@ -1875,12 +1875,12 @@ The header is followed by the buckets, hashes, offsets, and hash value data.
   struct FixedTable
   {
     uint32_t buckets[Header.bucket_count];  // An array of hash indexes into the "hashes[]" array below
-    uint32_t hashes [Header.hashes_count];  // Every unique 32 bit hash for the entire table is in this table
+    uint32_t hashes [Header.hashes_count];  // Every unique 32-bit hash for the entire table is in this table
     uint32_t offsets[Header.hashes_count];  // An offset that corresponds to each item in the "hashes[]" array above
   };
 
-``buckets`` is an array of 32 bit indexes into the ``hashes`` array.  The
-``hashes`` array contains all of the 32 bit hash values for all names in the
+``buckets`` is an array of 32-bit indexes into the ``hashes`` array.  The
+``hashes`` array contains all of the 32-bit hash values for all names in the
 hash table.  Each hash in the ``hashes`` table has an offset in the ``offsets``
 array that points to the data for the hash value.
 
@@ -1966,23 +1966,23 @@ array to be:
   HeaderData.atoms[0].type = eAtomTypeDIEOffset;
   HeaderData.atoms[0].form = DW_FORM_data4;
 
-This defines the contents to be the DIE offset (eAtomTypeDIEOffset) that is
-encoded as a 32 bit value (DW_FORM_data4).  This allows a single name to have
+This defines the contents to be the DIE offset (``eAtomTypeDIEOffset``) that is
+encoded as a 32-bit value (``DW_FORM_data4``).  This allows a single name to have
 multiple matching DIEs in a single file, which could come up with an inlined
 function for instance.  Future tables could include more information about the
 DIE such as flags indicating if the DIE is a function, method, block,
 or inlined.
 
-The KeyType for the DWARF table is a 32 bit string table offset into the
+The KeyType for the DWARF table is a 32-bit string table offset into the
 ".debug_str" table.  The ".debug_str" is the string table for the DWARF which
 may already contain copies of all of the strings.  This helps make sure, with
 help from the compiler, that we reuse the strings between all of the DWARF
 sections and keeps the hash table size down.  Another benefit to having the
-compiler generate all strings as DW_FORM_strp in the debug info, is that
+compiler generate all strings as ``DW_FORM_strp`` in the debug info is that
 DWARF parsing can be made much faster.
 
 After a lookup is made, we get an offset into the hash data.  The hash data
-needs to be able to deal with 32 bit hash collisions, so the chunk of data
+needs to be able to deal with 32-bit hash collisions, so the chunk of data
 at the offset in the hash data consists of a triple:
 
 .. code-block:: c
@@ -1992,7 +1992,7 @@ at the offset in the hash data consists of a triple:
   HashData[hash_data_count]
 
 If "str_offset" is zero, then the bucket contents are done. 99.9% of the
-hash data chunks contain a single item (no 32 bit hash collision):
+hash data chunks contain a single item (no 32-bit hash collision):
 
 .. code-block:: none
 
@@ -2025,7 +2025,7 @@ If there are collisions, you will have multiple valid string offsets:
   `------------'
 
 Current testing with real world C++ binaries has shown that there is around 1
-32 bit hash collision per 100,000 name entries.
+32-bit hash collision per 100,000 name entries.
 
 Contents
 ^^^^^^^^
@@ -2114,7 +2114,7 @@ We get a few type DIEs:
                   AT_type( {0x00000067} ( int ) )
                   AT_byte_size( 0x08 )
 
-The DW_TAG_pointer_type is not included because it does not have a ``DW_AT_name``.
+The ``DW_TAG_pointer_type`` is not included because it does not have a ``DW_AT_name``.
 
 "``.apple_namespaces``" section should contain all ``DW_TAG_namespace`` DIEs.
 If we run into a namespace that has no name this is an anonymous namespace, and
diff --git a/llvm/docs/TestingGuide.rst b/llvm/docs/TestingGuide.rst
index b1819c7..f79d7ea 100644
--- a/llvm/docs/TestingGuide.rst
+++ b/llvm/docs/TestingGuide.rst
@@ -834,7 +834,7 @@ Besides replacing LLVM tool names, the following substitutions are performed in
    Invokes the Clang driver.
 
 ``%clang_cpp``
-   Invokes the Clang driver for C++.
+   Invokes the Clang driver as the preprocessor.
 
 ``%clang_cl``
    Invokes the CL-compatible Clang driver.