Diffstat (limited to 'llvm/docs')
-rw-r--r--  llvm/docs/AArch64SME.rst       24
-rw-r--r--  llvm/docs/HowToBuildOnARM.rst  18
-rw-r--r--  llvm/docs/LangRef.rst          18
-rw-r--r--  llvm/docs/MergeFunctions.rst   46
-rw-r--r--  llvm/docs/ReleaseNotes.md       1
5 files changed, 54 insertions, 53 deletions
diff --git a/llvm/docs/AArch64SME.rst b/llvm/docs/AArch64SME.rst
index 47ed7bc..327f9dc 100644
--- a/llvm/docs/AArch64SME.rst
+++ b/llvm/docs/AArch64SME.rst
@@ -124,7 +124,7 @@ In this table, we use the following abbreviations:
 either 0 or 1 on entry, and is unchanged on return).
 
 Functions with ``__attribute__((arm_locally_streaming))`` are excluded from this
-table because for the caller the attribute is synonymous to 'streaming', and
+table because for the caller the attribute is synonymous with 'streaming', and
 for the callee it is merely an implementation detail that is explicitly not
 exposed to the caller.
 
@@ -158,7 +158,7 @@ the function's body, so that it can place the mode changes in exactly the right
 position. The suitable place to do this seems to be SelectionDAG, where it lowers
 the call's arguments/return values to implement the specified calling convention.
 SelectionDAG provides Chains and Glue to specify the order of operations and give
-preliminary control over the instruction's scheduling.
+preliminary control over instruction scheduling.
 
 
 Example of preserving state
@@ -232,8 +232,8 @@ implement transitions from ``SC -> N`` and ``SC -> S``.
 Unchained Function calls
 ------------------------
 When a function with "``aarch64_pstate_sm_enabled``" calls a function that is not
-streaming compatible, the compiler has to insert a SMSTOP before the call and
-insert a SMSTART after the call.
+streaming compatible, the compiler has to insert an SMSTOP before the call and
+insert an SMSTART after the call.
 
 If the function that is called is an intrinsic with no side-effects which in
 turn is lowered to a function call (e.g., ``@llvm.cos()``), then the call to
@@ -388,7 +388,7 @@ The value of PSTATE.SM is not controlled by the feature flags, but rather by the
 function attributes. This means that we can compile for '``+sme``', and the
 compiler will code-generate any instructions, even if they are not legal under
 the requested streaming mode. The compiler needs to use the function attributes to ensure the
-compiler doesn't do transformations under the assumption that certain operations
+compiler doesn't perform transformations under the assumption that certain operations
 are available at runtime.
 
 We made a conscious choice not to model this with feature flags because we
@@ -399,11 +399,11 @@ and `D121208 <https://reviews.llvm.org/D121208>`_) because of limitations in
 TableGen.
 
 As a first step, this means we'll disable vectorization (LoopVectorize/SLP)
-entirely when the a function has either of the ``aarch64_pstate_sm_enabled``,
+entirely when a function has either of the ``aarch64_pstate_sm_enabled``,
 ``aarch64_pstate_sm_body`` or ``aarch64_pstate_sm_compatible`` attributes,
 in order to avoid the use of vector instructions.
 
-Later on we'll aim to relax these restrictions to enable scalable
+Later on, we'll aim to relax these restrictions to enable scalable
 auto-vectorization with a subset of streaming-compatible instructions, but that
 requires changes to the CostModel, Legalization and SelectionDAG lowering.
 
@@ -416,7 +416,7 @@ Other things to consider
 ------------------------
 
 * Inlining must be disabled when the call-site needs to toggle PSTATE.SM or
-  when the callee's function body is executed in a different streaming mode than
+  when the callee's function body is executed in a different streaming mode from
   its caller. This is needed because function calls are the boundaries for
   streaming mode changes.
 
@@ -434,8 +434,8 @@ lazy-save mechanism for calls to private-ZA functions (i.e. functions that may
 either directly or indirectly clobber ZA state).
 
 For the purpose of handling functions marked with ``aarch64_new_za``,
-we have introduced a new LLVM IR pass (SMEABIPass) that is run just before
-SelectionDAG. Any such functions dealt with by this pass are marked with
+we have introduced a new LLVM IR pass (SMEABIPass) that runs just before
+SelectionDAG. Any such functions handled by this pass are marked with
 ``aarch64_expanded_pstate_za``.
 
 Setting up a lazy-save
@@ -458,7 +458,7 @@ AArch64 Predicate-as-Counter Type
 The predicate-as-counter type represents the type of a predicate-as-counter
 value held in an AArch64 SVE predicate register. Such a value contains
 information about the number of active lanes, the element width and a bit that
-tells whether the generated mask should be inverted. ACLE intrinsics should be
+indicates whether the generated mask should be inverted. ACLE intrinsics should be
 used to move the predicate-as-counter value to/from a predicate vector.
 
 There are certain limitations on the type:
@@ -466,7 +466,7 @@ There are certain limitations on the type:
 * The type can be used for function parameters and return values.
 
 * The supported LLVM operations on this type are limited to ``load``, ``store``,
-  ``phi``, ``select`` and ``alloca`` instructions.
+  ``phi``, ``select``, and ``alloca`` instructions.
 
 The predicate-as-counter type is a scalable type.
diff --git a/llvm/docs/HowToBuildOnARM.rst b/llvm/docs/HowToBuildOnARM.rst
index 9eb6b5a..30e3744 100644
--- a/llvm/docs/HowToBuildOnARM.rst
+++ b/llvm/docs/HowToBuildOnARM.rst
@@ -23,10 +23,10 @@ on the ARMv6 and ARMv7 architectures and may be inapplicable to older chips.
    choices when using CMake. Autoconf usage is deprecated as of 3.8.
 
    Building LLVM/Clang in ``Release`` mode is preferred since it consumes
-   a lot less memory. Otherwise, the building process will very likely
+   a lot less memory. Otherwise, the build process will very likely
    fail due to insufficient memory. It's also a lot quicker to only build
    the relevant back-ends (ARM and AArch64), since it's very unlikely that
-   you'll use an ARM board to cross-compile to other arches. If you're
+   you'll use an ARM board to cross-compile to other architectures. If you're
    running Compiler-RT tests, also include the x86 back-end, or some tests
    will fail.
@@ -48,15 +48,15 @@ on the ARMv6 and ARMv7 architectures and may be inapplicable to older chips.
    ``make -jN check-all`` or ``ninja check-all`` will run all compiler tests. For
    running the test suite, please refer to :doc:`TestingGuide`.
 
-#. If you are building LLVM/Clang on an ARM board with 1G of memory or less,
-   please use ``gold`` rather then GNU ``ld``. In any case it is probably a good
+#. If you are building LLVM/Clang on an ARM board with 1 GB of memory or less,
+   please use ``gold`` rather than GNU ``ld``. In any case, it is probably a good
    idea to set up a swap partition, too.
 
    .. code-block:: bash
 
      $ sudo ln -sf /usr/bin/ld /usr/bin/ld.gold
 
-#. ARM development boards can be unstable and you may experience that cores
+#. ARM development boards can be unstable, and you may experience that cores
    are disappearing, caches being flushed on every big.LITTLE switch, and
    other similar issues. To help ease the effect of this, set the Linux
    scheduler to "performance" on **all** cores using this little script:
@@ -73,12 +73,12 @@ on the ARMv6 and ARMv7 architectures and may be inapplicable to older chips.
    problems.
 
 #. Running the build on SD cards is ok, but they are more prone to failures
-   than good quality USB sticks, and those are more prone to failures than
-   external hard-drives (those are also a lot faster). So, at least, you
+   than good-quality USB sticks, and those are more prone to failures than
+   external hard drives (those are also a lot faster). So, at least, you
    should consider to buy a fast USB stick. On systems with a fast eMMC,
    that's a good option too.
 
 #. Make sure you have a decent power supply (dozens of dollars worth) that can
-   provide *at least* 4 amperes, this is especially important if you use USB
-   devices with your board. Externally powered USB/SATA harddrives are even
+   provide *at least* 4 amperes. This is especially important if you use USB
+   devices with your board. Externally powered USB/SATA hard drives are even
    better than having a good power supply.
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 8b6c25c..4884e2d 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -21074,12 +21074,12 @@ Overview:
 """""""""
 
 The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
 matrix using a stride of ``%Stride`` to compute the start address of the
-different columns. The offset is computed using ``%Stride``'s bitwidth. This
-allows for convenient loading of sub matrixes. If ``<IsVolatile>`` is true, the
-intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
-matrix is returned in the result vector. If the ``%Ptr`` argument is known to
-be aligned to some boundary, this can be specified as an attribute on the
-argument.
+different columns. This allows for convenient loading of sub matrixes.
+Independent of ``%Stride``'s bitwidth, the offset is computed using the target
+data layout's pointer index type. If ``<IsVolatile>`` is true, the intrinsic is
+considered a :ref:`volatile memory access <volatile>`. The result matrix is
+returned in the result vector. If the ``%Ptr`` argument is known to be aligned
+to some boundary, this can be specified as an attribute on the argument.
 
 Arguments:
@@ -21114,9 +21114,9 @@ Overview:
 """""""""
 
 The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
 <Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
-columns. The offset is computed using ``%Stride``'s bitwidth. If
-``<IsVolatile>`` is true, the intrinsic is considered a
-:ref:`volatile memory access <volatile>`.
+columns. Independent of ``%Stride``'s bitwidth, the offset is computed using
+the target data layout's pointer index type. If ``<IsVolatile>`` is true, the
+intrinsic is considered a :ref:`volatile memory access <volatile>`.
 If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
 specified as an attribute on the argument.
diff --git a/llvm/docs/MergeFunctions.rst b/llvm/docs/MergeFunctions.rst
index c27f603..d43b9c3 100644
--- a/llvm/docs/MergeFunctions.rst
+++ b/llvm/docs/MergeFunctions.rst
@@ -9,13 +9,13 @@ Introduction
 ============
 Sometimes code contains equal functions, or functions that do exactly the same
 thing even though they are non-equal on the IR level (e.g.: multiplication on 2
-and 'shl 1'). It could happen due to several reasons: mainly, the usage of
+and 'shl 1'). This can happen for several reasons: mainly, the usage of
 templates and automatic code generators. Though, sometimes the user itself could
 write the same thing twice :-)
 
 The main purpose of this pass is to recognize such functions and merge them.
 
-This document is the extension to pass comments and describes the pass logic. It
+This document is an extension to pass comments and describes the pass logic. It
 describes the algorithm used to compare functions and explains how we could
 combine equal functions correctly to keep the module valid.
 
@@ -54,7 +54,7 @@ As a good starting point, the Kaleidoscope tutorial can be used:
 
 :doc:`tutorial/index`
 
-It's especially important to understand chapter 3 of tutorial:
+It's especially important to understand Chapter 3 of the tutorial:
 
 :doc:`tutorial/LangImpl03`
 
@@ -314,7 +314,7 @@ list is immaterial. Our walk starts at the entry block for both functions, then
 takes each block from each terminator in order. As an artifact, this also means
 that unreachable blocks are ignored.”
 
-So, using this walk we get BBs from *left* and *right* in the same order, and
+So, using this walk, we get BBs from *left* and *right* in the same order, and
 compare them by “``FunctionComparator::compare(const BasicBlock*, const
 BasicBlock*)``” method.
 
@@ -325,17 +325,17 @@ FunctionComparator::cmpType
 ---------------------------
 Consider how type comparison works.
 
-1. Coerce pointer to integer. If left type is a pointer, try to coerce it to the
+1. Coerce pointer to integer. If the left type is a pointer, try to coerce it to the
    integer type. It could be done if its address space is 0, or if address
    spaces are ignored at all. Do the same thing for the right type.
 
-2. If left and right types are equal, return 0. Otherwise we need to give
+2. If the left and right types are equal, return 0. Otherwise, we need to give
    preference to one of them. So proceed to the next step.
 
-3. If types are of different kind (different type IDs). Return result of type
+3. If the types are of different kind (different type IDs). Return result of type
    IDs comparison, treating them as numbers (use ``cmpNumbers`` operation).
 
-4. If types are vectors or integers, return result of their pointers comparison,
+4. If the types are vectors or integers, return result of their pointers comparison,
    comparing them as numbers.
 
 5. Check whether type ID belongs to the next group (call it equivalent-group):
@@ -391,7 +391,7 @@ equal to the corresponding part of *right* place, and (!) both parts use
 
 So, now our conclusion depends on *Value* instances comparison.
 
-The main purpose of this method is to determine relation between such values.
+The main purpose of this method is to determine the relation between such values.
 
 What can we expect from equal functions? At the same place, in functions
 "*FL*" and "*FR*" we expect to see *equal* values, or values *defined* at
@@ -453,17 +453,17 @@ maps (one for the left side, another one for the right side):
 
 ``map<Value, int> sn_mapL, sn_mapR;``
 
-The key of the map is the *Value* itself, the *value* – is its order (call it
+The key of the map is the *Value* itself; the *value* – is its order (call it
 *serial number*).
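The serial-number bookkeeping this hunk describes can be sketched in Python (an illustration only — the pass itself uses the C++ ``std::map`` shown in the surrounding text; the value names are hypothetical):

```python
# Sketch of FunctionComparator's serial-number maps: each value is
# assigned the map's current size on first sight, so values are
# numbered in order of first appearance (0, 1, 2, ...).

def serial_number(sn_map, value):
    """Return the serial number of `value`, assigning the next one if new."""
    if value not in sn_map:
        sn_map[value] = len(sn_map)
    return sn_map[value]

def cmp_numbers(l, r):
    """Mirror of cmpNumbers: -1, 0, or 1."""
    return (l > r) - (l < r)

# One map per side being compared; values are just names here.
sn_mapL, sn_mapR = {}, {}

# The left function defines %a then %b; the right defines %x then %y.
for v in ["%a", "%b"]:
    serial_number(sn_mapL, v)
for v in ["%x", "%y"]:
    serial_number(sn_mapR, v)

# %b and %y were each the second value seen on their side, so they
# compare equal (0); %b vs %x appeared at different positions, so the
# comparison is non-zero.
print(cmp_numbers(sn_mapL["%b"], sn_mapR["%y"]))  # 0
print(cmp_numbers(sn_mapL["%b"], sn_mapR["%x"]))  # 1
```

This mirrors the ``sn_map.insert(std::make_pair(V, sn_map.size()))`` idiom: insertion is a no-op for already-seen values, which is what makes the numbering stable.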
 
 To add value *V* we need to perform the next procedure:
 
 ``sn_map.insert(std::make_pair(V, sn_map.size()));``
 
-For the first *Value*, map will return *0*, for the second *Value* map will
+For the first *Value*, the map will return *0*, for the second *Value*, the map will
 return *1*, and so on.
 
-We can then check whether left and right values met at the same time with
+We can then check whether the left and right values met at the same time with
 a simple comparison:
 
 ``cmpNumbers(sn_mapL[Left], sn_mapR[Right]);``
 
@@ -525,7 +525,7 @@ and finish comparison procedure.
 
 cmpConstants
 ------------
-Performs constants comparison as follows:
+Performs a constant comparison as follows:
 
 1. Compare constant types using ``cmpType`` method. If the result is -1 or 1,
    goto step 2, otherwise proceed to step 3.
 
@@ -655,10 +655,10 @@ O(N*N) to O(log(N)).
 Merging process, mergeTwoFunctions
 ==================================
 Once *MergeFunctions* detects that current function (*G*) is equal to one that
-were analyzed before (function *F*) it calls ``mergeTwoFunctions(Function*,
+was analyzed before (function *F*) it calls ``mergeTwoFunctions(Function*,
 Function*)``.
 
-Operation affects ``FnTree`` contents with next way: *F* will stay in
+Operation affects ``FnTree`` contents in the following way: *F* will stay in
 ``FnTree``. *G* being equal to *F* will not be added to ``FnTree``. Calls of
 *G* would be replaced with something else. It changes bodies of callers. So,
 functions that calls *G* would be put into ``Deferred`` set and removed from
 
@@ -692,8 +692,8 @@ ok: we can use alias to *F* instead of *G* or change call instructions itself.
 HasGlobalAliases, removeUsers
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 First, consider the case when we have global aliases of one function name to
-another. Our purpose is make both of them with aliases to the third strong
-function. Though if we keep *F* alive and without major changes we can leave it
+another. Our purpose is to make both of them with aliases to the third strong
+function. However, if we keep *F* alive and without major changes, we can leave it
 in ``FnTree``. Try to combine these two goals.
 
 Do a stub replacement of *F* itself with an alias to *F*.
@@ -701,10 +701,10 @@ Do a stub replacement of *F* itself with an alias to *F*.
 1. Create stub function *H*, with the same name and attributes like function
    *F*. It takes maximum alignment of *F* and *G*.
 
-2. Replace all uses of function *F* with uses of function *H*. It is the two
-steps procedure instead. First of all, we must take into account, all functions
-from whom *F* is called would be changed: since we change the call argument
-(from *F* to *H*). If so we must to review these caller functions again after
+2. Replace all uses of function *F* with uses of function *H*. It is a
+two-step procedure instead. First of all, we must take into account that all functions
+that call *F* would be changed because we change the call argument
+(from *F* to *H*). If so, we must review these caller functions again after
 this procedure. We remove callers from ``FnTree``, method with name
 ``removeUsers(F)`` does that (don't confuse with ``replaceAllUsesWith``):
 
@@ -735,7 +735,7 @@ If “F” could not be overridden, fix it!
 """""""""""""""""""""""""""""""""""""""
 
 We call ``writeThunkOrAlias(Function *F, Function *G)``. Here we try to replace
-*G* with alias to *F* first. The next conditions are essential:
+*G* with an alias to *F* first. The next conditions are essential:
 
 * target should support global aliases,
 * the address itself of *G* should be not significant, not named and not
 
@@ -775,7 +775,7 @@ with bitcast(F). Deletes G.”
 In general it does the same as usual when we want to replace callee, except
 the first point:
 
-1. We generate tail call wrapper around *F*, but with interface that allows use
+1. We generate tail call wrapper around *F*, but with an interface that allows using
“As-usual”: ``removeUsers`` and ``replaceAllUsesWith`` then. diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md index 79d93d0..30aeccd 100644 --- a/llvm/docs/ReleaseNotes.md +++ b/llvm/docs/ReleaseNotes.md @@ -147,6 +147,7 @@ Changes to the C API -------------------- * Add `LLVMGetOrInsertFunction` to get or insert a function, replacing the combination of `LLVMGetNamedFunction` and `LLVMAddFunction`. +* Allow `LLVMGetVolatile` to work with any kind of Instruction. Changes to the CodeGen infrastructure ------------------------------------- |