Age | Commit message (Collapse) | Author | Files | Lines |
|
For targets that support xnack replay feature (gfx8+), the
multi-dword scalar loads shouldn't clobber any register that
holds the src address. The constraint version of the scalar
loads have the early clobber flag attached to the dst operand
to restrict RA from re-allocating any of the src regs for its
dst operand.
|
|
Consider the constrained multi-dword loads while merging
individual loads to a single multi-dword load.
|
|
|
|
Load from null is UB, load from pointer arg instead.
|
|
trailing dimensions. (#92934)
Generalizes `DropUnitDimFromElementwiseOps` to support inner unit
dimensions.
This change stems from improving lowering of contractionOps for Arm SME.
Where we end up with inner unit dimensions on MulOp, BroadcastOp and
TransposeOp, preventing the generation of outerproducts.
discussed
[here](https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543/17?u=nujaa).
---------
Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>
|
|
This re-uses reduction declarations from intrinsic operators to add
support for reductions of allocatables, pointers, and arrays with
procedure designators (e.g. min/max).
I have split this into two commits to make it easier to review. The
first one makes the functional change. The second cleans things up now
that we can share much more code between intrinsic operators and
procedure designators.
|
|
Previously, a symbol insertion requires (at least) three hash table
operations:
- Lookup/create entry in Symbols (main symbol table)
- Lookup NextUniqueID to deduplicate identical temporary labels
- Add entry to UsedNames, which is also used to serve as storage for the
symbol name in the MCSymbol.
All three lookups are done with the same name, so combining these into a
single table reduces the number of lookups to one. Thus, a pointer to a
symbol table entry can be passed to createSymbol to avoid a duplicate
lookup of the same name.
The new symbol table entry value is placed in a separate header to avoid
including MCContext in MCSymbol or vice versa.
|
|
|
|
Instruction-creation (#94226)
This patch simplifies instruction creation by replacing all overloads of
instruction constructors/Create methods that are identical other than
the Instruction *InsertBefore/BasicBlock *InsertAtEnd/BasicBlock::iterator
InsertBefore argument with a single version that takes an InsertPosition
argument. The InsertPosition class can be implicitly constructed from
any of the above, internally converting them to the appropriate
BasicBlock::iterator value which can then be used to insert the
instruction (or to not insert it if an invalid iterator is passed).
The upshot of this is that code will be deduplicated, and all callsites
will switch to calling the new unified version without any changes
needed to make the compiler happy. There is at least one exception to
this; the construction of InsertPosition is a user-defined conversion,
so any caller that was already relying on a different user-defined
conversion won't work. In all of LLVM and Clang this happens exactly
once: at clang/lib/CodeGen/CGExpr.cpp:123 we try to construct an alloca
with an AssertingVH<Instruction> argument, which must now be cast to an
Instruction* by using `&*`. If this is more common elsewhere, it could
be fixed by adding an appropriate constructor to InsertPosition.
|
|
Example:
```mlir
%mask = vector.create_mask %a, %b : vector<[4]x[8]xi1>
%slice = vector.extract %mask[%index]
: vector<[8]xi1> from vector<[4]x[8]xi1>
```
Becomes:
```mlir
%mask_rows = vector.create_mask %a : vector<[4]xi1>
%mask_cols = vector.create_mask %b : vector<[8]xi1>
%slice = arm_sve.psel %mask_cols, %mask_rows[%index]
: vector<[8]xi1>, vector<[4]xi1>
```
Note: While psel is under ArmSVE it requires SME (or SVE 2.1), so this
is currently the most logical place for this lowering.
|
|
|
|
(#89944)
The ABI mandates two things related to function calls:
- Function arguments must be sign- or zero-extended to the register
size by the caller.
- Return values must be sign- or zero-extended to the register size by
the callee.
As consequence, callees can assume that function arguments have been
extended and so can callers with regards to return values.
Here lies the problem: Nonsecure code might deliberately ignore this
mandate with the intent of attempting an exploit. It might try to pass
values that lie outside the expected type's value range in order to
trigger undefined behaviour, e.g. out of bounds access.
With the mitigation implemented, Secure code always performs extension
of values passed by Nonsecure code.
This addresses the vulnerability described in CVE-2024-0151.
Patches by Victor Campos.
---------
Co-authored-by: Victor Campos <victor.campos@arm.com>
|
|
|
|
|
|
|
|
|
|
Reverts the behavior introduced by 770393b while keeping the refactored
code.
Fixes a miscompile on AArch64, at the cost of a small regression on
AMDGPU.
#96146 opened to investigate the issue
|
|
There are only three actual uses of the section kind in MCSection:
isText(), XCOFF, and WebAssembly. Store isText() in the MCSection, and
store other info in the actual section variants where required.
ELF and COFF flags also encode all relevant information, so for these
two section variants, remove the SectionKind parameter entirely.
This allows to remove the string switch (which is unnecessary and
inaccurate) from createELFSectionImpl. This was introduced in
[D133456](https://reviews.llvm.org/D133456), but apparently, it was
never hit for non-writable sections anyway and the resulting kind was
never used.
|
|
This fixes https://github.com/llvm/llvm-project/issues/93309, and seems
to match how GNU ld handles this case.
|
|
Per P1975R0 an expression like static_cast<U[]>(...) defines the type of
the expression as U[1].
Fixes https://github.com/llvm/llvm-project/issues/62863
|
|
|
|
The initial check-in of compiler-rt/lib/nsan #94322 has a lot of style
issues. Fix them before the history becomes more useful.
Pull Request: https://github.com/llvm/llvm-project/pull/96142
|
|
Add missing includes.
|
|
|
|
Pattern scalarizes vector.gather operations and is incorrect for
scalable vectors.
|
|
The only special thing to do is to use fir.rebox_assumed_rank when
reboxing the target to properly set the POINTER attribute inside the
descriptor.
|
|
Just remove the TODO and add a test.
There is nothing special to do to deal with assumed-rank copy-in/out
after the previous copy-in/out API change in
https://github.com/llvm/llvm-project/pull/95822.
|
|
Remove support for shl constant expressions, as part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
|
|
Add same interfaces variants to the `Module::setModuleFlag` as the
`Module::addModuleFlag` has.
|
|
|
|
With simple template names the template arguments aren't embedded in the
DW_AT_name attribute of the type. The code in
FindDefinitionTypeForDWARFDeclContext was comparing the synthesized
template arguments on the leaf (most deeply nested) DIE, but was not
sufficient, as the difference get be at any level above that
(Foo<T>::Bar vs. Foo<U>::Bar). This patch makes sure we compare the
entire context.
As a drive-by I also remove the completely unnecessary
ConstStringification of the GetDIEClassTemplateParams result.
|
|
This patch reverts https://github.com/llvm/llvm-project/pull/81585 as
https://github.com/llvm/llvm-project/pull/78582 has been landed.
Now clang works well with reproducer
https://github.com/llvm/llvm-project/pull/79993#issuecomment-1936822679.
|
|
llvm::hash_value is not guaranteed to be deterministic. Use the
deterministic xxh3_64bits. A strong bit mixer isn't necessary. Use a
simpler one that works well with pointers.
|
|
|
|
Following of https://github.com/llvm/llvm-project/pull/92083
The motivation is still cutting of the unnecessary change in the
dependency chain. See the above link (recursively) for details.
After this patch, (and the above patch), we can already do something
pretty interesting. For example,
#### Motivation example
```
//--- m-partA.cppm
export module m:partA;
export inline int getA() {
return 43;
}
export class A {
public:
int getMem();
};
export template <typename T>
class ATempl {
public:
T getT();
};
//--- m-partA.v1.cppm
export module m:partA;
export inline int getA() {
return 43;
}
// Now we add a new declaration without introducing a new type.
// The consuming module which didn't use m:partA completely is expected to be
// not changed.
export inline int getA2() {
return 88;
}
export class A {
public:
int getMem();
// Now we add a new declaration without introducing a new type.
// The consuming module which didn't use m:partA completely is expected to be
// not changed.
int getMem2();
};
export template <typename T>
class ATempl {
public:
T getT();
// Add a new declaration without introducing a new type.
T getT2();
};
//--- m-partB.cppm
export module m:partB;
export inline int getB() {
return 430;
}
//--- m.cppm
export module m;
export import :partA;
export import :partB;
//--- useBOnly.cppm
export module useBOnly;
import m;
export inline int get() {
return getB();
}
```
In this example, module `m` exports two partitions `:partA` and
`:partB`. And a consumer `useBOnly` only consumes the entities from
`:partB`. So we don't hope the BMI of `useBOnly` changes if only
`:partA` changes. After this patch, we can make it if the change of
`:partA` doesn't introduce new types. (And we can get rid of this if we
make no-transitive-type-change).
As the example shows, when we change the implementation of `:partA` from
`m-partA.cppm` to `m-partA.v1.cppm`, we add new function declaration
`getA2()` at the global namespace, add a new member function `getMem2()`
to class `A` and add a new member function to `getT2()` to class
template `ATempl`. And since `:partA` is not used by `useBOnly`
completely, the BMI of `useBOnly` won't change after we made above
changes.
#### Design details
Method used in this patch is similar with
https://github.com/llvm/llvm-project/pull/92083 and
https://github.com/llvm/llvm-project/pull/86912. It extends the 32 bit
IdentifierID to 64 bits and use the higher 32 bits to store the module
file index. So that the encoding of the identifier won't get affected by
other modules.
#### Overhead
Similar with https://github.com/llvm/llvm-project/pull/92083 and
https://github.com/llvm/llvm-project/pull/86912. The change is only
expected to increase the size of the on-disk .pcm files and not affect
the compile-time performances. And from my experiment, the size of the
on-disk change only increase 1%+ and observe no compile-time impacts.
#### Future Plans
I'll try to do the same thing for type ids. IIRC, it won't change the
dependency graph if we add a new type in an unused units. I do think
this is a significant win. And this will be a pretty good answer to "why
modules are better than headers."
|
|
(#93481)
This change is a preliminary step to support trampolines on RISC-V. Trampolines are used by flang to implement obtaining the address of an internal program (i.e., a nested function in Fortran parlance).
In this change we lower `llvm.clear_cache` intrinsic on glibc targets to
`__riscv_flush_icache` which is what GCC is currently doing for Linux targets.
|
|
HeaderFileInfo
This patch is helpful to reduce 32 bits for HeaderFileInfo by combining
a uint32_t and pointer into a tagged pointer.
This is reviewed as part of
https://github.com/llvm/llvm-project/pull/92085 and required to be split
as a separate commit
|
|
|
|
For more info:
https://github.com/iree-org/iree/issues/17670#issuecomment-2167591878
|
|
(Background: See the comment:
https://github.com/llvm/llvm-project/pull/92083#issuecomment-2168121729)
It looks like the hash function for 64bits integers are not very good:
```
static unsigned getHashValue(const unsigned long long& Val) {
return (unsigned)(Val * 37ULL);
}
```
Since the result is truncated to 32 bits. It looks like the higher 32
bits won't contribute to the result. So that `0x1'00000001` will have
the the same results to `0x2'00000001`, `0x3'00000001`, ...
Then we may meet a lot collisions in such cases. I feel it should
generally good to include higher 32 bits for hashing functions.
Not sure who's the appropriate reviewer, adding some people by
impressions.
|
|
Non-root users may be able to set real-time scheduling policies. Don't
expect failure to set real-time scheduling policies based on UID.
Instead, check that if it fails, it is either due to missing privileges,
or unsupported parameters if the scheduling policy is not mandated by
POSIX.
Fixes #95564.
|
|
Otherwise LinkGraph::dump output could change
(llvm/test/ExecutionEngine/JITLink/x86-64/COFF_pdata_strip.s) when
llvm::hash_value(StringRef) changes.
|
|
Closes #95418.
|
|
|
|
Otherwise llvm/test/TableGen/GlobalISelCombinerEmitter/type-inference.td
could fail when llvm::hash_value(StringRef) changes.
Fix #66377
|
|
This change fixes the issue
https://github.com/llvm/llvm-project/issues/95977 due to commit
c0cba5198155dba246ddd5764f57595d9bbbddef inserting allocas after the
terminator op in the insertion block in the case where the block had
only a single operation, its terminator, in it. With this change, the
hoisted constant-sized allocas are placed at the front of the insertion
block, rather than right after the first operation in it.
|
|
Don't rely on the iteration order of DenseSet<StringRef>, which is not
guaranteed to be deterministic.
|
|
This diff contains the compiler-rt changes / preparations for nsan.
Test plan:
1. cd build/runtimes/runtimes-bins && ninja check-nsan
2. ninja check-all
|
|
Otherwise llvm/test/LTO/X86/cfi_jt_aliases.ll could fail when
DenseMapInfo<StringRef> changes.
|
|
--show-region-summary etc (#96016)
|