Age | Commit message (Collapse) | Author | Files | Lines |
|
Try to remove `UnsafeFPMath` uses in the Arm backend. These global flags
block some improvements like
https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797.
Remove them incrementally.
|
|
declarations. (#160749)
This code doesn't work very well, but this makes it work when intrinsic
definitions are present. It now discounts function declarations from
the set of attributes it looks at.
The code would have worked better before
0ab5b5b8581d9f2951575f7245824e6e4fc57dec when module-level attributes
could provide the information used to construct build-attributes.
|
|
SMLoc itself encapsulates just a pointer, so there is no need to pass or
return it by reference.
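
The reasoning above can be sketched with a small stand-in type. `Loc` and
`advance` below are hypothetical names for illustration, not the real
`SMLoc` interface; the only assumption taken from the message is that the
class wraps a single pointer.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a source-location type that wraps one pointer.
// It is not the real llvm::SMLoc; the names below are for illustration only.
class Loc {
  const char *Ptr = nullptr;

public:
  Loc() = default;
  explicit Loc(const char *P) : Ptr(P) {}
  bool isValid() const { return Ptr != nullptr; }
  const char *getPointer() const { return Ptr; }
};

// Copying the wrapper costs the same as copying the pointer it holds.
static_assert(sizeof(Loc) == sizeof(const char *), "wrapper adds no storage");
static_assert(std::is_trivially_copyable<Loc>::value, "cheap to copy");

// Take and return by value; a const reference would only add an indirection.
inline Loc advance(Loc L, unsigned N) { return Loc(L.getPointer() + N); }
```

With a type like this, passing by value typically moves one register, while
passing by reference passes an address that then has to be dereferenced.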
|
|
An inline asm constraint "Jr", in AArch32, means that if the input value
is a compile-time constant in the range -4095 to +4095, then it can be
inserted into the assembly language as an immediate operand, and
otherwise it will be placed in a register.
The comment in the Arm backend said "It is not clear what this
constraint is intended for". I believe the answer is that this range of
immediate values is the one you can use in an LDR or STR instruction.
So it's suitable for cases like this:
    asm("str %0,[%1,%2]" : : "r"(data), "r"(base), "Jr"(offset) : "memory");
in the same way that the "Ir" constraint is suitable for the immediate
in a data-processing instruction such as ADD or EOR.
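
The immediate-or-register choice described above can be modeled in plain
C++. This is only a sketch of the rule as stated in this message, not the
backend's code; `fitsJrImmediate`, `OperandKind` and `selectJrOperand` are
hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the backend's actual code: returns true when a
// constant is in the range this message gives for "Jr", i.e. -4095 to
// +4095 inclusive.
constexpr bool fitsJrImmediate(int64_t Value) {
  return Value >= -4095 && Value <= 4095;
}

// Operand kind chosen for the "Jr" operand in this model.
enum class OperandKind { Immediate, Register };

// Constants in range become an immediate operand; everything else,
// including values not known at compile time, goes in a register.
constexpr OperandKind selectJrOperand(bool IsConstant, int64_t Value) {
  return IsConstant && fitsJrImmediate(Value) ? OperandKind::Immediate
                                              : OperandKind::Register;
}
```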
|
|
Factor out from #151275
Remove all UnsafeFPMath uses except the part related to ABI tags.
|
|
Some LLVM passes need access to the filesystem to read configuration
files and similar. In some places, this is achieved by grabbing the VFS
from `PGOOptions`, but some passes don't have access to these and resort
to just calling `vfs::getRealFileSystem()`. This PR allows setting the
VFS directly on `PassBuilder` that's able to pass it down to all passes
that need it.
|
|
ucmp (#159889)
Same deal we use for determining ucmp vs scmp.
Using selects on platforms that like selects is better than using usubo.
Rename the function to be more general, fitting this new description.
|
|
Factor out from #151275.
Add denormal mode to subtarget.
|
|
.. to isReMaterializableImpl. The "Really" naming has always been
awkward, and we're working towards removing the "Trivial" part now,
so go ahead and remove both pieces in a single rename.
Note that this doesn't change any aspect of the current
implementation; we still "mostly" only return instructions which
are trivial (meaning no virtual register uses), but some targets
do lie about that today.
|
|
(#159778)
Extract error reporting code emitted by CodeEmitterGen into
MCCodeEmitter static member functions.
Additionally, remove unused ErrorHandling.h header from several files.
|
|
Move the `DecodeT2AddrModeImm8` and `DecodeT2Imm8` definitions before their
first use and eliminate the last remaining forward declarations of
decode functions.
Work on https://github.com/llvm/llvm-project/issues/156560 : Reorder ARM
disassembler decode functions to eliminate forward declarations
|
|
The operand can be decoded automatically, without the need for
post-decoding instruction modification.
Part of #156540.
|
|
Just do a custom lowering instead.
Also copy paste the cmov-neg fold to prevent regressions in nabs.
|
|
This change adds basic `MCInst` verification (checks the number of
operands) and fixes detected bugs.
* `RFE*` instructions have only one operand, but `DecodeRFEInstruction`
added two.
* `DecodeMVEModImmInstruction` and `DecodeMVEVCMP` added a `vpred`
operand, but this is what `AddThumbPredicate` normally does. This
resulted in an extra `vpred` operand.
* `DecodeMVEVADCInstruction` added an extra immediate operand.
* `getARMInstruction` added a `pred` operand to instructions that don't
have one (via `DecodePredicateOperand`).
* `AddThumb1SBit` appended an extra register operand to instructions
that don't modify CPSR (such as `tBL`).
* Instructions in `NEONDup` namespace have `pred` operand that the
generated code successfully decodes. The operand was added once again by
`getARMInstruction`/`getThumbInstruction` via `AddThumbPredicate`.
Functional changes extracted from #156540.
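
As a rough illustration of the operand-count check, here is a
self-contained sketch. `InstrDesc`, `DecodedInst` and
`hasExpectedOperandCount` are hypothetical, simplified stand-ins; the real
`MCInstrDesc` and `MCInst` carry much more information, and the actual
verification code may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for an instruction description and a
// decoded instruction; the real MCInstrDesc and MCInst carry much more.
struct InstrDesc {
  unsigned Opcode;
  std::size_t NumOperands; // operand count the instruction is defined with
};

struct DecodedInst {
  unsigned Opcode;
  std::vector<long> Operands; // operands appended by the decoder
};

// Basic verification: the decoder must have produced exactly as many
// operands as the description declares, no more and no fewer.
inline bool hasExpectedOperandCount(const DecodedInst &Inst,
                                    const InstrDesc &Desc) {
  return Inst.Opcode == Desc.Opcode &&
         Inst.Operands.size() == Desc.NumOperands;
}
```

A decoder that appends one operand too many, as in the `RFE*` case listed
above, would fail this check right away.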
|
|
In this commit:
(1) Added new pass manager support for `ReachingDefAnalysis`.
(2) Added printer pass.
(3) Make old pass manager use `ReachingDefInfoWrapperPass`
|
|
This is a low level utility to parse the MCInstrInfo and should
not depend on the state of the function.
|
|
getPointerRegClass is a layering violation. Its primary purpose
is to determine how to interpret an MCInstrDesc's operands RegClass
fields. This should be context free, and only depend on the subtarget.
The model of this is also wrong, since this should be an
instruction / operand specific property, not a global pointer class.
Remove the function argument to help stage removal of this hook
and avoid introducing any new obstacles to replacing it.
The remaining uses of the function were to get the subtarget, which
TargetRegisterInfo already belongs to. A few targets needed new
subtarget derived properties copied there.
|
|
Clang and other frontends generally need the LLVM data layout string in
order to generate LLVM IR modules for LLVM. MLIR clients often need it
as well, since MLIR users often lower to LLVM IR.
Before this change, the LLVM datalayout string was computed in the
LLVM${TGT}CodeGen library in the relevant TargetMachine subclass.
However, none of the logic for computing the data layout string requires
any details of code generation. Clients who want to avoid duplicating
this information were forced to link in LLVMCodeGen and all registered
targets, leading to bloated binaries. This happened in PR #145899,
which measurably increased binary size for some of our users.
By moving this information to the TargetParser library, we
can delete the duplicate datalayout strings in Clang, and retain the
ability to generate IR for unregistered targets.
This is intended to be a very mechanical LLVM-only change, but there is
an immediately obvious follow-up to clang, which will be prepared
separately.
The vast majority of data layouts are computable with two inputs: the
triple and the "ABI name". There is only one exception, NVPTX, which has
a cl::opt to enable short device pointers. I invented a "shortptr" ABI
name to pass this option through the target independent interface.
Everything else fits. Mips is a bit awkward because it uses a special
MipsABIInfo abstraction, which includes members with codegen-like
concepts like ABI physical registers that can't live in TargetParser. I
think the string logic of looking for "n32" "n64" etc is reasonable to
duplicate. We have plenty of other minor duplication to preserve
layering.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
|
|
Fix a regression from https://github.com/llvm/llvm-project/pull/147559.
|
|
Move all decode functions (except `DecodeT2AddrModeImm8`) that had
forward declarations around so that they are defined before their first
use and do not need a forward declaration.
Work on https://github.com/llvm/llvm-project/issues/156560 : Reorder ARM
disassembler decode functions to eliminate forward declarations
|
|
This will make it possible for tablegen to make subtarget
dependent decisions without adding new arguments to every
target.
---------
Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
|
|
Factor out from #151275.
|
|
nodes can't create poison/undef (#156831)
### Summary
This PR resolves https://github.com/llvm/llvm-project/issues/156640
|
|
There are two classes of operands that DecoderEmitter cannot currently
handle:
1. Operands that do not participate in instruction encoding.
2. Operands whose encoding contains only 1s and 0s.
Because of this, targets developed various workarounds. Some targets
insert missing operands after an instruction has been (incompletely)
decoded, others take the missing operands into account when printing the
instruction. Some targets do neither and fail to correctly
disassemble some instructions.
This patch makes it possible to decode both classes of operands and
allows removing the existing workarounds.
For an operand that makes no contribution to the instruction encoding,
one should now add a `bits<0> OpName` field to the instruction encoding
record. This will make DecoderEmitter generate a call to the decoder
function specified by the operand's DecoderMethod. The function has a
signature different from the usual one and looks like this:
```
static DecodeStatus DecodeImm42Operand(MCInst &Inst, const MCDisassembler *Decoder) {
  Inst.addOperand(MCOperand::createImm(42));
  return DecodeStatus::Success;
}
```
Notably, encoding bits are not passed to it (since there are none).
There is nothing special about the second case, the operand bits are
passed as usual. The difference is that before this change, the function
was not called if all the bits of the operand were known (no '?' in the
operand encoding).
There are two options controlling the behavior. Passing an option
enables the old behavior. They exist to allow smooth transition to the
new behavior. They are temporary (yeah, I know) and will be removed once
all targets migrate, possibly giving some more time to downstream
targets.
Subsequent patches in the stack enable the new behavior on some in-tree
targets.
|
|
|
|
VORRIMM\VBICIMM nodes (#149494)
### Summary
This PR resolves https://github.com/llvm/llvm-project/issues/147179
|
|
This hook replaces inline asm with LLVM intrinsics. It was intended to
match inline assembly implementations of bswap in libc headers and
replace them with more optimizable implementations.
At this point, it has outlived its usefulness (see
https://github.com/llvm/llvm-project/issues/156571#issuecomment-3247638412),
as libc implementations no longer use inline assembly for this purpose.
Additionally, it breaks the "black box" property of inline assembly,
which some languages like Rust would like to guarantee.
Fixes https://github.com/llvm/llvm-project/issues/156571.
|
|
(#156212)
|
|
|
|
As noted in #153256, TableGen is generating reserved names for
RuntimeLibcalls, which resulted in a build failure for Arm64EC since
`vcruntime.h` defines `__security_check_cookie` as a macro.
To avoid using reserved names, all impl names will now be prefixed with
`Impl_`.
`NumLibcallImpls` was lifted out as a `constexpr size_t` instead of
being an enum field.
While I was churning the dependent code, I also removed the TODO to move
the impl enum into its own namespace and use an `enum class`: I
experimented with using an `enum class` and adding a namespace, but we
decided it was too verbose so it was dropped.
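
A minimal sketch of why the prefix helps, under stated assumptions: the
macro expansion and the enumerator list below are hypothetical, and only
the `Impl_` prefix and the `NumLibcallImpls` name come from this message.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for a header that defines a libcall name as a macro, as this
// message says vcruntime.h does; the expansion here is hypothetical.
#define __security_check_cookie __security_check_cookie_arm64ec

// Prefixed enumerators are distinct tokens, so the macro never touches them.
// An unprefixed enumerator with the macro's name would be rewritten instead.
enum LibcallImpl : unsigned short {
  Impl_Unsupported = 0,
  Impl___security_check_cookie,
  Impl_memcpy,
};

// Kept outside the enum so the count is not itself an enumerator.
constexpr std::size_t NumLibcallImpls = 3;
```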
|
|
This PR bundles sub reductions into the VPExpressionRecipe class and
adjusts the cost functions to take the negation into account.
Stacked PRs:
1. https://github.com/llvm/llvm-project/pull/147026
2. -> https://github.com/llvm/llvm-project/pull/147255
3. https://github.com/llvm/llvm-project/pull/147302
4. https://github.com/llvm/llvm-project/pull/147513
|
|
mode. (#156208)
When using no-movt we don't use the pcrel version of the literal load.
This change also unifies the logic with the ARM version of this function,
which has:
```
if (!Subtarget.useMovt() || ForceELFGOTPIC) {
  // For ELF non-PIC, use GOT PIC code sequence as well because R_ARM_GOT_ABS
  // does not have assembler support.
  if (TM.isPositionIndependent() || ForceELFGOTPIC)
    expandLoadStackGuardBase(MI, ARM::LDRLIT_ga_pcrel, ARM::LDRi12);
  else
    expandLoadStackGuardBase(MI, ARM::LDRLIT_ga_abs, ARM::LDRi12);
  return;
}
```
rdar://138334512
|
|
Pass the opcode directly.
|
|
Move some of the non-static-decode functions to the end of the file.
Note: moving `ARMDisassembler::AddThumbPredicate` the same way causes
the diff to be non-trivial, so not doing that here.
|
|
If a custom operand has MIOperandInfo with >= 2 sub-operands, it is
required that either the operand or its sub-operands have a decoder
method (depending on usage). Require this for single sub-operand
operands as well, since there is no good reason not to.
There are no changes in the generated files.
|
|
getInstrInfo() already returns const ARMBaseInstrInfo *.
|
|
(#136411)
Currently, complex operands of an instruction are flattened in the resulting DAG of `InstAlias`.
This change makes it required to specify complex operands in `InstAlias` as sub-DAGs:
```
InstAlias<"foo $rd, $rs1, $rs2", (Inst RC:$rd, (ComplexOp RC:$rs1, GR0, 42), SimpleOp:$rs2)>;
```
instead of
```
InstAlias<"foo $rd, $rs1, $rs2", (Inst RC:$rd, RC:$rs1, GR0, 42, SimpleOp:$rs2)>;
```
The advantages of the new syntax are improved readability and more robust type checking, although it is a bit more verbose.
|
|
getSUnit() already returns SUnit *.
|
|
segmented stores (#154647)
This is a sibling patch to #151612: passing gap masks to the renewal TLI
hooks for lowering interleaved stores that use shufflevector to do the
interleaving.
|
|
Move `tryAddingSymbolicOperand` and `tryAddingPcLoadReferenceComment` to
before including the generated disassembler code. This is in preparation
for rearranging the decoder functions to eliminate forward declarations.
|
|
This is so that we don't expand to include unneeded 0 checks.
Also fix the logic error in LegalizerInfo so it is NOT legal on Thumb1
in Fast-ISEL.
Finally, remove the README entry regarding this issue.
|
|
getType() already returns Type *.
|
|
A call sequence does not have any unmodeled side effects in and of itself.
ADJCALLSTACKUP and ADJCALLSTACKDOWN do, however, so the attribute should
be there.
|
|
Add entries for __stack_chk_guard, __ssp_canary_word, __security_cookie,
and __guard_local. As far as I can tell these are all just different
names for the same shaped functionality on different systems.
These aren't really functions, but special global variable names. They
should probably be treated the same way; all the same contexts that
need to know about emittable function names also need to know about
this. This avoids a special case check in IRSymtab.
This isn't a complete change; there's a lot more cleanup which
should be done. The stack protector configuration system is a
complete mess. There are multiple overlapping controls, used in
3 different places. Some of the target control implementations overlap
with conditions used in the emission points, and some use correlated
but not identical conditions in different contexts.
i.e. useLoadStackGuardNode, getIRStackGuard, getSSPStackGuardCheck and
insertSSPDeclarations are all used in inconsistent ways so I don't know
if I've tracked the intention of the system correctly.
The PowerPC test change is a bug fix on Linux. Previously the manual
conditions were based around !isOSOpenBSD, which is not the condition
where __stack_chk_guard is used. Now getSDagStackGuard returns the
proper global reference, resulting in LOAD_STACK_GUARD getting a
MachineMemOperand which allows scheduling.
|
|
We just replaced SmallSet<T *, N> with SmallPtrSet<T *, N>, bypassing
the redirection found in SmallSet.h. With that, we no longer need to
include SmallSet.h in many files.
|
|
This is a weird special case added in 2015, simplifying an even older
condition. It is a no-op for ELF (isExternal is always false) and seems
unneeded for non-ELF.
|
|
This bit is only used by COFF/MachO. The upcoming change will move
isExported/setExported to MCSymbolCOFF/MCSymbolMachO.
|
|
(#154802)
Extract fixed functions generated by decoder emitter into a new
MCDecoder.h header.
|
|
|
|
When an instruction that the disassembler does not recognize is in an IT
block, we should still advance the IT state; otherwise the IT state
spills over into the next recognized instruction, which is incorrect.
We want to avoid disassembly like:
    it eq
    <unknown>       // Often because disassembler has insufficient target info.
    addeq r0,r0,r0  // eq spills over into add.
Fixes #150569
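
The intended behavior can be sketched with a toy model of IT-block
tracking. `ITTracker` and `decodeSlot` are hypothetical names and do not
mirror the real ARM disassembler classes; the point is only that an
unrecognized instruction still consumes its slot.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>

// Hypothetical model of IT-block tracking; the names are illustrative and
// do not mirror the real ARM disassembler classes.
class ITTracker {
  std::deque<std::string> Pending; // conditions for upcoming instructions

public:
  // Called after decoding an IT instruction: queue one condition per slot.
  void enterITBlock(std::deque<std::string> Conds) {
    Pending = std::move(Conds);
  }

  // Condition that applies to the next instruction, or "" outside a block.
  std::string currentCondition() const {
    return Pending.empty() ? std::string() : Pending.front();
  }

  // Must be called once per instruction slot, recognized or not.
  void advance() {
    if (!Pending.empty())
      Pending.pop_front();
  }
};

// Decode one slot. An unrecognized encoding still consumes its IT slot, so
// the condition cannot spill over into the next recognized instruction.
inline std::string decodeSlot(ITTracker &IT, bool Recognized,
                              const std::string &Mnemonic) {
  std::string Cond = IT.currentCondition();
  IT.advance();
  return Recognized ? Mnemonic + Cond : std::string("<unknown>");
}
```

In this model, an `it eq` followed by an unknown instruction and then an
`add` yields a plain `add`, because the `eq` slot was used up by the
unknown instruction.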
|