path: root/llvm/lib/Target/ARM
Columns: Age, Commit message, Author, Files, Lines
40 hours[ARM] Remove `UnsafeFPMath` uses (#151275)paperchalice1-2/+17
Try to remove `UnsafeFPMath` uses in the ARM backend. These global flags block some improvements like https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797. Remove them incrementally.
2 days[ARM] Generate build-attributes more correctly in the presence of intrinsic ↵David Green1-8/+11
declarations. (#160749) This code doesn't work very well, but this makes it work when intrinsic definitions are present. It now discounts function declarations from the set of attributes it looks at. The code would have worked better before 0ab5b5b8581d9f2951575f7245824e6e4fc57dec when module-level attributes could provide the information used to construct build-attributes.
3 days[NFC][LLVM] Pass/return SMLoc by value instead of const reference (#160797)Rahul Joshi1-3/+3
SMLoc itself encapsulates just a pointer, so there is no need to pass or return it by reference.
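The reasoning above can be sketched with a small stand-in type. `SourceLoc` and `advanceLoc` are hypothetical names invented for this illustration, not `llvm::SMLoc` or any LLVM function; the sketch only shows why a handle that wraps a single pointer costs nothing extra to pass by value.

```cpp
#include <type_traits>

// Hypothetical stand-in for a source-location handle; NOT llvm::SMLoc.
class SourceLoc {
  const char *Ptr = nullptr;

public:
  SourceLoc() = default;
  explicit SourceLoc(const char *P) : Ptr(P) {}
  const char *getPointer() const { return Ptr; }
  bool isValid() const { return Ptr != nullptr; }
};

// The handle is exactly one pointer wide and trivially copyable, so a
// by-value copy is as cheap as copying the pointer it wraps.
static_assert(sizeof(SourceLoc) == sizeof(const char *),
              "handle should be pointer-sized");
static_assert(std::is_trivially_copyable<SourceLoc>::value,
              "handle should be trivially copyable");

// Hypothetical helper: takes and returns the handle by value.
inline SourceLoc advanceLoc(SourceLoc Loc, unsigned N) {
  return SourceLoc(Loc.getPointer() + N);
}
```

Passing such a type by `const &` would add an indirection without saving a copy, which is the motivation the commit message gives.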
4 days[ARM] Improve comment on the 'J' inline asm modifier. (#160712)Simon Tatham1-3/+3
An inline asm constraint "Jr", in AArch32, means that if the input value is a compile-time constant in the range -4095 to +4095, then it can be inserted into the assembly language as an immediate operand, and otherwise it will be placed in a register.

The comment in the Arm backend said "It is not clear what this constraint is intended for". I believe the answer is that that range of immediate values are the ones you can use in a LDR or STR instruction. So it's suitable for cases like this:

    asm("str %0,[%1,%2]" : : "r"(data), "r"(base), "Jr"(offset) : "memory");

in the same way that the "Ir" constraint is suitable for the immediate in a data-processing instruction such as ADD or EOR.
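The range quoted in this message can be written down as a one-line predicate. `fitsJConstraint` is a hypothetical helper made up for this sketch, not part of LLVM; it only encodes the -4095 to +4095 bound stated above.

```cpp
#include <cstdint>

// Hypothetical helper, not an LLVM API: true when Offset lies in the
// -4095..+4095 range that the commit message gives for the "J" constraint.
constexpr bool fitsJConstraint(std::int64_t Offset) {
  return Offset >= -4095 && Offset <= 4095;
}

// Values at the edge of the range are accepted, the next ones are not.
static_assert(fitsJConstraint(4095), "upper bound is inclusive");
static_assert(!fitsJConstraint(4096), "4096 is out of range");
static_assert(fitsJConstraint(-4095), "lower bound is inclusive");
```

With a "Jr" operand, a constant for which this predicate holds can be emitted as an immediate, and anything else falls back to a register, as the message describes.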
4 days[ARM] Remove `UnsafeFPMath` uses in code generation part (#160801)paperchalice2-6/+5
Factor out from #151275. Remove all UnsafeFPMath uses except the part related to ABI tags.
4 days[llvm] Add `vfs::FileSystem` to `PassBuilder` (#160188)Jan Svoboda1-0/+2
Some LLVM passes need access to the filesystem to read configuration files and similar. In some places, this is achieved by grabbing the VFS from `PGOOptions`, but some passes don't have access to these and resort to just calling `vfs::getRealFileSystem()`. This PR allows setting the VFS directly on `PassBuilder` that's able to pass it down to all passes that need it.
5 days[TargetLowering][ExpandABD] Prefer selects over usubo if we do the same for ↵AZero132-2/+2
ucmp (#159889) Same deal we use for determining ucmp vs scmp. Using selects on platforms that like selects is better than using usubo. Rename function to be more general fitting this new description.
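To make the two alternatives concrete, here is a scalar sketch of an unsigned absolute difference computed both ways. The function names are hypothetical and the code only illustrates the arithmetic; it is not the actual SelectionDAG expansion.

```cpp
#include <cstdint>

// Select form: compare once, then pick one of the two differences.
inline std::uint32_t abduSelect(std::uint32_t A, std::uint32_t B) {
  return A > B ? A - B : B - A;
}

// Overflow form: subtract with wraparound, then negate the result when the
// subtraction borrowed (the overflow bit a usubo-style node would report).
inline std::uint32_t abduUsubo(std::uint32_t A, std::uint32_t B) {
  std::uint32_t Diff = A - B;                    // wraps modulo 2^32
  std::uint32_t Mask = A < B ? 0xFFFFFFFFu : 0u; // all ones on borrow
  return (Diff ^ Mask) - Mask;                   // conditional negate
}
```

Both functions return the same value for every input; which one is cheaper depends on whether the target handles selects or overflow flags better, which is the choice the commit adjusts.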
5 days[ARM] Consider denormal mode in `ARMSubtarget` (#160456)paperchalice3-10/+19
Factor out from #151275. Add denormal mode to subtarget.
6 days[CodeGen] Rename isReallyTriviallyReMaterializable [nfc]Philip Reames2-3/+3
.. to isReMaterializableImpl. The "Really" naming has always been awkward, and we're working towards removing the "Trivial" part now, so go ahead and remove both pieces in a single rename. Note that this doesn't change any aspect of the current implementation; we still "mostly" only return instructions which are trivial (meaning no virtual register uses), but some targets do lie about that today.
6 days[NFC][MC][CodeEmitterGen] Extract error reporting into a helper function ↵Rahul Joshi1-1/+0
(#159778) Extract error reporting code emitted by CodeEmitterGen into MCCodeEmitter static member functions. Additionally, remove unused ErrorHandling.h header from several files.
6 days[NFC][MC][ARM] Reorder decoder functions N/N (#158767)Rahul Joshi1-65/+59
Move the `DecodeT2AddrModeImm8` and `DecodeT2Imm8` definitions before their first use and eliminate the last remaining forward declarations of decode functions. Work on https://github.com/llvm/llvm-project/issues/156560 : Reorder ARM disassembler decode functions to eliminate forward declarations
6 days[ARM] Auto-decode s_cc_out operand (#159956)Sergei Barannikov2-28/+14
The operand can be decoded automatically, without the need for post-decoding instruction modification. Part of #156540.
10 days[ARM] Replace ABS and tABS machine nodes with custom lowering (#156717)AZero138-151/+60
Just do a custom lowering instead. Also copy paste the cmov-neg fold to prevent regressions in nabs.
10 days[ARM] Verify that disassembled instruction is correct (#157360)Sergei Barannikov1-41/+27
This change adds basic `MCInst` verification (checks the number of operands) and fixes detected bugs.
* `RFE*` instructions have only one operand, but `DecodeRFEInstruction` added two.
* `DecodeMVEModImmInstruction` and `DecodeMVEVCMP` added a `vpred` operand, but this is what `AddThumbPredicate` normally does. This resulted in an extra `vpred` operand.
* `DecodeMVEVADCInstruction` added an extra immediate operand.
* `getARMInstruction` added a `pred` operand to instructions that don't have one (via `DecodePredicateOperand`).
* `AddThumb1SBit` appended an extra register operand to instructions that don't modify CPSR (such as `tBL`).
* Instructions in `NEONDup` namespace have `pred` operand that the generated code successfully decodes. The operand was added once again by `getARMInstruction`/`getThumbInstruction` via `AddThumbPredicate`.

Functional changes extracted from #156540.
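The kind of check described above, comparing the number of decoded operands with what the instruction description declares, can be sketched as follows. `InstDesc`, `DecodedInst`, and `hasExpectedOperandCount` are hypothetical stand-ins invented for this illustration; they are not the LLVM MC classes or the code from the patch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; NOT the LLVM MC classes.
struct InstDesc {
  unsigned NumOperands; // operands the instruction is declared to have
};

struct DecodedInst {
  unsigned Opcode;            // index into the description table
  std::vector<long> Operands; // operands the decoder actually appended
};

// Returns true when the decoded instruction carries exactly as many
// operands as its description declares.
inline bool hasExpectedOperandCount(const DecodedInst &Inst,
                                    const std::vector<InstDesc> &Table) {
  if (Inst.Opcode >= Table.size())
    return false; // unknown opcode, nothing to compare against
  return Inst.Operands.size() == Table[Inst.Opcode].NumOperands;
}
```

A decoder that appends one operand too many, as in the cases listed above, would fail this check because the operand vector is longer than the declared count.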
10 days[CodeGen][NewPM] Port `ReachingDefAnalysis` to new pass manager. (#159572)Mikhail Gudim3-63/+61
In this commit:
(1) Added new pass manager support for `ReachingDefAnalysis`.
(2) Added printer pass.
(3) Made the old pass manager use `ReachingDefInfoWrapperPass`.
2025-09-12CodeGen: Remove MachineFunction argument from getRegClass (#158188)Matt Arsenault5-12/+9
This is a low level utility to parse the MCInstrInfo and should not depend on the state of the function.
2025-09-12CodeGen: Remove MachineFunction argument from getPointerRegClass (#158185)Matt Arsenault6-15/+15
getPointerRegClass is a layering violation. Its primary purpose is to determine how to interpret an MCInstrDesc's operands RegClass fields. This should be context free, and only depend on the subtarget. The model of this is also wrong, since this should be an instruction / operand specific property, not a global pointer class. Remove the function argument to help stage removal of this hook and avoid introducing any new obstacles to replacing it. The remaining uses of the function were to get the subtarget, which TargetRegisterInfo already belongs to. A few targets needed new subtarget derived properties copied there.
2025-09-11[llvm] Move data layout string computation to TargetParser (#157612)Reid Kleckner2-65/+9
Clang and other frontends generally need the LLVM data layout string in order to generate LLVM IR modules for LLVM. MLIR clients often need it as well, since MLIR users often lower to LLVM IR. Before this change, the LLVM datalayout string was computed in the LLVM${TGT}CodeGen library in the relevant TargetMachine subclass. However, none of the logic for computing the data layout string requires any details of code generation. Clients who want to avoid duplicating this information were forced to link in LLVMCodeGen and all registered targets, leading to bloated binaries. This happened in PR #145899, which measurably increased binary size for some of our users.

By moving this information to the TargetParser library, we can delete the duplicate datalayout strings in Clang, and retain the ability to generate IR for unregistered targets.

This is intended to be a very mechanical LLVM-only change, but there is an immediately obvious follow-up to clang, which will be prepared separately.

The vast majority of data layouts are computable with two inputs: the triple and the "ABI name". There is only one exception, NVPTX, which has a cl::opt to enable short device pointers. I invented a "shortptr" ABI name to pass this option through the target independent interface. Everything else fits. Mips is a bit awkward because it uses a special MipsABIInfo abstraction, which includes members with codegen-like concepts like ABI physical registers that can't live in TargetParser. I think the string logic of looking for "n32" "n64" etc is reasonable to duplicate. We have plenty of other minor duplication to preserve layering.

---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-09-11[ARM] Allow s constraints on half (#157860)Nikita Popov1-3/+5
Fix a regression from https://github.com/llvm/llvm-project/pull/147559.
2025-09-08[NFC][MC][ARM] Reorder decoder functions 5/N (#156920)Rahul Joshi1-293/+263
Move all decode functions (except `DecodeT2AddrModeImm8`) that had forward declarations around so that they are defined before their first use and do not need a forward declaration. Work on https://github.com/llvm/llvm-project/issues/156560 : Reorder ARM disassembler decode functions to eliminate forward declarations
2025-09-08CodeGen: Pass SubtargetInfo to TargetGenInstrInfo constructors (#157337)Matt Arsenault1-3/+3
This will make it possible for tablegen to make subtarget dependent decisions without adding new arguments to every target.

---------
Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-09-06[SelectionDAG][ARM] Propagate fast math flags in visitBRCOND (#156647)paperchalice1-4/+7
Factor out from #151275.
2025-09-05[DAG][ARM] canCreateUndefOrPoisonForTargetNode - ARMISD VORRIMM\VBICIMM ↵woruyu2-0/+17
nodes can't create poison/undef (#156831) ### Summary This PR resolves https://github.com/llvm/llvm-project/issues/156640
2025-09-04[TableGen][Decoder] Decode operands with zero width or all bits known (#156358)Sergei Barannikov1-1/+2
There are two classes of operands that DecoderEmitter cannot currently handle:
1. Operands that do not participate in instruction encoding.
2. Operands whose encoding contains only 1s and 0s.

Because of this, targets developed various workarounds. Some targets insert missing operands after an instruction has been (incompletely) decoded, others take into account the missing operands when printing the instruction. Some targets do neither of that and fail to correctly disassemble some instructions.

This patch makes it possible to decode both classes of operands and allows removing existing workarounds.

For the case of an operand with no contribution to instruction encoding, one should now add a `bits<0> OpName` field to the instruction encoding record. This will make DecoderEmitter generate a call to the decoder function specified by the operand's DecoderMethod. The function has a signature different from the usual one and looks like this:

```
static DecodeStatus DecodeImm42Operand(MCInst &Inst,
                                       const MCDisassembler *Decoder) {
  Inst.addOperand(MCOperand::createImm(42));
  return DecodeStatus::Success;
}
```

Notably, encoding bits are not passed to it (since there are none).

There is nothing special about the second case, the operand bits are passed as usual. The difference is that before this change, the function was not called if all the bits of the operand were known (no '?' in the operand encoding).

There are two options controlling the behavior. Passing an option enables the old behavior. They exist to allow smooth transition to the new behavior. They are temporary (yeah, I know) and will be removed once all targets migrate, possibly giving some more time to downstream targets.

Subsequent patches in the stack enable the new behavior on some in-tree targets.
2025-09-04[NFC][MC][ARM] Reorder decoder functions 4/N (#156690)Rahul Joshi1-528/+39
2025-09-04[DAG][ARM] ComputeKnownBitsForTargetNode - add handling for ARMISD ↵woruyu1-0/+23
VORRIMM\VBICIMM nodes (#149494) ### Summary This PR resolves https://github.com/llvm/llvm-project/issues/147179
2025-09-04[CodeGen] Remove ExpandInlineAsm hook (#156617)Nikita Popov2-33/+0
This hook replaces inline asm with LLVM intrinsics. It was intended to match inline assembly implementations of bswap in libc headers and replace them with more optimizable implementations. At this point, it has outlived its usefulness (see https://github.com/llvm/llvm-project/issues/156571#issuecomment-3247638412), as libc implementations no longer use inline assembly for this purpose. Additionally, it breaks the "black box" property of inline assembly, which some languages like Rust would like to guarantee. Fixes https://github.com/llvm/llvm-project/issues/156571.
2025-09-03[NFC][LLVM] Use `INITIALIZE_PASS` instead of `INITIALIZE_PASS_BEGIN/END` ↵Rahul Joshi1-2/+1
(#156212)
2025-09-03[NFC][MC][ARM] Rearrange decoder functions 3/N (#156240)Rahul Joshi1-256/+256
2025-09-02[NFC] RuntimeLibcalls: Prefix the impls with 'Impl_' (#153850)Daniel Paoliello1-32/+32
As noted in #153256, TableGen is generating reserved names for RuntimeLibcalls, which resulted in a build failure for Arm64EC since `vcruntime.h` defines `__security_check_cookie` as a macro. To avoid using reserved names, all impl names will now be prefixed with `Impl_`. `NumLibcallImpls` was lifted out as a `constexpr size_t` instead of being an enum field. While I was churning the dependent code, I also removed the TODO to move the impl enum into its own namespace and use an `enum class`: I experimented with using an `enum class` and adding a namespace, but we decided it was too verbose so it was dropped.
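The naming problem described above can be reproduced in miniature. Everything in this sketch is hypothetical: the macro stands in for a platform header that defines a libcall name as a macro, and the enumerators and count are invented rather than taken from the generated RuntimeLibcalls code.

```cpp
#include <cstddef>

// Hypothetical stand-in for a platform header that defines a libcall name
// as an object-like macro.
#define security_check_cookie security_check_cookie_arm64ec

// An enumerator spelled exactly `security_check_cookie` would be rewritten
// by the macro above. The prefixed spelling is a different preprocessor
// token, so it is left untouched.
enum LibcallImpl : unsigned short {
  Impl_Unsupported = 0,
  Impl_security_check_cookie,
  Impl_memcpy,
};

// The count kept as a standalone constant instead of a trailing enumerator.
constexpr std::size_t NumLibcallImpls = 3;
```

Because macro replacement works on whole identifier tokens, `Impl_security_check_cookie` never matches the macro name, which is why a fixed prefix sidesteps the collision.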
2025-09-01[LV] Bundle sub reductions into VPExpressionRecipe (#147255)Sam Tebbs2-4/+8
This PR bundles sub reductions into the VPExpressionRecipe class and adjusts the cost functions to take the negation into account.

Stacked PRs:
1. https://github.com/llvm/llvm-project/pull/147026
2. -> https://github.com/llvm/llvm-project/pull/147255
3. https://github.com/llvm/llvm-project/pull/147302
4. https://github.com/llvm/llvm-project/pull/147513
2025-08-31[ARM] Use t2LDRLIT_ga_pcrel for loading stack guards with no-movt in PIC ↵Amara Emerson1-2/+5
mode. (#156208) When using no-movt we don't use the pcrel version of the literal load. This change also unifies logic with the ARM version of this function as well, which has:

```
if (!Subtarget.useMovt() || ForceELFGOTPIC) {
  // For ELF non-PIC, use GOT PIC code sequence as well because R_ARM_GOT_ABS
  // does not have assembler support.
  if (TM.isPositionIndependent() || ForceELFGOTPIC)
    expandLoadStackGuardBase(MI, ARM::LDRLIT_ga_pcrel, ARM::LDRi12);
  else
    expandLoadStackGuardBase(MI, ARM::LDRLIT_ga_abs, ARM::LDRi12);
  return;
}
```

rdar://138334512
2025-08-31[ARM] Simplify LowerCMP (NFC) (#156198)AZero131-12/+4
Pass the opcode directly.
2025-08-31[NFC][ARM][MC] Rearrange decoder functions 2/N (#155464)Rahul Joshi1-254/+254
Move some of the non-static-decode functions to the end of the file. Note: moving `ARMDisassembler::AddThumbPredicate` the same way causes the diff to be non-trivial, so not doing that here.
2025-08-31[TableGen][Decoder] Remove special case of single sub-op dag (#156175)Sergei Barannikov1-2/+0
If a custom operand has MIOperandInfo with >= 2 sub-operands, it is required that either the operand or its sub-operands have a decoder method (depending on usage). Require this for single sub-operand operands as well, since there is no good reason not to. There are no changes in the generated files.
2025-08-30[ARM] Remove an unnecessary cast (NFC) (#156203)Kazu Hirata1-1/+1
getInstrInfo() already returns const ARMBaseInstrInfo *.
2025-08-30[TableGen] Require complex operands in InstAlias to be specified as DAGs ↵Sergei Barannikov5-19/+20
(#136411) Currently, complex operands of an instruction are flattened in the resulting DAG of `InstAlias`. This change makes it required to specify complex operands in `InstAlias` as sub-DAGs:

```
InstAlias<"foo $rd, $rs1, $rs2",
          (Inst RC:$rd, (ComplexOp RC:$rs1, GR0, 42), SimpleOp:$rs2)>;
```

instead of

```
InstAlias<"foo $rd, $rs1, $rs2",
          (Inst RC:$rd, RC:$rs1, GR0, 42, SimpleOp:$rs2)>;
```

The advantages of the new syntax are improved readability and more robust type checking, although it is a bit more verbose.
2025-08-27[ARM] Remove an unnecessary cast (NFC) (#155552)Kazu Hirata1-1/+1
getSUnit() already returns SUnit *.
2025-08-26[IA][RISCV] Recognize interleaving stores that could lower to strided ↵Min-Yih Hsu2-4/+6
segmented stores (#154647) This is a sibling patch to #151612: passing gap masks to the renewal TLI hooks for lowering interleaved stores that use shufflevector to do the interleaving.
2025-08-26[NFC][MC][ARM] Rearrange decode functions in ARM disassembler (#154988)Rahul Joshi1-36/+36
Move `tryAddingSymbolicOperand` and `tryAddingPcLoadReferenceComment` to before including the generated disassembler code. This is in preparation for rearranging the decoder functions to eliminate forward declarations.
2025-08-25[ARM] Set isCheapToSpeculateCtlz as true for hasV5TOps and no Thumb 1 (#154848)AZero133-19/+3
This is so that we don't expand to include unneeded 0 checks. Also fix the logic error in LegalizerInfo so it is NOT legal on Thumb1 in Fast-ISEL. Finally, remove the README entry regarding this issue.
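The "unneeded 0 checks" can be illustrated with a scalar model. `clz32` below models a count-leading-zeros operation that is defined for a zero input and returns 32 there, which is how the ARM CLZ instruction behaves; both function names are hypothetical and the code is a sketch of the idea, not of the generated code.

```cpp
#include <cstdint>

// Models a count-leading-zeros operation that is well defined for zero
// and returns the bit width (32) in that case.
inline unsigned clz32(std::uint32_t X) {
  unsigned N = 0;
  for (std::uint32_t Bit = 0x80000000u; Bit != 0 && (X & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// Guarded form: the shape produced when the count may not be speculated
// and the zero case has to be tested separately first.
inline unsigned clz32Guarded(std::uint32_t X) {
  return X == 0 ? 32u : clz32(X);
}
```

Since the unguarded operation already yields 32 for zero, the extra comparison in the guarded form changes nothing, and marking the operation as cheap to speculate lets the compiler drop it.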
2025-08-25[ARM] Remove an unnecessary cast (NFC) (#155206)Kazu Hirata1-1/+1
getType() already returns Type *.
2025-08-24Remove SDNPSideEffect from ARMcallseq_start and ARMcallseq_end (NFC) (#153248)AZero131-3/+2
A call sequence does not have any unmodeled side effects in and of itself. ADJCALLSTACKUP and ADJCALLSTACKDOWN do, however, so the attribute should be there.
2025-08-23RuntimeLibcalls: Add entries for stackprotector globals (#154930)Matt Arsenault2-15/+0
Add entries for __stack_chk_guard, __ssp_canary_word, __security_cookie, and __guard_local. As far as I can tell these are all just different names for the same shaped functionality on different systems. These aren't really functions, but special global variable names. They should probably be treated the same way; all the same contexts that need to know about emittable function names also need to know about this. This avoids a special case check in IRSymtab. This isn't a complete change, there's a lot more cleanup which should be done. The stack protector configuration system is a complete mess. There are multiple overlapping controls, used in 3 different places. Some of the target control implementations overlap with conditions used in the emission points, and some use correlated but not identical conditions in different contexts. i.e. useLoadStackGuardNode, getIRStackGuard, getSSPStackGuardCheck and insertSSPDeclarations are all used in inconsistent ways so I don't know if I've tracked the intention of the system correctly. The PowerPC test change is a bug fix on Linux. Previously the manual conditions were based around !isOSOpenBSD, which is not the condition where __stack_chk_guard is used. Now getSDagStackGuard returns the proper global reference, resulting in LOAD_STACK_GUARD getting a MachineMemOperand which allows scheduling.
2025-08-22[llvm] Remove unused includes of SmallSet.h (NFC) (#154893)Kazu Hirata1-1/+0
We just replaced SmallSet<T *, N> with SmallPtrSet<T *, N>, bypassing the redirection found in SmallSet.h. With that, we no longer need to include SmallSet.h in many files.
2025-08-22ARM: Remove unneeded ARM::fixup_arm_thumb_bl special caseFangrui Song1-11/+0
This is a weird special case added in 2015, simplifying an even older condition. It is a no-op for ELF (isExternal is always false) and seems unneeded for non-ELF.
2025-08-21MC: Avoid MCSymbol::isExportedFangrui Song1-7/+9
This bit is only used by COFF/MachO. The upcoming change will move isExported/setExported to MCSymbolCOFF/MCSymbolMachO.
2025-08-21[NFC][MC][Decoder] Extract fixed pieces of decoder code into new header file ↵Rahul Joshi1-0/+2
(#154802) Extract fixed functions generated by decoder emitter into a new MCDecoder.h header.
2025-08-21[NFC][MC][ARM] Fix formatting for `ITStatus` and `VPTStatus` (#154815)Rahul Joshi1-80/+66
2025-08-21[ARM][Disassembler] Advance IT State when instruction is unknown (#154531)Peter Smith1-0/+4
When an instruction that the disassembler does not recognize is in an IT block, we should still advance the IT state otherwise the IT state spills over into the next recognized instruction, which is incorrect. We want to avoid disassembly like:

    it eq
    <unknown>        // Often because disassembler has insufficient target info.
    addeq r0,r0,r0   // eq spills over into add.

Fixes #150569
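The rule in this message, consume one IT slot per instruction whether or not it decodes, can be sketched with a small tracker. `ITTracker` and `render` are hypothetical names invented here; this is not LLVM's `ITStatus` class or the code from the patch.

```cpp
#include <deque>
#include <initializer_list>
#include <string>

// Hypothetical tracker for illustration; NOT LLVM's ITStatus class.
class ITTracker {
  std::deque<std::string> Pending; // one condition per remaining block slot

public:
  // An `it eq` would queue {"eq"}; longer blocks queue more conditions.
  void beginBlock(std::initializer_list<std::string> Conds) {
    Pending.assign(Conds);
  }
  bool inBlock() const { return !Pending.empty(); }

  // Consume one slot and return its condition ("" outside a block).
  std::string advance() {
    if (Pending.empty())
      return "";
    std::string Cond = Pending.front();
    Pending.pop_front();
    return Cond;
  }
};

// Renders one instruction. The slot is consumed before checking whether
// the instruction was recognized, so an unknown one cannot leak its
// condition into the next instruction.
inline std::string render(ITTracker &IT, bool Recognized,
                          const std::string &Mnemonic) {
  std::string Cond = IT.advance();
  return Recognized ? Mnemonic + Cond : "<unknown>";
}
```

With this ordering, the sequence from the message renders as `<unknown>` followed by a plain `add`, instead of the stale `eq` being attached to the add.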