path: root/llvm/lib/Target/PowerPC
AgeCommit message (Collapse)AuthorFilesLines
3 days[PowerPC] Implement Elliptic Curve Cryptography (ECC) Instructions (#158362)Lei Huang1-0/+175
New instructions added:
* xxmulmul
* xxmulmulhiadd
* xxmulmulloadd
* xxssumudm
* xxssumudmc
* xxssumudmcext
* xsaddadduqm
* xsaddaddsuqm
* xsaddsubuqm
* xsaddsubsuqm
* xsmerge2t1uqm
* xsmerge2t2uqm
* xsmerge2t3uqm
* xsmerge3t1uqm
* xsrebase2t1uqm
* xsrebase2t2uqm
* xsrebase2t3uqm
* xsrebase2t4uqm
* xsrebase3t1uqm
* xsrebase3t2uqm
* xsrebase3t3uqm
7 days[NFC][PowerPC] Consolidate predicate definitions into PPC.td (#160579)Lei Huang7-71/+66
Consolidate predicate definitions into top level entry point for PowerPC target `PPC.td` and remove duplicate definitions for 32/64 bit sub-target checks.
8 days[NFC][PowerPC] Fix err in instruction class name for stxvp (#160764)Lei Huang2-7/+7
9 days[PowerPC][NFC] Simplify vector unpacked instr classes (#160564)Lei Huang1-35/+15
Apply suggestion as per review comment in https://github.com/llvm/llvm-project/pull/151004/files#r2240893226
9 days[PowerPC] Implement VSX Vector Integer Arithmetic Instructions (#158363)Lei Huang1-0/+41
New instructions added:
* xvadduwm - VSX Vector Add Unsigned Word Modulo
* xvadduhm - VSX Vector Add Unsigned Halfword Modulo
* xvsubuwm - VSX Vector Subtract Unsigned Word Modulo
* xvsubuhm - VSX Vector Subtract Unsigned Halfword Modulo
* xvmuluwm - VSX Vector Multiply Unsigned Word Modulo
* xvmuluhm - VSX Vector Multiply Unsigned Halfword Modulo
* xvmulhsw - VSX Vector Multiply High Signed Word
* xvmulhsh - VSX Vector Multiply High Signed Halfword
* xvmulhuw - VSX Vector Multiply High Unsigned Word
* xvmulhuh - VSX Vector Multiply High Unsigned Halfword
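The "Modulo" and "Multiply High" names above suggest lane-wise arithmetic that keeps either the low or the high half of each result. The following is a minimal sketch of that reading for 32-bit word lanes. It is inferred from the mnemonic descriptions only, has not been checked against the ISA text, and all helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # one 32-bit word lane


def to_signed32(x):
    """Reinterpret a 32-bit unsigned value as two's-complement signed."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def add_modulo(a, b):
    """Lane-wise add that keeps only the low 32 bits (wraps on overflow)."""
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def mul_modulo(a, b):
    """Lane-wise multiply that keeps only the low 32 bits of each product."""
    return [(x * y) & MASK32 for x, y in zip(a, b)]


def mul_high_unsigned(a, b):
    """Lane-wise multiply that keeps the high 32 bits of the unsigned 64-bit product."""
    return [((x * y) >> 32) & MASK32 for x, y in zip(a, b)]


def mul_high_signed(a, b):
    """Lane-wise multiply that keeps the high 32 bits of the signed 64-bit product."""
    return [((to_signed32(x) * to_signed32(y)) >> 32) & MASK32 for x, y in zip(a, b)]
```

The signed and unsigned high-half variants differ only in how the lane bits are interpreted before multiplying, which is why the sketch shares everything except `to_signed32`.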
10 days[CodeGen] Rename isReallyTriviallyReMaterializable [nfc]Philip Reames2-3/+3
.. to isReMaterializableImpl. The "Really" naming has always been awkward, and we're working towards removing the "Trivial" part now, so go ahead and remove both pieces in a single rename. Note that this doesn't change any aspect of the current implementation; we still "mostly" only return instructions which are trivial (meaning no virtual register uses), but some targets do lie about that today.
10 days[NFC][MC][CodeEmitterGen] Extract error reporting into a helper function ↵Rahul Joshi1-1/+0
(#159778) Extract error reporting code emitted by CodeEmitterGen into MCCodeEmitter static member functions. Additionally, remove unused ErrorHandling.h header from several files.
10 days[NFC][PowerPC] Move Anonymous Patterns up for consistency (#160322)Lei Huang1-18/+15
10 days[PowerPC] Implement AES Acceleration Instructions (#157725)Lei Huang1-0/+115
Implement AES Acceleration Instructions:
* xxaesencp
* xxaesdecp
* xxaesgenlkp
* xxgfmul128
11 days[MIR] Support save/restore points with independent sets of registers (#119358)Elizaveta Noskova1-2/+1
This patch adds the MIR parsing and serialization support for save and restore points with subsets of callee saved registers. That is, it syntactically allows a function to contain two or more distinct sub-regions in which distinct subsets of registers are spilled/filled as callee save. This is useful if e.g. one of the CSRs isn't modified in one of the sub-regions, but is in the other(s). Support for actually using this capability in code generation is still forthcoming.
This patch is the next logical step for multiple save/restore points support. All points are now stored in DenseMap from MBB to vector of CalleeSavedInfo.
Shrink-Wrap points split Part 4.
RFC: https://discourse.llvm.org/t/shrink-wrap-save-restore-points-splitting/83581
Part 1: https://github.com/llvm/llvm-project/pull/117862 (landed)
Part 2: https://github.com/llvm/llvm-project/pull/119355 (landed)
Part 3: https://github.com/llvm/llvm-project/pull/119357 (landed)
Part 5: https://github.com/llvm/llvm-project/pull/119359 (likely to be further split)
11 days[PowerPC] Avoid working on deleted node in ext bool trunc combine (#160050)Nikita Popov1-4/+6
This code was already creating HandleSDNodes to handle the case where a node gets replaced with an equivalent node. However, the code before the handles are created also performs RAUW operations, which can end up CSEing and deleting nodes. Fix this issue by moving the handle creation earlier. Fixes https://github.com/llvm/llvm-project/issues/160040.
11 days[PowerPC] Exploit xxeval instruction for operations of the form ternary(A,X, ↵Tony Varghese1-1/+88
XOR(B,C)) and ternary(A,X, OR(B,C)) (#157909)
Adds support for ternary equivalent operations of the form
- `ternary(A, X, xor(B,C))` where `X=[and(B,C)| nor(B,C)| or(B,C)| B | C]`.
- `ternary(A, X, or(B,C))` where `X = [and(B,C)| eqv(B,C)| not(B)| not(C)| nand(B,C)| B | C]`.

The following are the patterns involved and the imm values:
```
ternary(A, and(B,C), xor(B,C))    97
ternary(A, B, xor(B,C))           99
ternary(A, C, xor(B,C))          101
ternary(A, or(B,C), xor(B,C))    103
ternary(A, nor(B,C), xor(B,C))   104
ternary(A, and(B,C), or(B,C))    113
ternary(A, B, or(B,C))           115
ternary(A, C, or(B,C))           117
ternary(A, eqv(B,C), or(B,C))    121
ternary(A, not(C), or(B,C))      122
ternary(A, not(B), or(B,C))      124
ternary(A, nand(B,C), or(B,C))   126
```
e.g. `xxeval XT, XA, XB, XC, 97` performs the ternary operation: `XA ? and(XB, XC) : xor(XB, XC)` and places the result in `XT`.

This is the continuation of:
- [[PowerPC] Exploit xxeval instruction for ternary patterns - ternary(A, X, and(B,C))](https://github.com/llvm/llvm-project/pull/141733#top)
- [[PowerPC] Exploit xxeval instruction for operations of the form ternary(A,X,B) and ternary(A,X,C).](https://github.com/llvm/llvm-project/pull/152956#top)

---------
Co-authored-by: Tony Varghese <tony.varghese@ibm.com>
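The immediate values listed in this commit message can be reproduced by treating the 8-bit immediate as a truth table over the three inputs. The sketch below assumes the result for inputs (A, B, C) = (0, 0, 0) sits in the most significant bit and (1, 1, 1) in the least significant bit; that ordering is an assumption that happens to match every value quoted in the message. The helper names are hypothetical and are not LLVM APIs.

```python
def xxeval_imm(f):
    """Pack a 3-input boolean function into an 8-bit immediate.

    Inputs are enumerated from (0,0,0) to (1,1,1); the first result
    ends up in the most significant bit of the immediate.
    """
    imm = 0
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                imm = (imm << 1) | (f(a, b, c) & 1)
    return imm


def ternary(x, y):
    """Build `A ? x(B, C) : y(B, C)` as a 3-input function."""
    return lambda a, b, c: x(b, c) if a else y(b, c)


# Two-input building blocks used by the patterns in the commit message.
AND = lambda b, c: b & c
OR = lambda b, c: b | c
XOR = lambda b, c: b ^ c
NOR = lambda b, c: 1 - (b | c)
NAND = lambda b, c: 1 - (b & c)
EQV = lambda b, c: 1 - (b ^ c)
B = lambda b, c: b
C = lambda b, c: c
NOT_B = lambda b, c: 1 - b
NOT_C = lambda b, c: 1 - c

if __name__ == "__main__":
    print(xxeval_imm(ternary(AND, XOR)))   # pattern ternary(A, and(B,C), xor(B,C))
    print(xxeval_imm(ternary(NAND, OR)))   # pattern ternary(A, nand(B,C), or(B,C))
```

Enumerating the truth table this way is only a cross-check on the listed constants; it says nothing about how the TableGen patterns in the patch are written.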
14 daysPPC: Fix regression for 32-bit ppc with 64-bit support (#159893)Matt Arsenault1-1/+1
Fixes regression after e5bbaa9c8fb6e06dbcbd39404039cc5d31df4410. e5500 accidentally still had the 64bit feature applied instead of 64bit-support.
2025-09-19Revert "[PowerPC] clean unused PPC target feature FeatureBPERMD" (#159837)Sergei Barannikov1-1/+4
Reverts llvm/llvm-project#159782 The PR breaks multiple build bots and CI as well.
2025-09-19[PowerPC] clean unused PPC target feature FeatureBPERMD (#159782)zhijian lin1-4/+1
clean unused PPC target feature FeatureBPERMD.
2025-09-19PPC: Replace PointerLikeRegClass with RegClassByHwMode (#158777)Matt Arsenault4-24/+26
2025-09-19[PowerPC] Fix vector extend result types in BUILD_VECTOR lowering (#159398)RolandF771-1/+5
The result type of the vector extend intrinsics generated by the BUILD_VECTOR lowering code should match how they are actually defined. Currently the result type is defaulting to the operand type there. This can conflict with calls to the same intrinsic from other paths.
2025-09-19[PowerPC] using milicode call for strlen instead of lib call (#153600)zhijian lin2-0/+10
AIX has "millicode" routines, which are functions loaded at boot time into fixed addresses in kernel memory. This allows them to be customized for the processor. The __strlen routine is a millicode implementation; we use millicode for the strlen function instead of a library call to improve performance.
2025-09-19[LLVM][CodeGen] Update PPCFastISel::SelectRet for ConstantInt based vectors. ↵Paul Walker1-1/+2
(#159331) The current implementation assumes ConstantInt return values are scalar, which is not true when use-constant-int-for-fixed-length-splat is enabled.
2025-09-16[PowerPC] Add intrinsic definition for load and store with Right Length ↵Lei Huang1-0/+18
Left-justified (#148873)
2025-09-16PPC: Split 64bit target feature into 64bit and 64bit-support (#157206)Matt Arsenault3-24/+21
This was being used for 2 different purposes. The TargetMachine constructor prepends +64bit based on isPPC64 triples as a mode switch. The same feature name was also explicitly added to different processors, making it impossible to perform a pure feature check for whether 64-bit mode is enabled or not. i.e., checkFeatures("+64bit") would be true even for ppc32 triples. The comment in tablegen suggests it's relevant to track which processors support 64-bit mode independently of whether that's the active compile target, so replace that with a new feature.
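The ambiguity described above can be illustrated with a toy model. This is only a sketch: the CPU name `some-64bit-capable-cpu`, the dictionaries, and the `active_features` helper are hypothetical stand-ins, and only the feature names `64bit` and `64bit-support` come from the commit message.

```python
# Hypothetical per-CPU feature tables before and after the split.
CPU_FEATURES_BEFORE = {"some-64bit-capable-cpu": {"64bit"}}
CPU_FEATURES_AFTER = {"some-64bit-capable-cpu": {"64bit-support"}}


def active_features(arch, cpu, cpu_table):
    """Combine the CPU's own features with the mode switch implied by the triple."""
    features = set(cpu_table.get(cpu, set()))
    if arch == "powerpc64":
        # Models the "+64bit" that is prepended for 64-bit triples.
        features.add("64bit")
    return features


def is_64bit_mode(arch, cpu, cpu_table):
    """A pure feature check, in the spirit of checkFeatures("+64bit")."""
    return "64bit" in active_features(arch, cpu, cpu_table)


if __name__ == "__main__":
    cpu = "some-64bit-capable-cpu"
    # Before the split, a 32-bit triple still appears to be in 64-bit mode.
    print(is_64bit_mode("powerpc", cpu, CPU_FEATURES_BEFORE))
    # After the split, the mode check depends on the triple alone.
    print(is_64bit_mode("powerpc", cpu, CPU_FEATURES_AFTER))
```

With separate names, "this processor can run 64-bit code" and "this compilation targets 64-bit mode" become independent questions.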
2025-09-16PPC: Move definitions of predicates with features (#157058)Matt Arsenault2-36/+37
The way this was previously structured does not allow access to the predicates inside of PPCRegisterInfo
2025-09-15[NFC][DecoderEmitter] Code cleanup in `DecoderEmitter::emitTable` (#158014)Rahul Joshi1-0/+1
Several code cleanup changes in code to emit decoder tables:
- Start comments on each line at a fixed column for readability.
- Combine repeated code to decode and emit ULEB128 into a single function.
- Add helper `getDecoderOpName` to print decoder op.
- Print Filter/CheckField/predicate index values with those opcodes.
2025-09-12CodeGen: Remove MachineFunction argument from getRegClass (#158188)Matt Arsenault1-3/+2
This is a low level utility to parse the MCInstrInfo and should not depend on the state of the function.
2025-09-12CodeGen: Remove MachineFunction argument from getPointerRegClass (#158185)Matt Arsenault3-5/+4
getPointerRegClass is a layering violation. Its primary purpose is to determine how to interpret an MCInstrDesc's operands RegClass fields. This should be context free, and only depend on the subtarget. The model of this is also wrong, since this should be an instruction / operand specific property, not a global pointer class. Remove the function argument to help stage removal of this hook and avoid introducing any new obstacles to replacing it. The remaining uses of the function were to get the subtarget, which TargetRegisterInfo already belongs to. A few targets needed new subtarget derived properties copied there.
2025-09-11[llvm] Move data layout string computation to TargetParser (#157612)Reid Kleckner1-54/+2
Clang and other frontends generally need the LLVM data layout string in order to generate LLVM IR modules for LLVM. MLIR clients often need it as well, since MLIR users often lower to LLVM IR. Before this change, the LLVM datalayout string was computed in the LLVM${TGT}CodeGen library in the relevant TargetMachine subclass. However, none of the logic for computing the data layout string requires any details of code generation. Clients who want to avoid duplicating this information were forced to link in LLVMCodeGen and all registered targets, leading to bloated binaries. This happened in PR #145899, which measurably increased binary size for some of our users. By moving this information to the TargetParser library, we can delete the duplicate datalayout strings in Clang, and retain the ability to generate IR for unregistered targets. This is intended to be a very mechanical LLVM-only change, but there is an immediately obvious follow-up to clang, which will be prepared separately. The vast majority of data layouts are computable with two inputs: the triple and the "ABI name". There is only one exception, NVPTX, which has a cl::opt to enable short device pointers. I invented a "shortptr" ABI name to pass this option through the target independent interface. Everything else fits. Mips is a bit awkward because it uses a special MipsABIInfo abstraction, which includes members with codegen-like concepts like ABI physical registers that can't live in TargetParser. I think the string logic of looking for "n32" "n64" etc is reasonable to duplicate. We have plenty of other minor duplication to preserve layering. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com> Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-09-09[PowerPC] Support `-fpatchable-function-entry` on PPC64LE (#151569)Maryam Moghadas1-5/+7
This patch enables `-fpatchable-function-entry` on PPC64 little-endian Linux. It is mutually exclusive with existing XRay instrumentation on this target.
2025-09-08PPC: Use StringRef for subtarget constructor arguments (#157409)Matt Arsenault2-5/+3
2025-09-08PPC: Remove TargetTriple from PPCSubtarget (#157404)Matt Arsenault2-17/+12
This already exists in the base class.
2025-09-08CodeGen: Pass SubtargetInfo to TargetGenInstrInfo constructors (#157337)Matt Arsenault1-1/+1
This will make it possible for tablegen to make subtarget dependent decisions without adding new arguments to every target. --------- Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-09-06PPC: Fix missing const on TargetInstrInfo's subtarget reference (#157201)Matt Arsenault2-3/+3
2025-09-05[PowerPC][NFC] Apply clang-format to PPCInstrFuture.td (#157135)Lei Huang1-35/+36
2025-09-04[PowerPC][NFC] Update TableGen range punctuator with '...' (#156893)Lei Huang6-786/+786
The '-' punctuator was deprecated via: https://github.com/llvm/llvm-project/commit/196e6f9f18933ed33eee39a1c9350ccce6b18e2c
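A mechanical change of this kind rewrites bit ranges such as `Inst{6-10}` into `Inst{6...10}`. The sketch below shows one way such a rewrite could be scripted. It is not the tool used for the commit, the `RT` field in the example is illustrative, and the regular expression only handles a single numeric range inside braces.

```python
import re

# Matches a single "{N-M}" bit range; lists such as "{0-3, 7}" are not handled.
_DASH_RANGE = re.compile(r"\{(\d+)-(\d+)\}")


def modernize_ranges(text):
    """Rewrite "{N-M}" as "{N...M}" and leave everything else untouched."""
    return _DASH_RANGE.sub(r"{\1...\2}", text)


if __name__ == "__main__":
    print(modernize_ranges("let Inst{6-10} = RT;"))
```

Restricting the match to digits directly inside braces keeps the rewrite away from hyphens in comments and names, at the cost of missing multi-piece ranges.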
2025-09-04[PowerPC] Remove non-existent operand of CP_COPY instruction (#153867)Sergei Barannikov1-1/+1
The operand is not encoded, decoded, or printed and would break MCInst verification if we had one. Extracted from #156358, where the extra operand causes DecoderEmitter to emit an error about an operand with a missing encoding.
2025-09-03[NFC] Apply clang-format to PPCInstrFutureMMA.td (#156749)Lei Huang1-294/+296
2025-09-03[PowerPC][NFC] Refactor PPCInstrFutureMMA.td to combine sections (#151194)Lei Huang1-136/+141
Combine same predicate sections into one and move some mma instructions into the proper section.
2025-09-03[PowerPC] Remove an unnecessary cast (NFC) (#156599)Kazu Hirata1-1/+1
getSExtValue already returns int64_t.
2025-09-02[PowerPC] Implement vector uncompress instructions (#150702)Lei Huang1-0/+31
Implement the set of vector uncompress instructions:
* vucmprhh
* vucmprlh
* vucmprhn
* vucmprln
* vucmprhb
* vucmprlb
2025-09-02[PowerPC] Implement vector unpack instructions (#151004)Lei Huang1-0/+78
Implement the set of vector unpack instructions:
* vupkhsntob
* vupklsntob
* vupkint4tobf16
* vupkint8tobf16
* vupkint4tofp32
* vupkint8tofp32
2025-09-01[PowerPC] Merge vsr(vsro(input, byte_shift), bit_shift) to vsrq(input, ↵Tony Varghese5-1/+23
res_bit_shift) (#154388)
This change implements a patfrag based pattern matching ~dag combiner~ that combines consecutive `VSRO (Vector Shift Right Octet)` and `VSR (Vector Shift Right)` instructions into a single `VSRQ (Vector Shift Right Quadword)` instruction on Power10+ processors.

Vector right shift operations like `vec_srl(vec_sro(input, byte_shift), bit_shift)` generate two separate instructions `(VSRO + VSR)` when they could be optimised into a single `VSRQ` instruction that performs the equivalent operation.
```
vsr(vsro (input, vsro_byte_shift), vsr_bit_shift)
to
vsrq(input, vsrq_bit_shift)
where vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift
```
Note:
```
vsro : Vector Shift Right by Octet VX-form
- vsro VRT, VRA, VRB
- The contents of VSR[VRA+32] are shifted right by the number of bytes specified in bits 121:124 of VSR[VRB+32].
- Bytes shifted out of byte 15 are lost.
- Zeros are supplied to the vacated bytes on the left.
- The result is placed into VSR[VRT+32].

vsr : Vector Shift Right VX-form
- vsr VRT, VRA, VRB
- The contents of VSR[VRA+32] are shifted right by the number of bits specified in bits 125:127 of VSR[VRB+32]. 3 bits.
- Bits shifted out of bit 127 are lost.
- Zeros are supplied to the vacated bits on the left.
- The result is placed into VSR[VRT+32], except if, for any byte element in VSR[VRB+32], the low-order 3 bits are not equal to the shift amount, then VSR[VRT+32] is undefined.

vsrq : Vector Shift Right Quadword VX-form
- vsrq VRT,VRA,VRB
- Let src1 be the contents of VSR[VRA+32]. Let src2 be the contents of VSR[VRB+32].
- src1 is shifted right by the number of bits specified in the low-order 7 bits of src2.
- Bits shifted out the least-significant bit are lost.
- Zeros are supplied to the vacated bits on the left.
- The result is placed into VSR[VRT+32].
```

---------
Co-authored-by: Tony Varghese <tony.varghese@ibm.com>
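The shift-amount identity in this commit message can be checked with a small model. The sketch below treats a 128-bit vector register as a Python integer and takes the shift amounts as plain numbers; it ignores where those amounts live inside VRB and the undefined-result case noted for `vsr`. The function names mirror the mnemonics but are hypothetical helpers, not LLVM or ISA definitions.

```python
MASK128 = (1 << 128) - 1


def vsro(value, byte_shift):
    """Shift a 128-bit value right by whole bytes (4-bit amount, 0..15)."""
    return (value & MASK128) >> (8 * (byte_shift & 0xF))


def vsr(value, bit_shift):
    """Shift a 128-bit value right by bits (3-bit amount, 0..7)."""
    return (value & MASK128) >> (bit_shift & 0x7)


def vsrq(value, bit_shift):
    """Shift a 128-bit value right by bits (7-bit amount, 0..127)."""
    return (value & MASK128) >> (bit_shift & 0x7F)


def combined_shift(byte_shift, bit_shift):
    """vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift."""
    return byte_shift * 8 + bit_shift


if __name__ == "__main__":
    x = 0x0123456789ABCDEF_FEDCBA9876543210
    two_step = vsr(vsro(x, 3), 5)
    one_step = vsrq(x, combined_shift(3, 5))
    print(two_step == one_step)
```

Because the byte amount is at most 15 and the bit amount at most 7, the combined amount never exceeds 127 and always fits in the 7 bits that `vsrq` reads.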
2025-09-01[PowerPC] Exploit xxeval instruction for operations of the form ↵Tony Varghese1-0/+59
ternary(A,X,B) and ternary(A,X,C). (#152956)
Adds support for ternary equivalent operations of the form `ternary(A, X, B)` and `ternary(A, X, C)` where `X=[and(B,C)| nor(B,C)| eqv(B,C)| nand(B,C)]`.

The following are the patterns involved and the imm values:

| **Operation** | **Immediate Value** |
|----------------------------|---------------------|
| ternary(A, and(B,C), B) | 49 |
| ternary(A, nor(B,C), B) | 56 |
| ternary(A, eqv(B,C), B) | 57 |
| ternary(A, nand(B,C), B) | 62 |
| | |
| ternary(A, and(B,C), C) | 81 |
| ternary(A, nor(B,C), C) | 88 |
| ternary(A, eqv(B,C), C) | 89 |
| ternary(A, nand(B,C), C) | 94 |

e.g. `xxeval XT, XA, XB, XC, 49` performs `XA ? and(XB, XC) : XB` and places the result in `XT`.

This is the continuation of [[PowerPC] Exploit xxeval instruction for ternary patterns - ternary(A, X, and(B,C))](https://github.com/llvm/llvm-project/pull/141733#top).

---------
Co-authored-by: Tony Varghese <tony.varghese@ibm.com>
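The immediates in this commit message can also be read in the other direction: under the assumption that the 8-bit immediate is a truth table ordered from (A, B, C) = (0, 0, 0) in the most significant bit to (1, 1, 1) in the least significant bit, the high nibble describes the result when A is clear and the low nibble the result when A is set. The sketch below decodes an immediate on that basis. The table of nibble names and the `describe` helper are hypothetical.

```python
# Truth tables over (B, C) = 00, 01, 10, 11, most significant bit first.
NIBBLE_NAMES = {
    0b0001: "and(B,C)",
    0b0011: "B",
    0b0101: "C",
    0b0110: "xor(B,C)",
    0b0111: "or(B,C)",
    0b1000: "nor(B,C)",
    0b1001: "eqv(B,C)",
    0b1010: "not(C)",
    0b1100: "not(B)",
    0b1110: "nand(B,C)",
}


def describe(imm):
    """Render an 8-bit immediate as ternary(A, X, Y), i.e. A ? X : Y."""
    when_a_clear = (imm >> 4) & 0xF  # rows with A = 0
    when_a_set = imm & 0xF           # rows with A = 1
    return f"ternary(A, {NIBBLE_NAMES[when_a_set]}, {NIBBLE_NAMES[when_a_clear]})"


if __name__ == "__main__":
    for imm in (49, 56, 57, 62, 81, 88, 89, 94):
        print(imm, describe(imm))
```

Seen this way, every `ternary(A, X, B)` entry shares the high nibble `0b0011` and every `ternary(A, X, C)` entry shares `0b0101`, which is why the two groups of immediates differ by exactly 32.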
2025-08-31[TableGen][Decoder] Remove special case of single sub-op dag (#156175)Sergei Barannikov1-16/+16
If a custom operand has MIOperandInfo with >= 2 sub-operands, it is required that either the operand or its sub-operands have a decoder method (depending on usage). Require this for single sub-operand operands as well, since there is no good reason not to. There are no changes in the generated files.
2025-08-30[TableGen] Require complex operands in InstAlias to be specified as DAGs ↵Sergei Barannikov1-16/+16
(#136411)
Currently, complex operands of an instruction are flattened in the resulting DAG of `InstAlias`. This change makes it required to specify complex operands in `InstAlias` as sub-DAGs:
```
InstAlias<"foo $rd, $rs1, $rs2", (Inst RC:$rd, (ComplexOp RC:$rs1, GR0, 42), SimpleOp:$rs2)>;
```
instead of
```
InstAlias<"foo $rd, $rs1, $rs2", (Inst RC:$rd, RC:$rs1, GR0, 42, SimpleOp:$rs2)>;
```
The advantages of the new syntax are improved readability and more robust type checking, although it is a bit more verbose.
2025-08-30[TableGen][CodeGen] Remove DisableEncoding field of Instruction class (#156098)Sergei Barannikov8-204/+169
I believe it became no-op with the removal of the "positionally encoded operands" functionality (b87dc356 is the last commit in the series). There are no changes in the generated files.
2025-08-27[PowerPC] Add DMR and WACC COPY support (#149129)Maryam Moghadas6-164/+230
This patch updates PPCInstrInfo::copyPhysReg to support DMR and WACC register classes and extends the PPCVSXCopy pass to handle specific WACC copy patterns.
2025-08-25[PowerPC] Indicate that PPC32PICGOT clobbers LR (#154654)Josh Stone1-1/+2
This pseudo-instruction emits a local `bl` writing LR, so that must be saved and restored for the function to return to the right place. If not, we'll return to the inline `.long` that the `bl` stepped over. This fixes the `SIGILL` seen in rayon-rs/rayon#1268.
2025-08-25[PowerPC] Add DMF builtins for build and disassemble (#153097)RolandF772-0/+42
Add support for the PPC Dense Math builtins mma_build_dmr and mma_disassemble_dmr.
2025-08-23RuntimeLibcalls: Add entries for stackprotector globals (#154930)Matt Arsenault2-22/+0
Add entries for __stack_chk_guard, __ssp_canary_word, __security_cookie, and __guard_local. As far as I can tell these are all just different names for the same shaped functionality on different systems. These aren't really functions, but special global variable names. They should probably be treated the same way; all the same contexts that need to know about emittable function names also need to know about this. This avoids a special case check in IRSymtab. This isn't a complete change, there's a lot more cleanup which should be done. The stack protector configuration system is a complete mess. There are multiple overlapping controls, used in 3 different places. Some of the target control implementations overlap with conditions used in the emission points, and some use correlated but not identical conditions in different contexts. i.e. useLoadStackGuardNode, getIRStackGuard, getSSPStackGuardCheck and insertSSPDeclarations are all used in inconsistent ways so I don't know if I've tracked the intention of the system correctly. The PowerPC test change is a bug fix on linux. Previously the manual conditions were based around !isOSOpenBSD, which is not the condition where __stack_chk_guard is used. Now getSDagStackGuard returns the proper global reference, resulting in LOAD_STACK_GUARD getting a MachineMemOperand which allows scheduling.
2025-08-22[llvm] Remove unused includes of SmallSet.h (NFC) (#154893)Kazu Hirata3-3/+0
We just replaced SmallSet<T *, N> with SmallPtrSet<T *, N>, bypassing the redirection found in SmallSet.h. With that, we no longer need to include SmallSet.h in many files.
2025-08-21[NFC][MC][Decoder] Extract fixed pieces of decoder code into new header file ↵Rahul Joshi1-0/+1
(#154802) Extract fixed functions generated by decoder emitter into a new MCDecoder.h header.