path: root/llvm/lib/CodeGen/CMakeLists.txt
Age | Commit message | Author | Files | Lines
2024-03-06 | Restore "Implement convergence control in MIR using SelectionDAG (#71785)" | Sameer Sahasrabuddhe | 1 | -0/+1
This restores commit c7fdd8c11e54585dc9d15d63de9742067e0506b9. Previously reverted in f010b1bef4dda2c7082cbb41dbabf1f149cce306.
LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows:
1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics.
2. Model token values as untyped virtual registers in MIR.
The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR.
The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.
2024-03-04 | Revert "Restore "Implement convergence control in MIR using SelectionDAG (#71785)"" | Mitch Phillips | 1 | -1/+0
This reverts commit c7fdd8c11e54585dc9d15d63de9742067e0506b9.
Reason: Broke the sanitizer buildbots. See the comments at https://github.com/llvm/llvm-project/pull/71785 for more information.
2024-03-04 | Restore "Implement convergence control in MIR using SelectionDAG (#71785)" | Sameer Sahasrabuddhe | 1 | -0/+1
Original commit 79889734b940356ab3381423c93ae06f22e772c9. Previously reverted in commit a2afcd5721869d1d03c8146bae3885b3385ba15e.
LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows:
1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics.
2. Model token values as untyped virtual registers in MIR.
The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR.
The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.
2024-02-26 | [CodeGen] [ARM] Make RISC-V Init Undef Pass Target Independent and add support for the ARM Architecture. (#77770) | Jack Styles | 1 | -0/+1
When using Greedy Register Allocation, early-clobber values are sometimes ignored and assigned the same register. This is illegal behaviour for these instructions. To get around this, using pseudo instructions for early-clobber registers gives them a definition and allows Greedy to assign them to a different register. This then meets the ARM Architecture Reference Manual and matches the defined behaviour.
This patch takes the existing RISC-V patch and makes it target independent, then adds support for the ARM Architecture. Doing this will ensure early-clobber constraints are followed when using the ARM Architecture. Making the pass target independent also opens up the possibility of adding support for other architectures in the future.
2024-02-21 | Revert "Implement convergence control in MIR using SelectionDAG (#71785)" | Sameer Sahasrabuddhe | 1 | -1/+0
This reverts commit 79889734b940356ab3381423c93ae06f22e772c9. Encountered multiple buildbot failures.
2024-02-21 | Implement convergence control in MIR using SelectionDAG (#71785) | Sameer Sahasrabuddhe | 1 | -0/+1
LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows:
1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics.
2. Model token values as untyped virtual registers in MIR.
The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR.
The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.
2024-01-25 | [llvm] Move CodeGenTypes library to its own directory (#79444) | Nico Weber | 1 | -13/+0
Finally addresses https://reviews.llvm.org/D148769#4311232 :) No behavior change.
2024-01-25 | [CodeGen] Port FreeMachineFunction to new pass manager (#79421) | paperchalice | 1 | -0/+1
This pass should be the last machine function pass in the pipeline. Also skip `PI.runAfterPass(*P, MF, PassPA);` to avoid accessing a dangling reference.
2024-01-24 | [CodeGen][Passes] Move `CodeGenPassBuilder.h` to Passes (#79242) | paperchalice | 1 | -1/+0
`CodeGenPassBuilder` is not very tightly coupled to CodeGen, and it may need to reference methods in the pass builder in the future, so move `CodeGenPassBuilder.h` to Passes.
2023-10-27 | [BasicBlockSections] Apply path cloning with -basic-block-sections. (#68860) | Rahman Lavaee | 1 | -0/+1
https://github.com/llvm/llvm-project/commit/28b912687900bc0a67cd61c374fce296b09963c4 introduced the path cloning format in the basic-block-sections profile. This PR validates and applies path clonings.
A path cloning is valid if all of these conditions hold:
1. All bb ids in the path are mapped to existing blocks.
2. Each two consecutive bb ids in the path have a successor relationship in the CFG.
3. The path does not include a block with indirect branches, except possibly as the last block.
Applying a path cloning involves cloning all blocks in the path (except the first one) and setting up their branches. Once all clonings are applied, the cluster information is used to guide block layout in the modified function.
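The three validity conditions listed in this commit message can be sketched as a standalone check. This is only an illustration, not the pass's actual code: `Block`, `Cfg`, and `isValidPathCloning` are hypothetical names, and the CFG is modeled as a plain map from bb id to block rather than with LLVM's machine-level types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a machine basic block; not an LLVM type.
struct Block {
  std::set<unsigned> Successors; // bb ids of the CFG successors
  bool HasIndirectBranch = false;
};

// Hypothetical CFG representation: bb id -> block.
using Cfg = std::map<unsigned, Block>;

// Returns true if Path satisfies the three conditions from the commit message.
bool isValidPathCloning(const Cfg &G, const std::vector<unsigned> &Path) {
  for (std::size_t I = 0; I < Path.size(); ++I) {
    auto It = G.find(Path[I]);
    // 1. Every bb id in the path must map to an existing block.
    if (It == G.end())
      return false;
    const bool IsLast = (I + 1 == Path.size());
    // 3. A block with indirect branches is only allowed as the last block.
    if (It->second.HasIndirectBranch && !IsLast)
      return false;
    // 2. Consecutive bb ids must be connected by a CFG edge.
    if (!IsLast && It->second.Successors.count(Path[I + 1]) == 0)
      return false;
  }
  return true;
}
```

With a CFG `0 -> 1 -> 2`, the path `{0, 1, 2}` passes the check, while `{0, 2}` fails condition 2 because block 2 is not a successor of block 0.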
2023-09-03 | RegAlloc: Rename MLRegalloc* files to use consistent capitalization | Matt Arsenault | 1 | -4/+4
The other regalloc related files use RegAlloc, not Regalloc.
2023-08-22 | Add a pass to garbage-collect empty basic blocks after code generation. | Rahman Lavaee | 1 | -0/+1
Propeller and pseudo-probes map profiles back to Machine IR via basic block addresses that are stored in metadata sections. Empty basic blocks (basic blocks without real code) obfuscate the profile mapping because their addresses collide with their next basic blocks. For instance, the fallthrough block of an empty block should always be adjacent to it. Otherwise, a completely unnecessary jump would be added.
This patch adds a MachineFunction pass named `GCEmptyBasicBlocks` which attempts to garbage-collect the empty blocks before the `BasicBlockSections` pass. This pass removes each empty basic block after redirecting its incoming edges to its fall-through block.
The garbage-collection is not complete. We keep the empty block in 4 cases:
1. The empty block is an exception handling pad.
2. The empty block has its address taken.
3. The empty block is the last block of the function and it has predecessors.
4. The empty block is the only block of the function.
The first three cases are extremely rare in normal code (no cases for the clang binary). Removing the blocks under the first two cases requires modifying exception handling structures and operands of non-terminator instructions, which is doable but not worth the additional complexity in the pass.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D107534
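The four "keep" cases from this commit message can be written as a small predicate. This is a minimal sketch under stated assumptions: `EmptyBlockInfo` and `mustKeepEmptyBlock` are hypothetical names that summarize the properties mentioned in the message; the real pass queries them from the machine basic block directly.

```cpp
#include <cassert>

// Hypothetical summary of the properties the commit message mentions for an
// empty basic block; not an LLVM type.
struct EmptyBlockInfo {
  bool IsEHPad = false;          // case 1: exception handling pad
  bool HasAddressTaken = false;  // case 2: address is taken
  bool IsLastInFunction = false; // case 3 (together with HasPredecessors)
  bool HasPredecessors = false;
  bool IsOnlyBlock = false;      // case 4: only block of the function
};

// Returns true if the empty block falls under one of the four cases in which
// the pass keeps it instead of garbage-collecting it.
bool mustKeepEmptyBlock(const EmptyBlockInfo &B) {
  if (B.IsEHPad)
    return true;
  if (B.HasAddressTaken)
    return true;
  if (B.IsLastInFunction && B.HasPredecessors)
    return true;
  if (B.IsOnlyBlock)
    return true;
  return false;
}
```

Any empty block for which this predicate is false would have its incoming edges redirected to its fall-through block and then be removed.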
2023-05-09 | [CodeGen][KCFI] Move cfi-type lowering to TargetLowering | Sami Tolvanen | 1 | -0/+1
KCFI machine function passes transform indirect calls with a cfi-type attribute into architecture-specific type checks bundled together with the calls. Instead of having a separate pass for each architecture, add a generic machine function pass for KCFI and move the architecture-specific code that emits the actual check to TargetLowering. This avoids unnecessary duplication and makes it easier to add KCFI support to other architectures. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D149915
2023-05-03 | Switch `llvm/CodeGen/MachineValueType.h` to the generated one | NAKAMURA Takumi | 1 | -0/+3
Prune `SupportTests/MVTTest` since it is no longer needed. Depends on D148769 Differential Revision: https://reviews.llvm.org/D148770
2023-05-03 | Split out `CodeGenTypes` from `CodeGen` for LLT/MVT | NAKAMURA Takumi | 1 | -1/+11
This greatly reduces the dependencies of `llvm-tblgen`. `CodeGenTypes` depends only on `Support` at the moment. Be careful when adding dependencies to it, since the targets' tablegen steps depend on it.
Depends on D149024
Differential Revision: https://reviews.llvm.org/D148769
2023-05-03 | Restore CodeGen/LowLevelType from `Support` | NAKAMURA Takumi | 1 | -0/+1
This is a rework of D30046 (LLT).
Since I have introduced `llvm-min-tblgen` in D146352, `llvm-tblgen` may depend on `CodeGen`.
`LowLevelType.h` originally belonged to `CodeGen`. Almost all users are still under `CodeGen` or `Target`. I think `CodeGen` is the right place to put `LowLevelType.h`. `MachineValueType.h` may be moved as well (later, D149024).
I have made many modules depend on `CodeGen`. It is consistent but inefficient. It will be split out later in D148769.
Besides, I had to isolate MVT and LLT in modmap, since `llvm::PredicateInfo` clashes between `TableGen/CodeGenSchedule.h` and `Transforms/Utils/PredicateInfo.h`. (I think it would be better to introduce namespace llvm::TableGen.)
Depends on D145937, D146352, and D148768.
Differential Revision: https://reviews.llvm.org/D148767
2023-04-25 | Move CodeGen/LowLevelType => CodeGen/LowLevelTypeUtils | NAKAMURA Takumi | 1 | -1/+1
Before restoring `CodeGen/LowLevelType`, rename this to `LowLevelTypeUtils`. Differential Revision: https://reviews.llvm.org/D148768
2023-02-16 | [llvm] boilerplate for new callbrprepare codegen IR pass | Nick Desaulniers | 1 | -0/+1
Because this pass is to be a codegen pass, it must use the legacy pass manager. Link: https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Reviewed By: aeubanks, void Differential Revision: https://reviews.llvm.org/D139861
2023-01-19 | [codegen] Add StackFrameLayoutAnalysisPass | Paul Kirth | 1 | -0/+1
Issue #58168 describes the difficulty diagnosing stack size issues identified by -Wframe-larger-than. For simple code, it's easy to understand the stack layout and where space is being allocated, but in more complex programs, where code may be heavily inlined, unrolled, and have duplicated code paths, it is no longer easy to manually inspect the source program and understand where stack space can be attributed.
This patch implements a machine function pass that emits remarks with a textual representation of stack slots, and also outputs any available debug information to map source variables to those slots.
The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout` to the compiler invocation. Like other remarks, the diagnostic information can be saved to a file in a machine-readable format by adding -fsave-optimization-record.
Fixes: #58168
Reviewed By: nickdesaulniers, thegameg
Differential Revision: https://reviews.llvm.org/D135488
2023-01-13 | Revert "[codegen] Add StackFrameLayoutAnalysisPass" | Paul Kirth | 1 | -1/+0
This breaks on some AArch64 bots.
This reverts commit 0a652c540556a118bbd9386ed3ab7fd9e60a9754.
2023-01-13 | [codegen] Add StackFrameLayoutAnalysisPass | Paul Kirth | 1 | -0/+1
Issue #58168 describes the difficulty diagnosing stack size issues identified by -Wframe-larger-than. For simple code, it's easy to understand the stack layout and where space is being allocated, but in more complex programs, where code may be heavily inlined, unrolled, and have duplicated code paths, it is no longer easy to manually inspect the source program and understand where stack space can be attributed.
This patch implements a machine function pass that emits remarks with a textual representation of stack slots, and also outputs any available debug information to map source variables to those slots.
The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout` to the compiler invocation. Like other remarks, the diagnostic information can be saved to a file in a machine-readable format by adding -fsave-optimization-record.
Fixes: #58168
Reviewed By: nickdesaulniers, thegameg
Differential Revision: https://reviews.llvm.org/D135488
2023-01-03 | [TypePromotion] NewPM support. | Samuel Parker | 1 | -1/+1
Differential Revision: https://reviews.llvm.org/D140893
2022-12-20 | [Support] Move TargetParsers to new component | Archibald Elliott | 1 | -0/+1
This is a fairly large changeset, but it can be broken into a few pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support component into a new LLVM Component called "TargetParser". This potentially enables using tablegen to maintain this information, as is shown in https://reviews.llvm.org/D137517. This cannot currently be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on information in the TargetParser:
  - `llvm/Support/Host.{h,cpp}` which contains functions for inspecting the current Host machine for info about it, primarily to support getting the host triple, but also for `-mcpu=native` support in e.g. Clang. This is fairly tightly intertwined with the information in `X86TargetParser.h`, so keeping them in the same component makes sense.
  - `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains the target triple parser and representation. This is very intertwined with the Arm target parser, because the arm architecture version appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information about the following, which to me seems like a unified component:
- Triples that LLVM knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838
2022-12-20 | RFC: Uniformity Analysis for Irreducible Control Flow | Sameer Sahasrabuddhe | 1 | -0/+1
Uniformity analysis is a generalization of divergence analysis to include irreducible control flow:
1. The proposed spec presents a notion of "maximal convergence" that captures the existing convention of converging threads at the headers of natural loops.
2. Maximal convergence is then extended to irreducible cycles. The identity of irreducible cycles is determined by the choices made in a depth-first traversal of the control flow graph. Uniformity analysis uses criteria that depend only on closed paths and not cycles, to determine maximal convergence. This makes it a conservative analysis that is independent of the effect of DFS on CycleInfo.
3. The analysis is implemented as a template that can be instantiated for both LLVM IR and Machine IR.
Validation:
- passes existing tests for divergence analysis
- passes new tests with irreducible control flow
- passes equivalent tests in MIR and GMIR
Based on concepts originally outlined by Nicolai Haehnle <nicolai.haehnle@amd.com>
With contributions from Ruiling Song <ruiling.song@amd.com> and Jay Foad <jay.foad@amd.com>.
Support for GMIR and lit tests for GMIR/MIR added by Yashwant Singh <yashwant.singh@amd.com>.
Differential Revision: https://reviews.llvm.org/D130746
2022-12-15 | [mlgo] Retire LLVM_HAVE_TF_API | Kazu Hirata | 1 | -2/+2
I've eliminated all uses of LLVM_HAVE_TF_API except a couple of them being removed in llvm/lib/CodeGen/CMakeLists.txt. This patch removes remaining definitions and uses of LLVM_HAVE_TF_API. Differential Revision: https://reviews.llvm.org/D140169
2022-12-09 | [Assignment Tracking][Analysis] Add analysis pass | OCHyams | 1 | -0/+1
The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Add initial revision of assignment tracking analysis pass
---------------------------------------------------------
This patch squashes five individually reviewed patches into one:
#1 https://reviews.llvm.org/D136320
#2 https://reviews.llvm.org/D136321
#3 https://reviews.llvm.org/D136325
#4 https://reviews.llvm.org/D136331
#5 https://reviews.llvm.org/D136335
Patch #1 introduces 2 new files: AssignmentTrackingAnalysis.h and .cpp. The two subsequent patches modify those files only. Patch #4 plumbs the analysis into SelectionDAG, and patch #5 is a collection of tests for the analysis as a whole.
The analysis was broken up into smaller chunks for review purposes but for the most part the tests were written using the whole analysis. It would be possible to break up the tests for patches #1 through #3 for the purpose of landing the patches separately. However, most of them would require an update for each patch. In addition, patch #4 - which connects the analysis to SelectionDAG - is required by all of the tests.
If there is build-bot trouble, we might try a different landing sequence.

Analysis problem and goal
-------------------------
Variable values can be stored in memory, or available as SSA values, or both. Using the Assignment Tracking metadata, it's not possible to determine a variable location just by looking at a debug intrinsic in isolation. Instructions without any metadata can change the location of a variable. The meaning of dbg.assign intrinsics changes depending on whether there are linked instructions, and where they are relative to those instructions. So we need to analyse the IR and convert the embedded information into a form that SelectionDAG can consume to produce debug variable locations in MIR.
The solution is a dataflow analysis which, aiming to maximise the memory location coverage for variables, outputs a mapping of instruction positions to variable location definitions.

API usage
---------
The analysis is named `AssignmentTrackingAnalysis`. It is added as a required pass for SelectionDAGISel when assignment tracking is enabled.
The results of the analysis are exposed via `getResults` using the returned `const FunctionVarLocs *`'s const methods:
    const VarLocInfo *single_locs_begin() const;
    const VarLocInfo *single_locs_end() const;
    const VarLocInfo *locs_begin(const Instruction *Before) const;
    const VarLocInfo *locs_end(const Instruction *Before) const;
    void print(raw_ostream &OS, const Function &Fn) const;
Debug intrinsics can be ignored after running the analysis. Instead, variable location definitions that occur between an instruction `Inst` and its predecessor (or block start) can be found by looping over the range:
    locs_begin(Inst), locs_end(Inst)
Similarly, variables with a memory location that is valid for their lifetime can be iterated over using the range:
    single_locs_begin(), single_locs_end()

Further detail
--------------
For an explanation of the dataflow implementation and the integration with SelectionDAG, please see the reviews linked at the top of this commit message.
Reviewed By: jmorse
2022-12-05 | Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." | Jonas Paulsson | 1 | -0/+1
This reverts commit 122efef8ee9be57055d204d52c38700fe933c033.
- Patch fixed to not reuse definitions from predecessors in EH landing pads.
- Late review suggestions (by MaskRay) have been addressed.
- M68k/pipeline.ll test updated.
- Init captures added in processBlock() to avoid capturing structured bindings.
- RISCV has this disabled for now.
Original commit message:
A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameIndex().
This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'.
This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet.
Differential Revision: https://reviews.llvm.org/D123394
Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
2022-12-05 | Use-after-return sanitizer binary metadata | Dmitry Vyukov | 1 | -0/+1
Currently per-function metadata consists of: (start-pc, size, features)
This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
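The two tuple shapes in this commit message can be sketched as a variable-length record. This is an illustration only: `FunctionMeta`, `encodeFunctionMeta`, and the bit value of `kFeatureUAR` are hypothetical, since the message does not give the actual encoding or field widths.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical feature bit; the real value is not stated in the commit message.
constexpr std::uint32_t kFeatureUAR = 1u << 0;

// Hypothetical in-memory form of the per-function metadata.
struct FunctionMeta {
  std::uint64_t StartPC;
  std::uint64_t Size;
  std::uint32_t Features;
  std::uint64_t StackArgsSize; // only emitted when kFeatureUAR is set
};

// Emits (start-pc, size, features) and appends stack-args-size only when the
// UAR feature bit is set, mirroring the two shapes in the commit message.
std::vector<std::uint64_t> encodeFunctionMeta(const FunctionMeta &M) {
  std::vector<std::uint64_t> Out = {M.StartPC, M.Size,
                                    static_cast<std::uint64_t>(M.Features)};
  if (M.Features & kFeatureUAR)
    Out.push_back(M.StackArgsSize);
  return Out;
}
```

A reader of such a record would check the features element first to know whether a fourth element follows.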
2022-12-05 | Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."" | Jonas Paulsson | 1 | -1/+0
This reverts commit 17db0de330f943833296ae72e26fa988bba39cb3. Some more bots got broken - need to investigate.
2022-12-03 | Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." | Jonas Paulsson | 1 | -0/+1
Init captures added in processBlock() to avoid capturing structured bindings, which caused the build problems (with clang). RISCV has this disabled for now until problems relating to post RA pseudo expansions are resolved.
2022-12-01 | Revert "[CodeGen] Add new pass for late cleanup of redundant definitions." | Jonas Paulsson | 1 | -1/+0
Temporarily revert and fix buildbot failure. This reverts commit 6d12599fd4134c1da63198c74a25490d28c733f6.
2022-12-01 | [CodeGen] Add new pass for late cleanup of redundant definitions. | Jonas Paulsson | 1 | -0/+1
A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameIndex().
This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'.
This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet.
Differential Revision: https://reviews.llvm.org/D123394
Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
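The idea of scanning once while tracking register definitions in a map can be shown on a toy model. This is a heavily simplified sketch, not the pass itself: `LoadImm` and `removeRedundantDefs` are hypothetical names, the block is straight-line, and every instruction is assumed to be an immediate load into a physical register (the real pass also has to account for other definitions, clobbers, and control flow).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical toy instruction: load immediate Imm into physical register Reg.
struct LoadImm {
  unsigned Reg;
  std::int64_t Imm;
  bool operator==(const LoadImm &O) const {
    return Reg == O.Reg && Imm == O.Imm;
  }
};

// One forward scan over a straight-line block. A load is dropped when the
// register is already known to hold exactly the same value.
std::vector<LoadImm> removeRedundantDefs(const std::vector<LoadImm> &Block) {
  std::map<unsigned, std::int64_t> KnownValue; // Reg -> value of its last def
  std::vector<LoadImm> Kept;
  for (const LoadImm &I : Block) {
    auto It = KnownValue.find(I.Reg);
    if (It != KnownValue.end() && It->second == I.Imm)
      continue; // identical to the current definition: redundant
    KnownValue[I.Reg] = I.Imm; // record the new definition
    Kept.push_back(I);
  }
  return Kept;
}
```

Note that a reload of an earlier value is kept once the register has been overwritten in between, because the map only remembers the most recent definition.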
2022-12-01 | [X86] Add ExpandLargeFpConvert Pass and enable for X86 | Freddy Ye | 1 | -0/+1
As stated in https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528, this implementation is very similar to ExpandLargeDivRem, which expands ‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, ‘sitofp .. to’ instructions with a bitwidth above a threshold into auto-generated functions. This is useful for targets like x86_64 that cannot lower fp conversions with more than 128 bits. The expanded code is modeled on the IR generated from `compiler-rt/lib/builtins/floattidf.c`, `compiler-rt/lib/builtins/fixdfti.c`, etc.
Corner cases:
1. For fp16: there are no related builtins in compiler-rt, so I mainly used the fp32 <-> fp16 lib calls to implement it.
2. For fp80: this pass is soft fp emulation and no fp80 instructions can help with this problem. I recommend that users avoid this usage. For now, the implementation uses fp128 as the temporary conversion type and inserts fptrunc/ext at top/end of the function.
3. For bf16: as clang FE currently doesn't support bf16 arithmetic operations (convert to int, float, +, -, *, ...), this patch doesn't consider bf16 for now.
4. For unsigned FPToI: both default hardware behaviors and libgcc ignore the "returns 0 for negative input" spec. This pass follows this old way and ignores it for unsigned FPToI. See this example: https://gcc.godbolt.org/z/bnv3jqW1M
The end-to-end tests are uploaded at https://reviews.llvm.org/D138261
Reviewed By: LuoYuanke, mgehre-amd
Differential Revision: https://reviews.llvm.org/D137241
2022-11-30 | Revert "Use-after-return sanitizer binary metadata" | Marco Elver | 1 | -1/+0
This reverts commit d3c851d3fc8b69dda70bf5f999c5b39dc314dd73.
Some bots broke:
- https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8796062278266465473/overview
- https://lab.llvm.org/buildbot/#/builders/124/builds/5759/steps/7/logs/stdio
2022-11-30 | Use-after-return sanitizer binary metadata | Dmitry Vyukov | 1 | -0/+1
Currently per-function metadata consists of: (start-pc, size, features)
This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
2022-11-30 | Revert "Use-after-return sanitizer binary metadata" | Dmitry Vyukov | 1 | -1/+0
This reverts commit e6aea4a5db09c845276ece92737a6aac97794100. Broke tests: https://lab.llvm.org/buildbot/#/builders/16/builds/38992
2022-11-30 | Use-after-return sanitizer binary metadata | Dmitry Vyukov | 1 | -0/+1
Currently per-function metadata consists of: (start-pc, size, features)
This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
2022-11-29 | Revert "Use-after-return sanitizer binary metadata" | Kazu Hirata | 1 | -1/+0
This reverts commit a1255dc467f7ce57a966efa76bbbb4ee91d9115a. This patch results in: llvm/lib/CodeGen/SanitizerBinaryMetadata.cpp:57:17: error: no member named 'size' in 'llvm::MDTuple'
2022-11-29 | Use-after-return sanitizer binary metadata | Dmitry Vyukov | 1 | -0/+1
Currently per-function metadata consists of: (start-pc, size, features)
This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
2022-11-14 | [ARM][CodeGen] Add support for complex deinterleaving | Nicholas Guy | 1 | -0/+1
Adds the Complex Deinterleaving Pass implementing support for complex numbers in a target-independent manner, deferring to the TargetLowering for the given target to create a target-specific intrinsic. Differential Revision: https://reviews.llvm.org/D114174
2022-10-21 | [ObjCARC] Remove legacy PM versions of optimization passes | Arthur Eubanks | 1 | -0/+1
This doesn't touch objc-arc-contract because that's in the codegen pipeline. However, this does move its corresponding initialize function into initializeCodegen(). Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D135041
2022-09-30 | Reland "[MLGO] ML Regalloc Priority Advisor" | Eric Wang | 1 | -0/+1
This relands commit 8f4f26ba5bd04f7b335836021e5e63b4236c0305, which was reverted in 91c96a806cae58539e40c9e443a08bde91ccc91e because of Buildbot failures. The previous model test is not compatible with tflite. e.g. https://lab.llvm.org/buildbot/#/builders/6/builds/14041 Differential Revision: https://reviews.llvm.org/D133616
2022-09-29 | Revert "[MLGO] ML Regalloc Priority Advisor" | Mircea Trofin | 1 | -1/+0
This reverts commit 8f4f26ba5bd04f7b335836021e5e63b4236c0305. Buildbot failures, e.g. https://lab.llvm.org/buildbot/#/builders/6/builds/14041
2022-09-29 | [MLGO] ML Regalloc Priority Advisor | Eric Wang | 1 | -0/+1
The bulk of the implementation is common between 'release' mode (==AOT-ed model) and 'development' mode (for training), the main difference is that in development mode, we may also log features (for training logs), inject scoring information and then produce the log file. Differential Revision: https://reviews.llvm.org/D133616
2022-09-23 | Make MLIR model URLs cache variables | Yi Kong | 1 | -1/+1
This allows us to directly use the models published on Github. Differential Revision: https://reviews.llvm.org/D134566
2022-09-22 | -dot-machine-cfg for printing MachineFunction to a dot file | Christudasan Devadasan | 1 | -0/+1
This pass allows a user to dump a MIR function to a dot file and view it as a graph. It is intended to provide functionality similar to the -dot-cfg pass on LLVM IR. As of now the pass also supports the flags below:
-dot-mcfg-only [optional] [won't print instructions in the graph, just block names]
-mcfg-dot-filename-prefix [optional] [prefix to add to output dot file]
-mcfg-func-name [optional] [specify function name or its substring; handy if the MIR file contains multiple functions and you need to see the graph of just one]
More flags and details can be introduced in the future as required. This pass is inspired by the -dot-cfg IR pass, and the APIs are written in an almost identical format.
Patch by Yashwant Singh <Yashwant.Singh@amd.com> (yassingh)
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D133709
2022-08-26 | [llvm/CodeGen] Add ExpandLargeDivRem pass | Matthias Gehre | 1 | -0/+1
Adds a pass ExpandLargeDivRem to expand div/rem instructions with more than 128 bits into a loop computing that value. As discussed on https://reviews.llvm.org/D120327, this approach has the advantage that it is independent of the runtime library. This also helps the clang driver, which otherwise would need to understand enough about the runtime library to know whether to allow _BitInts with more than 128 bits. Targets are still free to disable this pass and instead provide a faster implementation in a runtime library. Fixes https://github.com/llvm/llvm-project/issues/44994 Differential Revision: https://reviews.llvm.org/D126644
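To illustrate what "a loop computing that value" can look like, here is one classic way to write division as a loop: restoring (shift-subtract) division, which produces one quotient bit per iteration. This is a sketch under assumptions: the commit message does not say which algorithm the pass emits, the pass works on integers wider than 128 bits in LLVM IR, and `DivRem` and `udivremLoop` are hypothetical names operating on 64-bit values for brevity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result pair for the sketch.
struct DivRem {
  std::uint64_t Quot;
  std::uint64_t Rem;
};

// Restoring (shift-subtract) unsigned division: one quotient bit per
// iteration, starting from the most significant bit of the dividend.
DivRem udivremLoop(std::uint64_t N, std::uint64_t D) {
  assert(D != 0 && "division by zero");
  DivRem R{0, 0};
  for (int Bit = 63; Bit >= 0; --Bit) {
    // Remember the bit that the shift below pushes out of the 64-bit
    // remainder, so the comparison against D stays exact.
    const bool Carry = (R.Rem >> 63) != 0;
    R.Rem = (R.Rem << 1) | ((N >> Bit) & 1);
    if (Carry || R.Rem >= D) {
      R.Rem -= D; // wraps to the correct value when Carry is set
      R.Quot |= (std::uint64_t{1} << Bit);
    }
  }
  return R;
}
```

The same loop shape extends to wider integers by widening the types and the trip count, which is why it needs no runtime library support.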
2022-08-18 | [NFC][MLGO] ML Regalloc Priority Advisor | Eric Wang | 1 | -0/+1
This patch introduces the priority analysis and the priority advisor, the default implementation, and the scaffolding for introducing the other implementations of the advisor. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D131220
2022-07-11 | [mlgo] Don't provide default model URLs | Mircea Trofin | 1 | -4/+1
Pointed out in Issue #56432: the current reference models may not be quite friendly to open source projects. Their purpose is only illustrative - the expectation is that projects would train their own. To avoid unintentionally pulling such a model, made the URL cmake setting require explicit user setting. Differential Revision: https://reviews.llvm.org/D129342
2022-05-26 | Reland "[Propeller] Promote functions with propeller profiles to .text.hot." | Rahman Lavaee | 1 | -0/+1
This relands commit 4d8d2580c53e130c3c3dd3877384301e3c495554. The major change here is using `addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests.
Differential Revision: https://reviews.llvm.org/D126518