rocket-tools/riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
37 hours	[Instrumentation] Move out to Utils (NFC) (#108532)	Antonio Frighetto	1	-1/+1
	Utility functions have been moved out to Utils. Minor opportunity to drop the header where not needed.
8 days	[LLVM][Coroutines] Transform "coro_elide_safe" calls to switch ABI coroutines to the `noalloc` variant (#99285)	Yuxuan Chen	1	-0/+1
	This patch is episode three of the middle end implementation for the coroutine HALO improvement project published on Discourse: https://discourse.llvm.org/t/language-extension-for-better-more-deterministic-halo-for-c-coroutines/80044 After we attribute the calls to some coroutines as "coro_elide_safe" in the C++ FE and create a `noalloc` ramp function, we use a new middle end pass to redirect those calls to the `noalloc` variant. This pass should be run after CoroSplit. For each node we process in CoroSplit, we look for its callers and replace the attributed calls in presplit coroutines with calls to the `noalloc` variant. The transformed `noalloc` ramp function also requires a frame pointer to a block of memory it can use as an activation frame. We allocate this on the caller's frame with an alloca. Please note that we cannot safely transform such attributed calls in post-split coroutines due to memory lifetime reasons. The CoroSplit pass is responsible for creating the coroutine frame spills for all the allocas in the coroutine, so it is unsafe to create new allocas like this one in post-split coroutines. This happens relatively rarely because CGSCC performs the passes on the callees before the caller. However, if multiple coroutines coexist in one SCC, this situation does happen (and prevents us from having a potentially unbounded frame size due to recursion). Episode 1 of this patch series (Clang FE) is at https://github.com/llvm/llvm-project/pull/99282 and episode 2 (CoroSplit) is at https://github.com/llvm/llvm-project/pull/99283
11 days	[NFCI] Remove EntryCount from FunctionSummary and clean up surrounding synthetic count passes. (#107471)	Mingming Liu	1	-1/+0
	The primary motivation is to remove `EntryCount` from `FunctionSummary`. This frees 8 bytes out of `sizeof(FunctionSummary)` (136 bytes as of https://github.com/llvm/llvm-project/commit/64498c54831bed9cf069e0923b9b73678c6451d8). While I'm at it, this PR cleans up {SummaryBasedOptimizations, SyntheticCountsPropagation}, since they were not used and there are no plans to invest further in them. With this patch, the bitcode writer writes a placeholder 0 at the byte offset of `EntryCount`, and the bitcode reader can parse the function entry count at the correct byte offset. Added a TODO to stop writing `EntryCount` and bump the bitcode version.
11 days	[ctx_prof] Flattened profile lowering pass (#107329)	Mircea Trofin	1	-0/+1
	Pass to flatten and lower the contextual profile to profile (i.e. `MD_prof`) metadata. This is expected to be used after all IPO transformations have happened. Prior to lowering, the instrumentation is maintained during IPO and the contextual profile is kept in sync (see PRs #105469, #106154). Flattening (#104539) sums up all the counters belonging to all of a function's context nodes. We first propagate counter values (from the flattened profile) using the same propagation algorithm as `PGOUseFunc::populateCounters`, then map the edge values to `branch_weights`. Functions in the module that don't have an entry in the flattened profile are deemed cold, and any `MD_prof` metadata they may have is reset. The profile summary is also reset at this point. Issue [#89287](https://github.com/llvm/llvm-project/issues/89287)
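The flattening step described in this entry (summing each counter over all of a function's context nodes) can be sketched as follows. This is a minimal, self-contained illustration and not LLVM's implementation: `ContextNode`, `FlatProfile`, `flattenInto`, and `flatten` are hypothetical names, and the sketch assumes counters are indexed identically in every context of the same function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model of one node in a contextual profile tree.
struct ContextNode {
  uint64_t Guid;                    // identifies the function
  std::vector<uint64_t> Counters;   // counter values seen in this context
  std::vector<ContextNode> Callees; // contexts reached from this one
};

// Flattened profile: one summed counter vector per function.
using FlatProfile = std::map<uint64_t, std::vector<uint64_t>>;

// Add Node's counters to its function's totals, then visit its callees.
void flattenInto(const ContextNode &Node, FlatProfile &Flat) {
  std::vector<uint64_t> &Sum = Flat[Node.Guid];
  if (Sum.size() < Node.Counters.size())
    Sum.resize(Node.Counters.size(), 0);
  for (std::size_t I = 0; I < Node.Counters.size(); ++I)
    Sum[I] += Node.Counters[I];
  for (const ContextNode &Callee : Node.Callees)
    flattenInto(Callee, Flat);
}

// Flatten every context tree in the profile.
FlatProfile flatten(const std::vector<ContextNode> &Roots) {
  FlatProfile Flat;
  for (const ContextNode &Root : Roots)
    flattenInto(Root, Flat);
  return Flat;
}
```

In this model, a function that never appears in the result has no entry in the flattened profile, which is the case the commit message treats as cold.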
12 days	[SandboxVec] Boilerplate (#107431)	vporpo	1	-0/+1
	This patch implements the new pass and registers it with the pass manager. For context, this is a vectorizer that operates on Sandbox IR, which is a transactional IR on top of LLVM IR.
13 days	[CodeGen][NewPM] Port MachineCSE pass to new pass manager. (#106605)	Christudasan Devadasan	1	-0/+1

2024-09-02	[InstCombine] Remove optional LoopInfo dependency	Nikita Popov	1	-3/+1
	https://github.com/llvm/llvm-project/pull/106075 has removed the last dependency on LoopInfo in InstCombine, so don't fetch the analysis anymore and remove the use-loop-info pass option.
2024-08-29	[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part I) (#96878)	Shengchen Kan	1	-0/+2
	This is the SimplifyCFG part of https://github.com/llvm/llvm-project/pull/95515
	In this PR, we support hoisting load/store with conditional faulting in `SimplifyCFGOpt::speculativelyExecuteBB` to eliminate conditional branches. This is for cases like
	```
	void test (int a, int b) {
	  if (a) b = a;
	}
	```
	In the following patches, we will support the hoist in `SimplifyCFGOpt::hoistCommonCodeFromSuccessors`. That is for cases like
	```
	void test (int a, int c, int d) {
	  if (a) c = a;
	  else d = a;
	}
	```
2024-08-17	[LSR] Split the -lsr-term-fold transformation into its own pass (#104234)	Philip Reames	1	-0/+1
	This transformation doesn't actually use any of the internal state of LSR and recomputes all information from SCEV. Splitting it out makes it easier to test. Note that long term I would like to write a version of this transform which is integrated with LSR's solver, but if that happens, we'll just delete the extra pass. Integration wise, I switched from using TTI to using a pass configuration variable. This seems slightly more idiomatic, and means we don't run the extra logic on any target other than RISCV.
2024-08-15	[nfc][ctx_prof] Remove the need for `PassBuilder` to know about `UseCtxProfile` (#104492)	Mircea Trofin	1	-2/+0
2024-08-15	[DXIL][Analysis] Boilerplate for DXILResourceAnalysis pass	Justin Bogner	1	-0/+1
	Broke this out into its own commit to make the next one easier to review. Pull Request: https://github.com/llvm/llvm-project/pull/100700
2024-08-12	[DXIL][Analysis] Add DXILMetadataAnalysis pass (#102079)	S. Bharadwaj Yadavalli	1	-0/+1
	Add DXIL Metadata Analysis passes (one for the legacy PM and one for the new PM) that collect the following DXIL module metadata in a structure:
	1. Shader Model version
	2. DXIL version
	3. Shader Stage
	Information collected using the legacy pass is verified by adding additional test commands to existing metadata test sources.
2024-08-12	StructurizeCFG: Add SkipUniformRegions pass parameter to new PM version (#102812)	Matt Arsenault	1	-0/+5
	Keep respecting the old cl::opt for now.
2024-08-08	[LLVM][rtsan] Add RealtimeSanitizer transform pass (#101232)	Chris Apple	1	-0/+6
	Split from #100596. Introduce the RealtimeSanitizer transform, which inserts the rtsan_enter/exit functions at the appropriate places in an instrumented function.
2024-08-07	[ctx_prof] CtxProfAnalysis (#102084)	Mircea Trofin	1	-0/+3
	This is an immutable analysis that loads and makes the contextual profile available to other passes. This patch introduces the analysis and an analysis printer pass. Subsequent patches will introduce the APIs that IPO passes will call to modify the profile as a result of their changes.
2024-07-23	[SimplifyCFG] Increase budget for FoldTwoEntryPHINode() if the branch is unpredictable. (#98495)	Tianqing Wang	1	-0/+2
	The `!unpredictable` metadata has been present for a long time, but its usage in optimizations is still limited. This patch teaches `FoldTwoEntryPHINode()` to be more aggressive with an unpredictable branch to reduce mispredictions. A TTI interface `getBranchMispredictPenalty()` is added to distinguish between different hardware, to ensure we don't go too far for simpler cores. For simplicity, only a naive x86 implementation is included for the time being.
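One way to picture "more aggressive with an unpredictable branch" is a folding budget that grows by a target-supplied misprediction penalty. The sketch below is an assumption about the shape of such a heuristic, not the code from the patch: `TargetCosts`, `foldBudget`, and `shouldFold` are hypothetical names, and the additive formula is chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-target cost query; a simple core
// would report a small (or zero) penalty, a deeply pipelined one more.
struct TargetCosts {
  unsigned BranchMispredictPenalty;
};

// Assumed rule: an unpredictable branch raises the budget by the penalty.
unsigned foldBudget(unsigned BaseBudget, bool IsUnpredictable,
                    const TargetCosts &Target) {
  return IsUnpredictable ? BaseBudget + Target.BranchMispredictPenalty
                         : BaseBudget;
}

// Fold the branch into selects only if speculating both sides fits the budget.
bool shouldFold(unsigned SpeculationCost, unsigned BaseBudget,
                bool IsUnpredictable, const TargetCosts &Target) {
  return SpeculationCost <= foldBudget(BaseBudget, IsUnpredictable, Target);
}
```

With a zero penalty the decision is unchanged, which matches the stated goal of not going too far on simpler cores.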
2024-07-22	[CodeGen] change prototype of regalloc filter function (#93525)	Christudasan Devadasan	1	-2/+2
	Change the prototype of the filter function so that we can filter not just by RegClass. We need to implement a more complicated filter based upon other info associated with each register. Patch provided by: Gang Chen (gangc@amd.com)
2024-07-17	[CodeGen][NewPM] Port `phi-node-elimination` to new pass manager (#98867)	paperchalice	1	-0/+1
	- Add `PHIEliminationPass`. - Support new pass manager in `MachineBasicBlock::SplitCriticalEdge`.
2024-07-15	[CodeGen] Port `two-address-instructions` to new pass manager (#98632)	paperchalice	1	-0/+1
	Add `TwoAddressInstructionPass`.
2024-07-15	[CodeGen][NewPM] Port `MachineVerifier` to new pass manager (#98628)	paperchalice	1	-0/+1
	- Add `MachineVerifierPass`. - Use complete `MachineVerifierPass` in `VerifyInstrumentation` if possible. `LiveStacksAnalysis` will be added in future, all other analyses are done.
2024-07-14	[CodeGen][NewPM] Add `MachineOptimizationRemarkEmitterAnalysis` (#98601)	paperchalice	1	-0/+1
	Add `MachineOptimizationRemarkEmitterAnalysis`. The legacy version, `MachineOptimizationRemarkEmitterPass`, is already a wrapper.
2024-07-12	[CodeGen][NewPM] Port `machine-block-freq` to new pass manager (#98317)	paperchalice	1	-0/+1
	- Add `MachineBlockFrequencyAnalysis`. - Add `MachineBlockFrequencyPrinterPass`. - Use `MachineBlockFrequencyInfoWrapperPass` in legacy pass manager. - `LazyMachineBlockFrequencyInfo::print` is empty, drop it due to new pass manager migration.
2024-07-10	[CodeGen][NewPM] Port `LiveIntervals` to new pass manager (#98118)	paperchalice	1	-0/+1
	- Add `LiveIntervalsAnalysis`. - Add `LiveIntervalsPrinterPass`. - Use `LiveIntervalsWrapperPass` in legacy pass manager. - Use `std::unique_ptr` instead of raw pointer for `LICalc`, so destructor and default move constructor can handle it correctly. This would be the last analysis required by `PHIElimination`.
2024-07-09	[CodeGen][NewPM] Port `SlotIndexes` to new pass manager (#97941)	paperchalice	1	-0/+1
	- Add `SlotIndexesAnalysis`. - Add `SlotIndexesPrinterPass`. - Use `SlotIndexesWrapperPass` in legacy pass.
2024-07-09	[CodeGen][NewPM] Port `LiveVariables` to new pass manager (#97880)	paperchalice	1	-0/+1
	- Port `LiveVariables` to new pass manager. - Convert to `LiveVariablesWrapperPass` in legacy pass manager.
2024-07-09	[CodeGen][NewPM] Port `machine-loops` to new pass manager (#97793)	paperchalice	1	-0/+1
	- Add `MachineLoopAnalysis`. - Add `MachineLoopPrinterPass`. - Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
2024-06-29	Reapply "[LLVM][Instrumentation] Add numerical sanitizer (#85916)"	Alexander Shaposhnikov	1	-0/+1
	This reverts commit 493c384a7d94cce1d18824a6b0e1f9ee20cdc681 and includes a fix for the build breakage.
2024-06-29	Revert "[LLVM][Instrumentation] Add numerical sanitizer (#85916)"	Alexander Shaposhnikov	1	-1/+0
	This reverts commit 15ad7919f6dd18b5d7f5a22daad6a5c25ecb8793. The commit broke the build bot https://lab.llvm.org/buildbot/#/builders/11/builds/822.
2024-06-28	[LLVM][Instrumentation] Add numerical sanitizer (#85916)	Alexander Shaposhnikov	1	-0/+1
	This PR introduces the numerical sanitizer originally proposed by Clement Courbet on https://reviews.llvm.org/D97854 (https://arxiv.org/abs/2102.12782). The main additions include:
	- Migration to LLVM opaque pointers
	- Migration to various updated APIs
	- Extended coverage for LLVM instructions/intrinsics
	- Code refactoring
	The tool is still very experimental; the coverage (e.g. for intrinsics / library functions) is incomplete.
	Link: https://discourse.llvm.org/t/rfc-revival-of-numerical-sanitizer/79601
	---------
	Co-authored-by: Fangrui Song <i@maskray.me>
2024-06-28	Reapply "[CodeGen][NewPM] Port machine-branch-prob to new pass manager" (#96858) (#96869)	paperchalice	1	-0/+1
	This reverts commit ab58b6d58edf6a7c8881044fc716ca435d7a0156. In `CodeGen/Generic/MachineBranchProb.ll`, `llc` crashed with dumped MIR when targeting PowerPC. Move test to `llc/new-pm`, which is X86 specific.
2024-06-27	[PassManager] Add pretty stack frames (#96078)	Nikita Popov	1	-3/+13
	In NewPM pass managers, add a "pretty stack frame" that tells you which pass crashed while running which function. For example `opt -O3 -passes-ep-peephole=trigger-crash-function test.ll` will print something like this:
	```
	Stack dump:
	0. Program arguments: build/bin/opt -S -O3 -passes-ep-peephole=trigger-crash-function test.ll
	1. Running pass "function<eager-inv>(mem2reg,instcombine<max-iterations=1;no-use-loop-info;no-verify-fixpoint>,trigger-crash-function,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts;speculate-blocks;simplify-cond-branch>)" on module "test.ll"
	2. Running pass "trigger-crash-function" on function "fshl_concat_i8_i8"
	```
	While the crashing pass is usually evident from the stack trace, this also shows which function triggered the crash, as well as the pipeline string for the pass (including options). Similar functionality existed in the LegacyPM.
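The "pretty stack frame" idea can be illustrated with a small RAII type that records what is currently running and removes the record when the scope ends. This is a sketch under stated assumptions, not LLVM's code: `PrettyFrame`, `frameStack`, and `dumpFrames` are hypothetical names, and the frame numbering here starts at the first pass frame rather than at a "Program arguments" line.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-thread list of "what is running now" descriptions.
inline std::vector<std::string> &frameStack() {
  thread_local std::vector<std::string> Stack;
  return Stack;
}

// RAII frame: pushed when a pass starts on a unit, popped when it returns.
class PrettyFrame {
public:
  PrettyFrame(const std::string &PassName, const std::string &UnitKind,
              const std::string &UnitName) {
    frameStack().push_back("Running pass \"" + PassName + "\" on " +
                           UnitKind + " \"" + UnitName + "\"");
  }
  ~PrettyFrame() { frameStack().pop_back(); }
  PrettyFrame(const PrettyFrame &) = delete;
  PrettyFrame &operator=(const PrettyFrame &) = delete;
};

// What a crash handler could print: outermost frame first.
std::string dumpFrames() {
  std::string Out = "Stack dump:\n";
  unsigned Index = 0;
  for (const std::string &Frame : frameStack())
    Out += std::to_string(Index++) + ". " + Frame + "\n";
  return Out;
}
```

A pass manager in this model would construct one `PrettyFrame` around each pass invocation, so nested managers naturally produce nested frames.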
2024-06-27	Revert "[CodeGen][NewPM] Port machine-branch-prob to new pass manager" (#96858)	paperchalice	1	-1/+0
	Reverts llvm/llvm-project#96389 Some ppc bots failed.
2024-06-27	[CodeGen][NewPM] Port machine-branch-prob to new pass manager (#96389)	paperchalice	1	-0/+1
	Like IR version `print<branch-prob>`, there is also a `print<machine-branch-prob>`.
2024-06-26	[PassBuilder] Parse machine function analyses inside require/invalidate (#96634)	paperchalice	1	-0/+12
	Now we have several machine function analyses but forgot to support them in `parseMachinePass`.
2024-06-25	[CodeGen][NewPM] Port machine post dominator tree analysis to new pass manager (#96378)	paperchalice	1	-0/+1
	Follows #95879.
2024-06-24	Reapply [IR] Lazily initialize the class to pass name mapping (NFC) (#96321) (#96462)	Nikita Popov	1	-15/+7
	On MSVC the `this` uses inside `decltype` require a lambda capture. On clang they result in an unused capture warning instead. Add the capture and suppress the warning with `(void)this`.
	-----
	Initializing this map is somewhat expensive (especially for O0), so we currently only do it if certain flags are used. I would like to make use of it for crash dumps (#96078), where we don't know in advance whether it will be needed or not. This patch changes the initialization to a lazy approach, where a callback is registered that does the actual initialization. The callbacks will be run the first time the pass name is requested. This way there is no compile-time impact if the mapping is not used.
2024-06-24	Revert "[IR] Lazily initialize the class to pass name mapping (NFC) (#96321)"	Nikita Popov	1	-4/+15
	My attempt to fix the Windows build made things worse, revert entirely for now. This reverts commit e7137f2fed5cfee822ae3c4c6d39188adb59a16c. This reverts commit 6eaf204dbb0a6a81cddfd02f625c130f7bb1aae5. This reverts commit 957dc4366dd2ce9d5d2991c3ad76bbf438e9954e.
2024-06-24	[IR] Lazily initialize the class to pass name mapping (NFC) (#96321)	Nikita Popov	1	-15/+4
	Initializing this map is somewhat expensive (especially for O0), so we currently only do it if certain flags are used. I would like to make use of it for crash dumps (#96078), where we don't know in advance whether it will be needed or not. This patch changes the initialization to a lazy approach, where a callback is registered that does the actual initialization. The callbacks will be run the first time the pass name is requested. This way there is no compile-time impact if the mapping is not used.
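The lazy approach described here (register callbacks up front, run them only on the first lookup) can be sketched as below. This is an illustrative model rather than the actual LLVM class: `PassNameRegistry`, `addInitCallback`, `addMapping`, and `getPassName` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical registry mapping pass class names to pipeline names.
class PassNameRegistry {
public:
  // Cheap: only stores the callback, no map entries are created yet.
  void addInitCallback(std::function<void(PassNameRegistry &)> Fn) {
    Callbacks.push_back(std::move(Fn));
  }

  void addMapping(std::string ClassName, std::string PassName) {
    Map.emplace(std::move(ClassName), std::move(PassName));
  }

  // The first lookup pays for initialization; later lookups do not.
  std::string getPassName(const std::string &ClassName) {
    if (!Callbacks.empty()) {
      std::vector<std::function<void(PassNameRegistry &)>> Pending =
          std::move(Callbacks);
      Callbacks.clear();
      for (auto &Fn : Pending)
        Fn(*this);
    }
    auto It = Map.find(ClassName);
    return It == Map.end() ? std::string() : It->second;
  }

private:
  std::vector<std::function<void(PassNameRegistry &)>> Callbacks;
  std::unordered_map<std::string, std::string> Map;
};
```

If nothing ever asks for a pass name, the callbacks never run, which is the "no compile-time impact" property the commit message aims for.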
2024-06-22	[CodeGen][NewPM] Extract MachineFunctionProperties modification part to an RAII class (#94854)	paperchalice	1	-1/+2
	Modifying MachineFunctionProperties in PassModel makes `PassT P; P.run(...);` not work properly. This is a necessary compromise.
2024-06-22	[CodeGen][NewPM] Port machine dominator tree analysis to new pass manager (#95879)	paperchalice	1	-0/+1
	- Add `MachineDominatorTreeAnalysis` - Add `MachineDominatorTreePrinterPass` There is no test for this analysis in the codebase. Also, the pass is renamed from `machinedomtree` to `machine-dom-tree`.
2024-06-21	[RegAlloc] Don't call always-true ShouldAllocClass (#96296)	Alexis Engelke	1	-5/+7
	Previously, there was at least one virtual function call for every allocated register. The only users of this feature are AMDGPU and RISC-V (RVV), other targets don't use this. To easily identify these cases, change the default functor to nullptr and don't call it for every allocated register.
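The change described here, defaulting the filter to null and skipping the call when it is absent, follows a common pattern. The sketch below shows that pattern in isolation; `VirtReg`, `RegFilter`, and `allocate` are hypothetical names and the "allocation" is reduced to a counter.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical virtual register with a register-class id.
struct VirtReg {
  unsigned Id;
  unsigned RegClass;
};

// An empty (null) filter means "allocate everything".
using RegFilter = std::function<bool(const VirtReg &)>;

// Returns how many registers were selected for allocation.
unsigned allocate(const std::vector<VirtReg> &Regs,
                  const RegFilter &ShouldAllocate = nullptr) {
  unsigned Allocated = 0;
  for (const VirtReg &R : Regs) {
    // Only pay for the indirect call when a filter was actually supplied.
    if (ShouldAllocate && !ShouldAllocate(R))
      continue;
    ++Allocated;
  }
  return Allocated;
}
```

Targets that never filter take the null path and avoid one indirect call per register, while a target that does filter passes a predicate as before.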
2024-06-20	[IR] Remove RepeatedPass (#96211)	Nikita Popov	1	-59/+0
	This pass is not used in any pipeline, barely used in tests and not really useful, so drop it. The only place where we "repeat" passes is devirt repetition, and that is done using a separate pass.
2024-06-07	[AArch64][LoopIdiom] Generalize AArch64LoopIdiomTransform into LoopIdiomVectorize (#94081)	Min-Yih Hsu	1	-0/+1
	To facilitate sharing LoopIdiomTransform between AArch64 and RISC-V, this first patch moves AArch64LoopIdiomTransform from lib/Target/AArch64 to lib/Transforms/Vectorize and renames it to LoopIdiomVectorize. The following patch (#94082) will teach LoopIdiomVectorize how to generate VP intrinsics (in addition to the current masked vector style) in favor of RVV.
2024-06-07	[NewPM][CodeGen] Port `regallocfast` to new pass manager (#94426)	paperchalice	1	-0/+56
	This pull request ports `regallocfast` to the new pass manager. It exposes the parameter `filter` to handle different register classes for AMDGPU. IIUC AMDGPU needs to allocate different register classes separately, so it needs to implement its own `--<reg-class>-regalloc`. Now users can use e.g. `-passes=regallocfast<filter=sgpr>` to allocate a specific register class. The command line option `--regalloc-npm` is still a work in progress; the plan is to reuse the syntax of passes, e.g. use `--regalloc-npm=regallocfast<filter=sgpr>,greedy<filter=vgpr>` to replace `--sgpr-regalloc` and `--vgpr-regalloc`.
2024-06-06	[AMDGPU] Implement variadic functions by IR lowering (#93362)	Jon Chesterfield	1	-0/+1
	This is a mostly-target-independent variadic function optimisation and lowering pass. It is only enabled for AMDGPU in this initial commit. The purpose is to make C style variadic functions a zero cost abstraction. They are lowered to equivalent IR which is then amenable to other optimisations. This is inherently slightly target specific but much less so than one might expect - the C varargs interface heavily constrains the ABI design divergence. The pass is primarily tested from webassembly. This is because wasm has a straightforward variadic lowering strategy which coincides exactly with what this pass transforms code into and a struct passing convention with few cases to check. Adding further targets' conventions is straightforward and elided from this patch primarily to simplify the review. Implemented in other branches are Linux X86, AMD64, AArch64 and NVPTX. Testing for targets that have existing lowering for va_arg from clang is most efficiently done by checking that clang | opt completely elides the variadic syntax from test cases. The lowering produces a struct for each call site which can be inspected to check the various alignment and indirections are correct. AMDGPU presently has no variadic support other than some ad hoc printf handling. Combined with the pass being inactive on all other targets, landing this represents a strict increase in capability with zero risk. Testing and refining will continue post commit. In addition to the compiler tests included here, a self contained x64 clang/musl toolchain was constructed using the "lowering" instead of the System V ABI and used to build various C programs like lua and libxml2.
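At the source level, the effect of "a struct for each call site" can be pictured as follows. This is a hand-written analogy, not output of the pass: `sumVariadic`, `sumLowered`, `SumCallFrame`, and `callSiteLowered` are hypothetical names, and the sketch assumes every variadic argument is an `int`, so the frame has no padding.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstring>

// Original form: a C-style variadic function.
int sumVariadic(int Count, ...) {
  va_list Args;
  va_start(Args, Count);
  int Total = 0;
  for (int I = 0; I < Count; ++I)
    Total += va_arg(Args, int);
  va_end(Args);
  return Total;
}

// Lowered form (illustrative): the variadic tail becomes a pointer to a
// buffer that the caller laid out, and va_arg becomes a load plus a bump.
int sumLowered(int Count, const void *Frame) {
  const char *Cursor = static_cast<const char *>(Frame);
  int Total = 0;
  for (int I = 0; I < Count; ++I) {
    int Value;
    std::memcpy(&Value, Cursor, sizeof(int));
    Cursor += sizeof(int);
    Total += Value;
  }
  return Total;
}

// The call site `sumVariadic(3, 1, 2, 3)` gets its own frame struct.
struct SumCallFrame {
  int Arg0;
  int Arg1;
  int Arg2;
};

int callSiteLowered() {
  SumCallFrame Frame = {1, 2, 3};
  return sumLowered(3, &Frame);
}
```

Once the call looks like `sumLowered`, nothing variadic remains, so ordinary optimisations such as inlining can apply to it.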
2024-06-05	[NewPM][CodeGen] Port `localstackalloc` to new pass manager (#94303)	paperchalice	1	-0/+1
	Two AArch64 tests use `-start-before` and `-print-after`. The remaining tests use `--passes` to test this pass.
2024-06-04	[NewPM][CodeGen] Port `finalize-isel` to new pass manager (#94214)	paperchalice	1	-0/+1
	It should preserve more analysis results, but it happens immediately after instruction selection.
2024-05-16	[NewPM] Add pass options for InternalizePass to preserve GVs (reland) (#92383)	Tim Besard	1	-0/+18
	Reland of https://github.com/llvm/llvm-project/pull/91334, which broke the gcc7 buildbot and was reverted in https://github.com/llvm/llvm-project/pull/92321. Work around the failure by being explicit about returning an `Expected`. cc @joker-eph
2024-05-15	Revert "[NewPM] Add pass options for `InternalizePass` to preserve GVs." (#92321)	Mehdi Amini	1	-18/+0
	Reverts llvm/llvm-project#91334 This broke the gcc7 build. I suspect the issue is a mismatch on user-defined move constructor on the return: `return PreservedGVs;` does not match the return type of the function.
2024-05-15	[NewPM] Add pass options for `InternalizePass` to preserve GVs. (#91334)	Tim Besard	1	-0/+18
	This PR adds a string interface to `InternalizePass`' `MustPreserveGV` option, which is a callback function to indicate if a GV is not to be internalized. This is for use in LLVM.jl, the Julia wrapper for LLVM, which uses the C API and is thus required to use the PassBuilder string API for building NewPM pipelines.