riscv-gnu-toolchain/llvm.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2024-04-22	[RemoveDIs] Make verify-uselistorder preserve the input debug info format ↵	Stephen Tozer	1	-0/+4
	(#87789) Verify-uselistorder wants to take some input IR and verify that the uselist order is stable after roundtripping to bitcode and assembly. This is disrupted if the file is converted between the new and old debug info formats after parsing - while there's no functional difference, the change to the in-memory representation of the IR modifies the uselist. This patch changes verify-uselistorder to not convert input files between debug info formats by default, preventing changes from being made to the file being checked. In addition, this patch makes it so that when we _do_ print IR in the new debug info format to bitcode or assembly, we delete any lingering debug intrinsic declarations, ensuring that we don't write uselist entries for them.
2024-04-10	[ThinLTO]Record import type in GlobalValueSummary::GVFlags (#87597)	Mingming Liu	1	-0/+2
	The motivating use case is to support import the function declaration across modules to construct call graph edges for indirect calls [1] when importing the function definition costs too much compile time (e.g., the function is too large has no `noinline` attribute). 1. Currently, when the compiled IR module doesn't have a function definition but its postlink combined summary contains the function summary or a global alias summary with this function as aliasee, the function definition will be imported from source module by IRMover. The implementation is in FunctionImporter::importFunctions [2] 2. In order for FunctionImporter to import a declaration of a function, both function summary and alias summary need to carry the def / decl state. Specifically, all existing summary fields doesn't differ across import modules, but the def / decl state of is decided by `<ImportModule, Function>`. This change encodes the def/decl state in `GlobalValueSummary::GVFlags`. In the subsequent changes 1. The indexing step `computeImportForModule` [3] will compute the set of definitions and the set of declarations for each module, and passing on the information to bitcode writer. 2. Bitcode writer will look up the def/decl state and sets the state when it writes out the flag value. This is demonstrated in https://github.com/llvm/llvm-project/pull/87600 3. Function importer will read the def/decl state when reading the combined summary to figure out two sets of global values, and IRMover will be updated to import the declaration (aka linkGlobalValuePrototype [4]) into the destination module. - The next change is https://github.com/llvm/llvm-project/pull/87600 [1] mentioned in rfc https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5 [2] https://github.com/llvm/llvm-project/blob/3b337242ee165554f0017b00671381ec5b1ba855/llvm/lib/Transforms/IPO/FunctionImport.cpp#L1608-L1764 [3] https://github.com/llvm/llvm-project/blob/3b337242ee165554f0017b00671381ec5b1ba855/llvm/lib/Transforms/IPO/FunctionImport.cpp#L856 [4] https://github.com/llvm/llvm-project/blob/3b337242ee165554f0017b00671381ec5b1ba855/llvm/lib/Linker/IRMover.cpp#L605
2024-04-04	[RemoveDIs][NFC] Use ScopedDbgInfoFormatSetter in more places (#87380)	Stephen Tozer	1	-13/+4
	The class `ScopedDbgInfoFormatSetter` was added as a convenient way to temporarily change the debug info format of a function or module, as part of IR printing; since this process is repeated in a number of other places, this patch uses the format-setter class in those places as well.
2024-04-01	[ThinLTO][TypeProf] Implement vtable def import (#79381)	Mingming Liu	1	-2/+11
	Add annotated vtable GUID as referenced variables in per function summary, and update bitcode writer to create value-ids for these referenced vtables. - This is the part3 of type profiling work, and described in the "Virtual Table Definition Import" [1] section of the RFC. [1] https://github.com/llvm/llvm-project/pull/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
2024-03-29	[IR] Add nowrap flags for trunc instruction (#85592)	elhewaty	1	-0/+5
	This patch adds the nuw (no unsigned wrap) and nsw (no signed wrap) poison-generating flags to the trunc instruction. Discourse thread: https://discourse.llvm.org/t/rfc-add-nowrap-flags-to-trunc/77453
2024-03-20	[RemoveDIs][NFC] Rename DPMarker->DbgMarker (#85931)	Stephen Tozer	1	-1/+1
	Another trivial rename patch, the last big one for now, which renamed DPMarkers to DbgMarkers. This required the field `DbgMarker` in `Instruction` to be renamed to `DebugMarker` to avoid a clash, but otherwise was a simple string substitution of `s/DPMarker/DbgMarker` and a manual renaming of `DPM` to `DM` in the few places where that acronym was used for debug markers.
2024-03-20	[RemoveDIs][NFC] Rename DPLabel->DbgLabelRecord (#85918)	Stephen Tozer	2	-6/+6
	This patch renames DPLabel to DbgLabelRecord, in accordance with the ongoing DbgRecord rename. This rename was fairly trivial, since DPLabel isn't as widely used as DPValue and has no real conflicts in either its full or abbreviated name. As usual, the entire replacement was done automatically, with `s/DPLabel/DbgLabelRecord/` and `s/DPL/DLR/`.
2024-03-20	[IR] Change representation of getelementptr inrange (#84341)	Nikita Popov	1	-3/+4
	As part of the migration to ptradd (https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699), we need to change the representation of the `inrange` attribute, which is used for vtable splitting. Currently, inrange is specified as follows: ``` getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2) ``` The `inrange` is placed on a GEP index, and all accesses must be "in range" of that index. The new representation is as follows: ``` getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2) ``` This specifies which offsets are "in range" of the GEP result. The new representation will continue working when canonicalizing to ptradd representation: ``` getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48) ``` The inrange offsets are relative to the return value of the GEP. An alternative design could make them relative to the source pointer instead. The result-relative format was chosen on the off-chance that we want to extend support to non-constant GEPs in the future, in which case this variant is more expressive. This implementation "upgrades" the old inrange representation in bitcode by simply dropping it. This is a very niche feature, and I don't think trying to upgrade it is worthwhile. Let me know if you disagree.
2024-03-19	[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)	Stephen Tozer	2	-37/+40
	This is the major rename patch that prior patches have built towards. The DPValue class is being renamed to DbgVariableRecord, which reflects the updated terminology for the "final" implementation of the RemoveDI feature. This is a pure string substitution + clang-format patch. The only manual component of this patch was determining where to perform these string substitutions: `DPValue` and `DPV` are almost exclusively used for DbgRecords, except for: - llvm/lib/target, where 'DP' is used to mean double-precision, and so appears as part of .td files and in variable names. NB: There is a single existing use of `DPValue` here that refers to debug info, which I've manually updated. - llvm/tools/gold, where 'LDPV' is used as a prefix for symbol visibility enums. Outside of these places, I've applied several basic string substitutions, with the intent that they only affect DbgRecord-related identifiers; I've checked them as I went through to verify this, with reasonable confidence that there are no unintended changes that slipped through the cracks. The substitutions applied are all case-sensitive, and are applied in the order shown: ``` DPValue -> DbgVariableRecord DPVal -> DbgVarRec DPV -> DVR ``` Following the previous rename patches, it should be the case that there are no instances of any of these strings that are meant to refer to the general case of DbgRecords, or anything other than the DPValue class. The idea behind this patch is therefore that pure string substitution is correct in all cases as long as these assumptions hold.
2024-03-19	[Dwarf] Support `__ptrauth` qualifier in metadata nodes (#83862)	Daniil Kovalev	1	-0/+5
	Reland #82363 after fixing build failure https://lab.llvm.org/buildbot/#/builders/5/builds/41428. Memory sanitizer detects usage of `RawData` union member which is not filled directly. Instead, the code relies on filling `Data` union member, which is a struct consisting of signing schema parameters. According to https://en.cppreference.com/w/cpp/language/union, this is UB: "It is undefined behavior to read from the member of the union that wasn't most recently written". Instead of relying on compiler allowing us to do dirty things, do not use union and only store `RawData`. Particular ptrauth parameters are obtained on demand via bit operations. Original PR description below. Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type which has the qualifier applied, and the following parameters representing the signing schema: - `ptrAuthKey` (integer) - `ptrAuthIsAddressDiscriminated` (boolean) - `ptrAuthExtraDiscriminator` (integer) - `ptrAuthIsaPointer` (boolean) - `ptrAuthAuthenticatesNullValues` (boolean) Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-03-15	Reapply [RemoveDIs] Read/write DbgRecords directly from/to bitcode (#83251)	Orlando Cazalet-Hyams	3	-77/+212
	Reaplying after revert in #85382 (861ebe6446296c96578807363aa292c69d827773). Fixed intermittent test failure by avoiding piping output in some RUN lines. If --write-experimental-debuginfo-iterators-to-bitcode is true (default false) and --expermental-debuginfo-iterators is also true then the new debug info format (non-instruction records) is written to bitcode directly. Added the following records: FUNC_CODE_DEBUG_RECORD_LABEL FUNC_CODE_DEBUG_RECORD_VALUE FUNC_CODE_DEBUG_RECORD_DECLARE FUNC_CODE_DEBUG_RECORD_ASSIGN FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE The last one has an abbrev in FUNCTION_BLOCK BLOCK_INFO. Incidentally, this uses the last value available without widening the code-length for FUNCTION_BLOCK from 4 to 5 bits. Records are formatted as follows: All DbgRecord start with: 1. DILocation FUNC_CODE_DEBUG_RECORD_LABEL 2. DILabel DPValues then share common fields: 2. DILocalVariable 3. DIExpression FUNC_CODE_DEBUG_RECORD_VALUE 4. Location Metadata FUNC_CODE_DEBUG_RECORD_DECLARE 4. Location Metadata FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE 4. Location Value (single) FUNC_CODE_DEBUG_RECORD_ASSIGN 4. Location Metadata 5. DIAssignID 6. DIExpression (address) 7. Location Metadata (address) Encoding the DILocation metadata reference directly appeared to yield smaller bitcode files than encoding the operands seperately (as is done with instruction DILocations). FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE is by far the most common DbgRecord record in optimized code (order of 5x-10x over other kinds). Unoptimized code should only contain FUNC_CODE_DEBUG_RECORD_DECLARE.
2024-03-15	Revert "[RemoveDIs] Read/write DbgRecords directly from/to bitcode" (#85382)	Orlando Cazalet-Hyams	3	-212/+77
	Reverts llvm/llvm-project#83251 Buildbot: https://lab.llvm.org/buildbot/#/builders/139/builds/61485
2024-03-15	[RemoveDIs] Read/write DbgRecords directly from/to bitcode (#83251)	Orlando Cazalet-Hyams	3	-77/+212
	If --write-experimental-debuginfo-iterators-to-bitcode is true (default false) and --expermental-debuginfo-iterators is also true then the new debug info format (non-instruction records) is written to bitcode directly. Added the following records: FUNC_CODE_DEBUG_RECORD_LABEL FUNC_CODE_DEBUG_RECORD_VALUE FUNC_CODE_DEBUG_RECORD_DECLARE FUNC_CODE_DEBUG_RECORD_ASSIGN FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE The last one has an abbrev in FUNCTION_BLOCK BLOCK_INFO. Incidentally, this uses the last value available without widening the code-length for FUNCTION_BLOCK from 4 to 5 bits. Records are formatted as follows: All DbgRecord start with: 1. DILocation FUNC_CODE_DEBUG_RECORD_LABEL 2. DILabel DPValues then share common fields: 2. DILocalVariable 3. DIExpression FUNC_CODE_DEBUG_RECORD_VALUE 4. Location Metadata FUNC_CODE_DEBUG_RECORD_DECLARE 4. Location Metadata FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE 4. Location Value (single) FUNC_CODE_DEBUG_RECORD_ASSIGN 4. Location Metadata 5. DIAssignID 6. DIExpression (address) 7. Location Metadata (address) Encoding the DILocation metadata reference directly appeared to yield smaller bitcode files than encoding the operands seperately (as is done with instruction DILocations). FUNC_CODE_DEBUG_RECORD_VALUE_SIMPLE is by far the most common DbgRecord record in optimized code (order of 5x-10x over other kinds). Unoptimized code should only contain FUNC_CODE_DEBUG_RECORD_DECLARE.
2024-03-13	[RemoveDI][NFC] Rename DPValue->DbgRecord in comments and varnames (#84939)	Stephen Tozer	1	-3/+3
	This patch continues the ongoing rename work, replacing DPValue with DbgRecord in comments and the names of variables, both members and fn-local. This is the most labour-intensive part of the rename, as it is where the most decisions have to be made about whether a given comment or variable is referring to DPValues (equivalent to debug variable intrinsics) or DbgRecords (a catch-all for all debug intrinsics); these decisions are not individually difficult, but comprise a fairly large amount of text to review. This patch still largely performs basic string substitutions followed by clang-format; there are almost* no places where, for example, a comment has been expanded or modified to reflect the semantic difference between DPValues and DbgRecords. I don't believe such a change is generally necessary in LLVM, but it may be useful in the docs, and so I'll be submitting docs changes as a separate patch. *In a few places, `dbg.values` was replaced with `debug intrinsics`.
2024-03-09	Reapply [IR] Add new Range attribute using new ConstantRange Attribute type ↵	Andreas Jonson	1	-20/+41
	(#84617) The only change from https://github.com/llvm/llvm-project/pull/83171 is the change of the allocator so the destructor is called for ConstantRangeAttributeImpl. reverts https://github.com/llvm/llvm-project/pull/84549
2024-03-08	Revert "[IR] Add new Range attribute using new ConstantRange Attribute type" ↵	Florian Mayer	1	-41/+20
	(#84549) Reverts llvm/llvm-project#83171 broke sanitizer buildbot https://lab.llvm.org/buildbot/#/builders/168/builds/19110/steps/10/logs/stdio
2024-03-08	[IR] Add new Range attribute using new ConstantRange Attribute type (#83171)	Andreas Jonson	1	-20/+41
	implementation as discussed in https://discourse.llvm.org/t/rfc-metadata-attachments-for-function-arguments/76420
2024-03-02	Revert "[Dwarf] Support `__ptrauth` qualifier in metadata nodes" (#83672)	Daniil Kovalev	1	-5/+0
	Reverts llvm/llvm-project#82363 See a build failure related to an issue discovered by memory sanitizer (use of uninitialized value): https://lab.llvm.org/buildbot/#/builders/37/builds/31965
2024-03-01	[Dwarf] Support `__ptrauth` qualifier in metadata nodes (#82363)	Daniil Kovalev	1	-0/+5
	Emit `__ptrauth`-qualified types as `DIDerivedType` metadata nodes in IR with tag `DW_TAG_LLVM_ptrauth_type`, baseType referring to the type which has the qualifier applied, and the following parameters representing the signing schema: - `ptrAuthKey` (integer) - `ptrAuthIsAddressDiscriminated` (boolean) - `ptrAuthExtraDiscriminator` (integer) - `ptrAuthIsaPointer` (boolean) - `ptrAuthAuthenticatesNullValues` (boolean) Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
2024-02-22	[LLVM][IR] Add native vector support to ConstantInt & ConstantFP. (#74502)	Paul Walker	1	-1/+1
	NOTE: For brevity the following talks about ConstantInt but everything extends to cover ConstantFP as well. Whilst ConstantInt::get() supports the creation of vectors whereby each lane has the same value, it achieves this via other constants: * ConstantVector for fixed-length vectors * ConstantExprs for scalable vectors However, ConstantExprs are being deprecated and ConstantVector is not space efficient for larger vector types. By extending ConstantInt we can represent vector splats by only storing the underlying scalar value. More specifically: * ConstantInt gains an ElementCount variant of get(). * LLVMContext is extended to map <EC,APInt>->ConstantInt. * BitcodeReader/Writer support is extended to allow vector types. Whilst this patch adds the base support, more work is required before it's production ready. For example, there's likely to be many places where isa<ConstantInt> assumes a scalar type. Accordingly the default behaviour of ConstantInt::get() remains unchanged but a set of flags are added to allow wider testing and thus help with the migration: --use-constant-int-for-fixed-length-splat --use-constant-fp-for-fixed-length-splat --use-constant-int-for-scalable-splat --use-constant-fp-for-scalable-splat NOTE: No change is required to the bitcode format because types and values are handled separately. NOTE: For similar reasons as above, code generation doesn't work out-the-box.
2024-02-04	[Bitcode] Use range-based for loops (NFC)	Kazu Hirata	1	-2/+2

2024-01-23	[BitcodeWriter] Remove ThinLTO-specific bits from legacy pass	Arthur Eubanks	1	-18/+6
	Since we shouldn't be doing any LTO logic from the legacy pass manager.
2024-01-13	[llvm] Use range-based for loops with llvm::drop_begin (NFC)	Kazu Hirata	1	-2/+2

2024-01-11	[MemProf] Handle missing tail call frames (#75823)	Teresa Johnson	1	-1/+16
	If tail call optimization was not disabled for the profiled binary, the call contexts will be missing frames for tail calls. Handle this by performing a limited search through tail call edges for the profiled callee when a discontinuity is detected. The search depth is adjustable but defaults to 5. If we are able to identify a short sequence of tail calls, update the graph for those calls. In the case of ThinLTO, synthesize the necessary CallsiteInfos for carrying the cloning information to the backends.
2023-12-14	[IR] Add dead_on_unwind attribute (#74289)	Nikita Popov	1	-0/+2
	Add the `dead_on_unwind` attribute, which states that the caller will not read from this argument if the call unwinds. This allows eliding stores that could otherwise be visible on the unwind path, for example: ``` declare void @may_unwind() define void @src(ptr noalias dead_on_unwind %out) { store i32 0, ptr %out call void @may_unwind() store i32 1, ptr %out ret void } define void @tgt(ptr noalias dead_on_unwind %out) { call void @may_unwind() store i32 1, ptr %out ret void } ``` The optimization is not valid without `dead_on_unwind`, because the `i32 0` value might be read if `@may_unwind` unwinds. This attribute is primarily intended to be used on sret arguments. In fact, I previously wanted to change the semantics of sret to include this "no read after unwind" property (see D116998), but based on the feedback there it is better to keep these attributes orthogonal (sret is an ABI attribute, dead_on_unwind is an optimization attribute). This is a reboot of that change with a separate attribute.
2023-12-06	[ThinLTO] Add tail call flag to call edges in summary (#74043)	Teresa Johnson	1	-57/+44
	This adds support for a HasTailCall flag on function call edges in the ThinLTO summary. It is intended for use in aiding discovery of missing frames from tail calls in profiled call stacks for MemProf of profiled binaries that did not disable tail call elimination. A follow on change will add the use of this new flag during MemProf context disambiguation. The new flag is encoded in the bitcode along with either the hotness flag from the profile, or the relative block frequency under the -write-relbf-to-summary flag when there is no profile data. Because we now will always have some additional call edge information, I have removed the non-profile function summary record format, and we simply encode the tail call flag along with a hotness type of none when there is no profile information or relative block frequency. The change of record format and name caused most of the test case changes. I have added explicit testing of generation of the new tail call flag into the bitcode and IR assembly format as part of the changes to llvm/test/Bitcode/thinlto-function-summary-refgraph.ll. I have also added round trip testing through assembly and bitcode to llvm/test/Assembler/thinlto-summary.ll.
2023-12-05	[llvm][IR] Add per-global code model attribute (#72077)	hev	1	-2/+3
	This adds a per-global code model attribute, which can override the target's code model to access global variables. Suggested-by: Arthur Eubanks <aeubanks@google.com> Link: https://discourse.llvm.org/t/how-to-best-implement-code-model-overriding-for-certain-values/71816 Link: https://discourse.llvm.org/t/rfc-add-per-global-code-model-attribute/74944
2023-11-24	[IR] Add disjoint flag for Or instructions. (#72583)	Craig Topper	1	-0/+3
	This flag indicates that every bit is known to be zero in at least one of the inputs. This allows the Or to be treated as an Add since there is no possibility of a carry from any bit. If the flag is present and this property does not hold, the result is poison. This makes it easier to reverse the InstCombine transform that turns Add into Or. This is inspired by a comment here https://github.com/llvm/llvm-project/pull/71955#discussion_r1391614578 Discourse thread https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036
2023-11-17	Reapply "[DebugInfo] Make DIArgList inherit from Metadata and always unique"	Stephen Tozer	1	-5/+7
	This reverts commit 0fd5dc94380d5fe666dc6c603b4bb782cef743e7. The original commit removed DIArgLists from being in an MDNode map, but did not insert a new `delete` in the LLVMContextImpl destructor. This reapply adds that call to delete, preventing a memory leak.
2023-11-17	Revert "[DebugInfo] Make DIArgList inherit from Metadata and always unique" ↵	Stephen Tozer	1	-7/+5
	(#72682) Reverts llvm/llvm-project#72147 Reverted due to buildbot failure: https://lab.llvm.org/buildbot/#/builders/5/builds/38410
2023-11-17	[DebugInfo] Make DIArgList inherit from Metadata and always unique (#72147)	Stephen Tozer	1	-5/+7
	This patch changes the `DIArgList` class's inheritance from `MDNode` to `Metadata, ReplaceableMetadataImpl`, and ensures that it is always unique, i.e. a distinct DIArgList should never be produced. This should not result in any changes to IR or bitcode parsing and printing, as the format for DIArgList is unchanged, and the order in which it appears should also be identical. As a minor note, this patch also fixes a gap in the verifier, where the ValueAsMetadata operands to a DIArgList would not be visited.
2023-11-09	[DebugInfo][RemoveDIs] Add conversion utilities for new-debug-info format	Jeremy Morse	1	-0/+19
	This patch plumbs the command line --experimental-debuginfo-iterators flag in to the pass managers, so that modules can be converted to the new format, passes run, then converted back to the old format. That allows developers to test-out the new debuginfo representation across some part of LLVM with no further work, and from the command line. It also installs flag-catchers at the various points that bitcode and textual IR can egress from a process, and temporarily convert the module to dbg.value format when doing so. No tests alas as it's designed to be transparent. Differential Revision: https://reviews.llvm.org/D154372
2023-11-09	[Coroutines] Introduce [[clang::coro_only_destroy_when_complete]] (#71014)	Chuanqi Xu	1	-0/+2
	Close https://github.com/llvm/llvm-project/issues/56980. This patch tries to introduce a light-weight optimization attribute for coroutines which are guaranteed to only be destroyed after it reached the final suspend. The rationale behind the patch is simple. See the example: ```C++ A foo() { dtor d; co_await something(); dtor d1; co_await something(); dtor d2; co_return 43; } ``` Generally the generated .destroy function may be: ```C++ void foo.destroy(foo.Frame frame) { switch(frame->suspend_index()) { case 1: frame->d.~dtor(); break; case 2: frame->d.~dtor(); frame->d1.~dtor(); break; case 3: frame->d.~dtor(); frame->d1.~dtor(); frame->d2.~dtor(); break; default: // coroutine completed or haven't started break; } frame->promise.~promise_type(); delete frame; } ``` Since the compiler need to be ready for all the cases that the coroutine may be destroyed in a valid state. However, from the user's perspective, we can understand that certain coroutine types may only be destroyed after it reached to the final suspend point. And we need a method to teach the compiler about this. Then this is the patch. After the compiler recognized that the coroutines can only be destroyed after complete, it can optimize the above example to: ```C++ void foo.destroy(foo.Frame frame) { frame->promise.~promise_type(); delete frame; } ``` I spent a lot of time experimenting and experiencing this in the downstream. The numbers are really good. In a real-world coroutine-heavy workload, the size of the build dir (including .o files) reduces 14%. And the size of final libraries (excluding the .o files) reduces 8% in Debug mode and 1% in Release mode.
2023-11-07	[NFC] Remove Type::getInt8PtrTy (#71029)	Paulo Matos	1	-1/+1
	Replace this with PointerType::getUnqual(). Followup to the opaque pointer transition. Fixes an in-code TODO item.
2023-11-01	[IR] Add writable attribute	Nikita Popov	1	-0/+2
	This adds a writable attribute, which in conjunction with dereferenceable(N) states that a spurious store of N bytes is introduced on function entry. This implies that this many bytes are writable without trapping or introducing data races. See https://llvm.org/docs/Atomics.html#optimization-outside-atomic for why the second point is important. This attribute can be added to sret arguments. I believe Rust will also be able to use it for by-value (moved) arguments. Rust likely won't be able to use it for &mut arguments (tree borrows does not appear to allow spurious stores). In this patch the new attribute is only used by LICM scalar promotion. However, the actual motivation for this is to fix a correctness issue in call slot optimization, which needs this attribute to avoid optimization regressions. Followup to the discussion on D157499. Differential Revision: https://reviews.llvm.org/D158081
2023-10-30	[IR] Add zext nneg flag (#67982)	Nikita Popov	1	-0/+22
	Add an nneg flag to the zext instruction, which specifies that the argument is non-negative. Otherwise, the result is a poison value. The primary use-case for the flag is to preserve information when sext gets replaced with zext due to range-based canonicalization. The nneg flag allows us to convert the zext back into an sext later. This is useful for some optimizations (e.g. a signed icmp can fold with sext but not zext), as well as some targets (e.g. RISCV prefers sext over zext). Discourse thread: https://discourse.llvm.org/t/rfc-add-zext-nneg-flag/73914 This patch is based on https://reviews.llvm.org/D156444 by @Panagiotis156, with some implementation simplifications and additional tests. --------- Co-authored-by: Panagiotis K <karouzakispan@gmail.com>
2023-10-18	[LLVM] Add new attribute `optdebug` to optimize for debugging (#66632)	Stephen Tozer	1	-0/+2
	This patch adds a new fn attribute, `optdebug`, that specifies that optimizations should make decisions that prioritize debug info quality, potentially at the cost of runtime performance. This patch does not add any functional changes triggered by this attribute, only the attribute itself. A subsequent patch will use this flag to disable the post-RA scheduler.
2023-09-30	[llvm] Remove uses of Type::getPointerTo() (NFC)	Youngsuk Kim	1	-1/+1
	* Remove if its sole use is to support an unnecessary ptr-to-ptr bitcast (remove the bitcast as well) * Replace with use of other APIs. NFC opaque pointer cleanup effort.
2023-09-01	[LTO] Remove module id from summary index	Teresa Johnson	1	-31/+46
	The module paths string table mapped to both an id sequentially assigned during LTO linking, and the module hash. The former is leftover from before the module hash was added for caching and subsequently replaced use of the module id when renaming promoted symbols (to avoid affects due to link order changes). The sequentially assigned module id was not removed, however, as it was still a convenience when serializing to/from bitcode and assembly. This patch removes the module id from this table, since it isn't strictly needed and can lead to confusion on when it is appropriate to use (e.g. see fix in D156525). It also takes a (likely not significant) amount of overhead. Where an integer module id is needed (e.g. bitcode writing), one is assigned on the fly. There are a couple of test changes since the paths are now sorted alphanumerically when assigning ids on the fly during assembly writing, in order to ensure deterministic behavior. Differential Revision: https://reviews.llvm.org/D156730
2023-07-14	[llvm] Remove uses of getNonOpaquePointerElementType() (NFC)	Nikita Popov	1	-12/+5

2023-07-05	[llvm] A Unified LTO Bitcode Frontend	Matthew Voss	1	-1/+4
	Here's a high level summary of the changes in this patch. For more information on rational, see the RFC. (https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774). - Add config parameter to LTO backend, specifying which LTO mode is desired when using unified LTO. - Add unified LTO flag to the summary index for efficiency. Unified LTO modules can be detected without parsing the module. - Make sure that the ModuleID is generated by incorporating more types of symbols. Differential Revision: https://reviews.llvm.org/D123803
2023-04-20	[ThinLTO] Remove BlockCount for non partial sample profile builds	Teresa Johnson	1	-4/+6
	As pointed out in https://discourse.llvm.org/t/undeterministic-thin-index-file/69985, the block count added to distributed ThinLTO index files breaks incremental builds on ThinLTO - if any linked file has a different number of BBs, then the accumulated sum placed in the index files will change, causing all ThinLTO backend compiles to be redone. The block count is only used for scaling of partial sample profiles, and was added in D80403 for D79831. This patch simply removes this field from the index files of non partial sample profile compiles, which is NFC on the output of the compiler. We subsequently need to see if this can be removed for partial sample profiles without signficant performance loss, or redesigned in a way that does not destroy caching. Differential Revision: https://reviews.llvm.org/D148746
2023-03-16	[ConstExpr] Remove select constant expression	Nikita Popov	1	-6/+0
	This removes the select constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. Uses of this expressions have already been removed in advance, so this just removes related infrastructure and updates tests. Differential Revision: https://reviews.llvm.org/D145382
2023-02-27	[Bitcode] Remove typed pointer abbreviation	Arthur Eubanks	1	-10/+1
	Since typed pointers are deprecated. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D144901
2023-02-24	IR: Add nofpclass parameter attribute	Matt Arsenault	1	-0/+2
	This carries a bitmask indicating forbidden floating-point value kinds in the argument or return value. This will enable interprocedural -ffinite-math-only optimizations. This is primarily to cover the no-nans and no-infinities cases, but also covers the other floating point classes for free. Textually, this provides a number of names corresponding to bits in FPClassTest, e.g. call nofpclass(nan inf) @must_be_finite() call nofpclass(snan) @cannot_be_snan() This is more expressive than the existing nnan and ninf fast math flags. As an added bonus, you can represent fun things like nanf: declare nofpclass(inf zero sub norm) float @only_nans() Compared to nnan/ninf: - Can be applied to individual call operands as well as the return value - Can distinguish signaling and quiet nans - Distinguishes the sign of infinities - Can be safely propagated since it doesn't imply anything about other operands. - Does not apply to FP instructions; it's not a flag This is one step closer to being able to retire "no-nans-fp-math" and "no-infs-fp-math". The one remaining situation where we have no way to represent no-nans/infs is for loads (if we wanted to solve this we could introduce !nofpclass metadata, following along with noundef/!noundef). This is to help simplify the GPU builtin math library distribution. Currently the library code has explicit finite math only checks, read from global constants the compiler driver needs to set based on the compiler flags during linking. We end up having to internalize the library into each translation unit in case different linked modules have different math flags. By propagating known-not-nan and known-not-infinity information, we can automatically prune the edge case handling in most functions if the function is only reached from fast math uses.
2023-02-07	[NFC][TargetParser] Remove llvm/ADT/Triple.h	Archibald Elliott	1	-1/+1
	I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.
2023-01-24	IR: Add atomicrmw uinc_wrap and udec_wrap	Matt Arsenault	1	-0/+4
	These are essentially add/sub 1 with a clamping value. AMDGPU has instructions for these. CUDA/HIP expose these as atomicInc/atomicDec. Currently we use target intrinsics for these, but those do no carry the ordering and syncscope. Add these to atomicrmw so we can carry these and benefit from the regular legalization processes.
2023-01-05	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part	serge-sans-paille	1	-2/+2
	Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955
2022-12-20	[IR] Add a target extension type to LLVM.	Joshua Cranmer	1	-0/+12
	Target-extension types represent types that need to be preserved through optimization, but otherwise are not introspectable by target-independent optimizations. This patch doesn't add any uses of these types by an existing backend, it only provides basic infrastructure such that these types would work correctly. Reviewed By: nikic, barannikov88 Differential Revision: https://reviews.llvm.org/D135202
2022-12-20	[Support] Move TargetParsers to new component	Archibald Elliott	1	-0/+1
	This is a fairly large changeset, but it can be broken into a few pieces: - `llvm/Support/TargetParser` are all moved from the LLVM Support component into a new LLVM Component called "TargetParser". This potentially enables using tablegen to maintain this information, as is shown in https://reviews.llvm.org/D137517. This cannot currently be done, as llvm-tblgen relies on LLVM's Support component. - This also moves two files from Support which use and depend on information in the TargetParser: - `llvm/Support/Host.{h,cpp}` which contains functions for inspecting the current Host machine for info about it, primarily to support getting the host triple, but also for `-mcpu=native` support in e.g. Clang. This is fairly tightly intertwined with the information in `X86TargetParser.h`, so keeping them in the same component makes sense. - `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains the target triple parser and representation. This is very intertwined with the Arm target parser, because the arm architecture version appears in canonical triples on arm platforms. - I moved the relevant unittests to their own directory. And so, we end up with a single component that has all the information about the following, which to me seems like a unified component: - Triples that LLVM Knows about - Architecture names and CPUs that LLVM knows about - CPU detection logic for LLVM Given this, I have also moved `RISCVISAInfo.h` into this component, as it seems to me to be part of that same set of functionality. If you get link errors in your components after this patch, you likely need to add TargetParser into LLVM_LINK_COMPONENTS in CMake. Differential Revision: https://reviews.llvm.org/D137838