rocket-tools/riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
11 hours	Reapply "[Coroutines] Add llvm.coro.is_in_ramp and drop return value of ↵	Weibo He	2	-7/+7
	llvm.coro.end (#155339)" (#159278) As mentioned in #151067, the current design of llvm.coro.end mixes two functionalities: querying where we are and lowering to some code. This patch separates these functionalities into independent intrinsics by introducing a new intrinsic, llvm.coro.is_in_ramp. Update a test in inline/ML, Reapply #155339
15 hours	[Remarks] YAMLRemarkSerializer: Fix StringRef out-of-bounds read (#159759)	Tobias Stadler	1	-0/+30
	YAML IO `mapRequired` expects a null-terminated `const char *` Key, so we can't legally pass a StringRef to it. We should add StringRef Key support to YAML IO, but for now just copy the key into a correctly null-terminated string. Pull Request: https://github.com/llvm/llvm-project/pull/159759
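The workaround described above (copy the key into a null-terminated string before handing it to an API that expects `const char *`) can be sketched outside LLVM. This is only an illustration: `std::string_view` stands in for `StringRef`, and `keyLength` and `safeKeyLength` are hypothetical helpers, not YAML IO functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for an API that expects a null-terminated key.
// It reports how many characters it reads before the terminator.
inline std::size_t keyLength(const char *Key) { return std::strlen(Key); }

// A view into the middle of a larger buffer has no terminator at its own
// end, so View.data() must not be passed to keyLength() directly.
inline std::size_t safeKeyLength(std::string_view View) {
  std::string Copy(View); // owning copy, always null-terminated
  return keyLength(Copy.c_str());
}
```

With a four-character view into the buffer `"PassName"`, `safeKeyLength` sees four characters, while passing `View.data()` directly would read on to the end of the underlying buffer (or past it, if the buffer had no terminator).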
18 hours	[SupportTests] Waive failures in ProgramEnvTest.TestExecuteEmptyEnvironment ↵	Martin Storsjö	1	-0/+7
	on MinGW (#160277) In MinGW build configurations, built executables can often end up depending on a DLL for libstdc++ or libc++. This DLL typically isn't installed system wide, but is either installed in the same directory as the executables, or found through PATH. If this dependency DLL has to be found through PATH, this test fails when attempting to execute the SupportTests executable with an empty environment. Waive the failure to execute the executable in this case.
20 hours	[DataLayout][LangRef] Split non-integral and unstable pointer properties	Alexander Richardson	1	-5/+135
	This commit adds finer-grained versions of isNonIntegralAddressSpace() and isNonIntegralPointerType() where the current semantics prohibit introduction of both ptrtoint and inttoptr instructions. The current semantics are too strict for some targets (e.g. AMDGPU/CHERI) where ptrtoint has a stable value, but the pointer has additional metadata. Currently, marking a pointer address space as non-integral also marks it as having an unstable bitwise representation (e.g. when pointers can be changed by a copying GC). This property inhibits a lot of optimizations that are perfectly legal for other non-integral pointers such as fat pointers or CHERI capabilities that have a well-defined bitwise representation but can't be created with only an address. This change splits the properties of non-integral pointers and allows for address spaces to be marked as unstable or non-integral (or both) independently using the 'p' part of the DataLayout string. A 'u' following the p marks the address space as unstable and specifying an index width != representation width marks it as non-integral. Finally, we also add an 'e' flag to mark pointers with external state (such as the CHERI capability validity). These pointers require special handling of loads and stores in addition to being non-integral. This does not change the checks in any of the passes yet - we currently keep the existing non-integral behaviour. In the future I plan to audit calls to DL.isNonIntegral[PointerType]() and replace them with the DL.mustNotIntroduce{IntToPtr,PtrToInt}() checks that allow for more optimizations. RFC: https://discourse.llvm.org/t/rfc-finer-grained-non-integral-pointer-properties/83176 Reviewed By: nikic, krzysz00 Pull Request: https://github.com/llvm/llvm-project/pull/105735
21 hours	Revert "[DebugInfo][DwarfDebug] Separate creation and population of abstract ↵	Vladislav Dzhidzhoev	3	-36/+16
	subprogram DIEs" (#160349) Reverts llvm/llvm-project#159104 due to the issues reported in https://github.com/llvm/llvm-project/issues/160197.
24 hours	[MCA] Use Bare Reference for InstrPostProcess (#160229)	Aiden Grossman	1	-0/+6
	This patch makes it so that InstrPostProcess::postProcessInstruction takes in a reference to a mca::Instruction rather than a reference to a std::unique_ptr. Without this, InstrPostProcess cannot be used with MCA instruction recycling because it needs to be called on both newly created instructions and instructions that have been recycled. We only have access to a raw pointer for instructions that have been recycled rather than a reference to the std::unique_ptr that owns them. This patch adds a call in the existing instruction recycling unit test to ensure the API remains compatible with this use case.
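The API argument above can be illustrated with a small standalone sketch. The `Instruction` struct and the `postProcess` and `processBoth` functions are hypothetical stand-ins, not the MCA classes; the point is only that a hook taking a plain reference works for both an owned object and one reachable only through a raw pointer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an instruction object.
struct Instruction {
  unsigned Latency = 1;
};

// Taking a plain reference keeps the hook independent of who owns the object.
inline void postProcess(Instruction &Inst) { Inst.Latency += 10; }

inline unsigned processBoth() {
  auto Fresh = std::make_unique<Instruction>(); // newly created, owned by unique_ptr
  Instruction Pooled;                           // recycled object living in a pool
  Instruction *Recycled = &Pooled;              // only a raw pointer is available
  postProcess(*Fresh);
  postProcess(*Recycled);
  return Fresh->Latency + Recycled->Latency;
}
```

A signature taking `std::unique_ptr<Instruction> &` could not accept the recycled object here, because no `unique_ptr` to it is at hand.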
32 hours	[RISCV] Add MC layer support for Andes XAndesVSIntH extension. (#159514)	Rux124	1	-0/+1
	Add MC layer support for Andes XAndesVSIntH extension. The spec is available at: https://github.com/andestech/andes-v5-isa/releases/tag/ast-v5_4_0-release
44 hours	[IR] Check identical alignment for atomic instructions (#155349)	Ellis Hoag	1	-0/+3
	I noticed that `hasSameSpecialState()` checks alignment for `load`/`store` instructions, but not for `cmpxchg` or `atomicrmw`, which I assume is a bug. It looks like alignment for these instructions was added in https://github.com/llvm/llvm-project/commit/74c723757e69fbe7d85e42527d07b728113699ae.
45 hours	[llvm][mustache] Pre-commit tests for Triple Mustache (#159182)	Paul Kirth	1	-0/+94
	Add XFAIL tests for Triple Mustache following the official spec. The tests pass by virtue of using EXPECT_NE, since GTEST doesn't support XFAIL.
47 hours	[Remarks] Restructure bitstream remarks to be fully standalone (#156715)	Tobias Stadler	5	-121/+131
	Currently there are two serialization modes for bitstream Remarks: standalone and separate. The separate mode splits remark metadata (e.g. the string table) from actual remark data. The metadata is written into the object file by the AsmPrinter, while the remark data is stored in a separate remarks file. This means we can't use bitstream remarks with tools like opt that don't generate an object file. Also, it is confusing to post-process bitstream remarks files, because only the standalone files can be read by llvm-remarkutil. We always need to use dsymutil to convert the separate files to standalone files, which only works for MachO. It is not possible for clang/opt to directly emit bitstream remark files in standalone mode, because the string table can only be serialized after all remarks were emitted. Therefore, this change completely removes the separate serialization mode. Instead, the remark string table is now always written to the end of the remarks file. This requires us to tell the serializer when to finalize remark serialization. This automatically happens when the serializer goes out of scope. However, often the remark file goes out of scope before the serializer is destroyed. To diagnose this, I have added an assert to alert users that they need to explicitly call finalizeLLVMOptimizationRemarks. This change paves the way for further improvements to the remark infrastructure, including more tooling (e.g. #159784), size optimizations for bitstream remarks, and more. Pull Request: https://github.com/llvm/llvm-project/pull/156715
2 days	[Support] Fix some warnings in LSP Transport (#160010)	Alexandre Ganea	1	-3/+3
	When building with the latest MSVC on Windows, this fixes some compile-time warnings from last week's integration in https://github.com/llvm/llvm-project/pull/157885:
```
[321/5941] Building CXX object lib\Support\LSP\CMakeFiles\LLVMSupportLSP.dir\Transport.cpp.obj
C:\git\llvm-project\llvm\lib\Support\LSP\Transport.cpp(123): warning C4930: 'std::lock_guard<std::mutex> responseHandlersLock(llvm::lsp::MessageHandler::ResponseHandlerTy)': prototyped function not called (was a variable definition intended?)
[384/5941] Building CXX object unittests\Support\LSP\CMakeFiles\LLVMSupportLSPTests.dir\Transport.cpp.obj
C:\git\llvm-project\llvm\unittests\Support\LSP\Transport.cpp(190): warning C4804: '+=': unsafe use of type 'bool' in operation
```
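The C4930 warning quoted above appears to be the classic case where a type name inside the parentheses turns an intended variable definition into a function declaration, so no lock is ever taken. A minimal sketch of that pitfall follows; `CountingMutex` and `lockOnce` are hypothetical names used only to make the effect observable, and the actual fix in the commit is not shown in this log.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical mutex that counts lock() calls so the effect is observable.
struct CountingMutex {
  int Locks = 0;
  void lock() { ++Locks; }
  void unlock() {}
};

inline int lockOnce() {
  CountingMutex Mu;
  {
    // Mistake: naming a type inside the parentheses declares a function
    // called Guard instead of defining a variable, so nothing is locked:
    //   std::lock_guard<CountingMutex> Guard(CountingMutex);
    // Passing an object defines a real guard that locks Mu for this scope.
    std::lock_guard<CountingMutex> Guard(Mu);
  }
  return Mu.Locks;
}
```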
2 days	[Driver][Hurd] Add AArch64 and RISCV64 support (#157212)	Brad Smith	1	-0/+12

5 days	Revert "[ELF][LLDB] Add an nvsass triple (#159459)" (#159879)	Joseph Huber	1	-1/+1
	Summary: This patch has broken the `libc` build bot. I could work around that but the changes seem unnecessary. This reverts commit 9ba844eb3a21d461c3adc7add7691a076c6992fc.
5 days	[MCA] Enable customization of individual instructions (#155420)	Roman Belenov	3	-9/+44
	Currently MCA takes instruction properties from the scheduling model. However, some instructions may execute differently depending on external factors - for example, latency of memory instructions may vary depending on whether the load comes from L1 cache, L2 or DRAM. While MCA as a static analysis tool cannot model such differences (and currently makes some static decisions, e.g. all memory ops are treated as L1 accesses), it makes sense to allow manual modification of instruction properties to model different behavior (e.g. sensitivity of code performance to cache misses in a particular load instruction). This patch addresses this need. The library modification is intentionally generic - arbitrary modifications to InstrDesc are allowed. The tool support is currently limited to changing instruction latencies (a single number applies to all output arguments and MaxLatency) via comments in the input assembler code; the format is like this: add (%eax), eax // LLVM-MCA-LATENCY:100 Users of the MCA library can already make additional customizations; the command line tool can be extended in the future. Note that InstructionView currently shows per-instruction information according to the scheduling model and is not affected by this change. See https://github.com/llvm/llvm-project/issues/133429 for additional clarifications (including an explanation of why existing customization mechanisms do not provide the required functionality) --------- Co-authored-by: Min-Yih Hsu <min@myhsu.dev>
5 days	[ELF][LLDB] Add an nvsass triple (#159459)	Walter Erquinigo	1	-1/+1
	When handling CUDA ELF files via objdump or LLDB, the ELF parser in LLVM needs to distinguish if an ELF file is sass or not, which requires a triple for sass to exist in llvm. This patch includes all the necessary changes for LLDB and objdump to correctly identify these files with the correct triple.
5 days	[LLVM][SCEV] Look through common vscale multiplicand when simplifying ↵	Paul Walker	1	-0/+137
	compares. (#141798) My usecase is simplifying the control flow generated by LoopVectorize when vectorising loops whose tripcount is a function of the runtime vector length. This can be problematic because: * CSE is a pre-LoopVectorize transform and so it's common for an IR function to include several calls to llvm.vscale(). (NOTE: Code generation will typically remove the duplicates) * Pre-LoopVectorize instcombines will rewrite some multiplies as shifts. This leads to a mismatch between VL based maths of the scalar loop and that created for the vector loop, which prevents some obvious simplifications. SCEV does not suffer these issues because it effectively does CSE during construction and shifts are represented as multiplies.
5 days	[llvm][test][CGPluginTest] Add back missing TargetParser dependency (#159760)	Raul Tambre	1	-0/+1
	Didn't seem to be used, but it is. [737/738] Linking CXX executable unittests/CodeGen/CGPluginTest/CGPluginTest FAILED: unittests/CodeGen/CGPluginTest/CGPluginTest : && /usr/bin/c++ -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-array-bounds -Wno-stringop-overread -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,--export-dynamic -Wl,--gc-sections unittests/CodeGen/CGPluginTest/CMakeFiles/CGPluginTest.dir/PluginTest.cpp.o unittests/CodeGen/CGPluginTest/CMakeFiles/CGPluginTest.dir/Plugin/CodeGenTestPass.cpp.o -o unittests/CodeGen/CGPluginTest/CGPluginTest -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/lib lib/libLLVMX86CodeGen.so.22.0git lib/libLLVMX86AsmParser.so.22.0git lib/libLLVMX86Desc.so.22.0git lib/libLLVMX86Disassembler.so.22.0git lib/libLLVMX86Info.so.22.0git lib/libLLVMAMDGPUCodeGen.so.22.0git lib/libLLVMAMDGPUAsmParser.so.22.0git lib/libLLVMAMDGPUDisassembler.so.22.0git lib/libllvm_gtest_main.so.22.0git lib/libLLVMTestingSupport.so.22.0git lib/libLLVMCodeGen.so.22.0git lib/libLLVMTarget.so.22.0git lib/libLLVMAMDGPUDesc.so.22.0git lib/libLLVMAMDGPUInfo.so.22.0git lib/libLLVMAMDGPUUtils.so.22.0git lib/libLLVMCore.so.22.0git lib/libLLVMMC.so.22.0git lib/libllvm_gtest.so.22.0git lib/libLLVMSupport.so.22.0git -Wl,-rpath-link,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/lib && : /usr/bin/ld: unittests/CodeGen/CGPluginTest/CMakeFiles/CGPluginTest.dir/PluginTest.cpp.o: undefined reference to symbol '_ZN4llvm6TripleC1ERKNS_5TwineES3_S3_' Fixes: 4e1c996674cc340f290b0a528e2038e76494d8d4
5 days	[llvm][test][CGPluginTest] Keep plugin in shared library directory	Raul Tambre	3	-24/+5
	Scoping to the root build directory instead of using the path directly is awkward and the only such occurrence in the test suite. It's also prone to breakage for downstreams that change the library path. But it's not even necessary: during build we have the appropriate RPATHs set so we can just depend on the dynamic loader to find it. This extra logic is probably just copy-paste from PluginsTest.cpp. Additionally: * Removed TargetParser as a dependency because it doesn't seem to actually be used. * Moved `add_dependencies()` to `DEPENDS` to better match the rest of LLVM.
5 days	[RISCV] Implement MC support for Zvfofp8min extension (#157014)	Jim Lin	1	-0/+1
	This patch adds MC support for Zvfofp8min https://github.com/aswaterman/riscv-misc/blob/main/isa/zvfofp8min.adoc.
6 days	[llvm][clang] Pass VFS to `llvm::cl` command line handling (#159174)	Jan Svoboda	1	-6/+6
	This PR passes the VFS down to `llvm::cl` functions so that they don't assume the real file system.
6 days	[test][CAS] Fix unused variable warning in unittest (#159594)	Steven Wu	1	-2/+2
	Fix unused variable warning blocking AIX bot.
6 days	[LV] Provide utility routine to find uncounted exit recipes (#152530)	Graham Hunter	3	-2/+105
	Splitting out just the recipe finding code from #148626 into a utility function (along with the extra pattern matchers). Hopefully this makes reviewing a bit easier. Added a gtest, since this isn't actually used anywhere yet.
7 days	[ADT] Fix llvm::concat_iterator for `ValueT == common_base_class *` (#144744)	Javier Lopez-Gomez	1	-0/+48
	Fix `llvm::concat_iterator` for the case of `ValueT` being a pointer to a common base class to which the result of dereferencing any iterator in `ItersT` can be cast.
7 days	[DebugInfo][DwarfDebug] Separate creation and population of abstract ↵	Vladislav Dzhidzhoev	3	-16/+36
	subprogram DIEs (#159104) With this change, construction of abstract subprogram DIEs is split in two stages/functions: creation of DIE (in DwarfCompileUnit::getOrCreateAbstractSubprogramDIE) and its population with children (in DwarfCompileUnit::constructAbstractSubprogramScopeDIE). With that, abstract subprograms can be created/referenced from DwarfDebug::beginModule, which should solve the issue with static local variables DIE creation of inlined functions with optimized-out definitions. It fixes https://github.com/llvm/llvm-project/issues/29985. LexicalScopes class now stores mapping from DISubprograms to their corresponding llvm::Function's. It is supposed to be built before processing of each function (so, now LexicalScopes class has a method for "module initialization" alongside the method for "function initialization"). It is used by DwarfCompileUnit to determine whether a DISubprogram needs an abstract DIE before DwarfDebug::beginFunction is invoked. DwarfCompileUnit::getOrCreateSubprogramDIE method is added, which can create an abstract or a concrete DIE for a subprogram. It accepts llvm::Function* argument to determine whether a concrete DIE must be created. This is a temporary fix for https://github.com/llvm/llvm-project/issues/29985. Ideally, it will be fixed by moving global variables and types emission to DwarfDebug::endModule (https://reviews.llvm.org/D144007, https://reviews.llvm.org/D144005). Some code proposed by Ellis Hoag <ellis.sparky.hoag@gmail.com> in https://github.com/llvm/llvm-project/pull/90523 was taken for this commit.
7 days	Revert "Reapply "[Coroutines] Add llvm.coro.is_in_ramp and drop return value ↵	Weibo He	1	-4/+4
	of llvm.coro.end #153404"" (#159236) Reverts llvm/llvm-project#155339 because of a CI failure.
7 days	Reapply "[Coroutines] Add llvm.coro.is_in_ramp and drop return value of ↵	Weibo He	1	-4/+4
	llvm.coro.end #153404" (#155339) As mentioned in #151067, the current design of llvm.coro.end mixes two functionalities: querying where we are and lowering to some code. This patch separates these functionalities into independent intrinsics by introducing a new intrinsic, llvm.coro.is_in_ramp.
8 days	Re-apply "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional ↵	Mingming Liu	2	-0/+81
	update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159161) This is a reland of https://github.com/llvm/llvm-project/pull/158460 Test failures are gone once I undo the changes in codegenprepare.
8 days	Revert "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional ↵	Mingming Liu	2	-81/+0
	update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159159) Reverts llvm/llvm-project#158460 due to buildbot failures
8 days	Fix ExecuteAndWait with empty environment on Windows (#158719)	Hiroshi Yamauchi	1	-0/+18
	CreateProcessW requires the environment block to always be double null-terminated, even with an empty environment. https://learn.microsoft.com/en-us/windows/win32/procthread/environment-variables The attached test fails this way without the fix. C:\Users\hiroshi\upstream\llvm-project\llvm\unittests\Support\ProgramTest.cpp(697): error: Value of: ExecutionFailed Actual: true Expected: false Couldn't execute program 'C:\Users\hiroshi\upstream\llvm-project\build\unittests\Support\SupportTests.exe': The parameter is incorrect. (0x57)
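The double-termination rule above can be sketched as a portable helper that only builds the block (it does not call CreateProcessW). `buildEnvBlock` is a hypothetical name, and the layout assumed here is the documented one: each variable is null-terminated and the block ends with one extra null, with the empty case still yielding two nulls as the commit message requires.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds an environment block: "VAR=value\0" for each entry, then a final
// "\0". An empty environment still produces two null characters.
inline std::vector<wchar_t>
buildEnvBlock(const std::vector<std::wstring> &Vars) {
  std::vector<wchar_t> Block;
  for (const std::wstring &Var : Vars) {
    Block.insert(Block.end(), Var.begin(), Var.end());
    Block.push_back(L'\0'); // terminator of this variable
  }
  if (Vars.empty())
    Block.push_back(L'\0'); // keep the empty block double null-terminated
  Block.push_back(L'\0');   // terminator of the whole block
  return Block;
}
```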
8 days	[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if ↵	Mingming Liu	2	-0/+81
	existing prefix is not equivalent to the new one. Returns whether prefix changed. (#158460) Before this change, `setSectionPrefix` overwrites the existing section prefix with the new one unconditionally. After this change, `setSectionPrefix` checks for equivalence, updates conditionally and returns whether an update happens. Update the existing callers to make use of the return value. [PR 155337](https://github.com/llvm/llvm-project/pull/155337/files#diff-cc0c67ac89807f4453f0cfea9164944a4650cd6873a468a0f907e7158818eae9) is a motivating use case where the 'update' semantic is needed.
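The "update only if different, and report whether anything changed" pattern described above can be sketched in isolation. `GlobalLike` is a hypothetical stand-in rather than the LLVM class, and how the real function treats an empty prefix is not modelled here.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for an object that carries a section prefix.
struct GlobalLike {
  std::optional<std::string> SectionPrefix;

  // Stores Prefix only when it differs from the current value.
  // Returns true when the stored prefix actually changed.
  bool setSectionPrefix(std::string_view Prefix) {
    if (SectionPrefix && *SectionPrefix == Prefix)
      return false;
    SectionPrefix = std::string(Prefix);
    return true;
  }
};
```

Callers can then use the return value to decide whether dependent state needs to be refreshed.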
8 days	[ADT] Fix an indexing bug in PackedVector (#158785)	Kazu Hirata	1	-0/+10
	PackedVector is like std::vector<int> except that we can store small elements (e.g. 2-bit elements) in a packed manner using a BitVector as the underlying storage. The problem is that for bit size 3 and beyond, the calculation of indices into the underlying BitVector is not correct. For example, around line 50, we see a "for" loop to retrieve an unsigned integer value: for (unsigned i = 0; i != BitNum-1; ++i) val = T(val | ((Bits[(Idx << (BitNum-1)) + i] ? 1UL : 0UL) << i)); Suppose that BitNum is 4 (that is, 4-bit item). Here is the mapping between the PackedVector index and the corresponding BitVector indices. Idx 0: 0, 1, 2, 3 Idx 1: 8, 9, 10, 11 Idx 2: 16, 17, 18, 19 That is, we use 4 bits out of every 8 bits. This is because the index calculation uses "<<". The index should really be Idx * BitNum + i. FWIW, all the methods in PackedVector consistently use the shift-based index calculation, so the user would never encounter a bug except possibly as excessive storage use. This patch fixes the index calculation. Now, in size(), I didn't want to do integer division: return Bits.size() / BitNum; so this patch adds a separate variable NumElements to keep track of the number of elements. The unit test checks for the expected size of the underlying BitVector.
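The two index formulas discussed above can be compared directly. The function names `shiftIndex` and `packedIndex` are hypothetical; the expressions themselves are the ones quoted in the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Index used before the fix: shift the element index by BitNum - 1.
constexpr std::size_t shiftIndex(std::size_t Idx, unsigned BitNum, unsigned I) {
  return (Idx << (BitNum - 1)) + I;
}

// Index the commit message says it should be: multiply by the element width.
constexpr std::size_t packedIndex(std::size_t Idx, unsigned BitNum, unsigned I) {
  return Idx * BitNum + I;
}
```

For `BitNum == 4`, element 1 starts at bit 8 with the shift-based formula but at bit 4 with the multiplication, matching the "4 bits out of every 8" observation. For `BitNum` of 1 or 2 the two formulas agree, which is why the problem only shows up from bit size 3 onwards.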
8 days	[ADT] Wrapper for `std::accumulate` accepting a `range`. (#158702)	Mircea Trofin	1	-0/+10

8 days	Add DebugSSAUpdater class to track debug value liveness (#135349)	Stephen Tozer	2	-0/+220
	This patch adds a class that uses SSA construction, with debug values as definitions, to determine whether and which debug values for a particular variable are live at each point in an IR function. This will be used by the IR reader of llvm-debuginfo-analyzer to compute variable ranges and coverage, although it may be applicable to other debug info IR analyses.
9 days	[Dwarf] Support heterogeneous DW_{OP,AT}s needed for AMDGPU CFI (#153883)	Scott Linder	1	-0/+6
	These are defined in the user range until standard versions of them get adopted into dwarf, which is expected in DWARF6. Some of these amount to reservations currently as no code to use them is included. It would be very helpful to get them committed to avoid conflicts necessitating encoding changes while we are in the process of upstreaming. --------- Co-authored-by: Juan Martinez Fernandez <juamarti@amd.com> Co-authored-by: Emma Pilkington <Emma.Pilkington@amd.com>
9 days	[VPlan] Match more GEP-like in m_GetElementPtr (#158019)	Ramkumar Ramachandra	1	-0/+24
	The m_GetElementPtr matcher is incorrect and incomplete. Fix it to match all possible GEPs to avoid misleading users. It currently just has one use, and the change is non-functional for that use.
9 days	[CAS] Add MappedFileRegionArena (#114099)	Steven Wu	2	-0/+244
	Add MappedFileRegionArena, which can serve as a file-system-backed persistent memory allocator. The allocator works like a BumpPtrAllocator and is designed to be thread safe and process safe. The implementation relies on the POSIX compliance of the file system and doesn't work on all file systems. If the file system supports lazy tail (doesn't allocate disk space if the tail of the large file is not used), the user has more flexibility to declare a larger capacity. The allocator works by using an atomically updated bump ptr at a location that can be customized by the user. The atomic pointer points to the next available space to allocate, and the allocator will resize/truncate to current usage once all clients have closed the allocator. Windows implementation contributed by: @hjyamauchi
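The atomically updated bump pointer described above can be illustrated with a small in-memory model. This sketch covers only the offset bookkeeping: the `BumpOffsets` class is hypothetical, there is no file mapping, no cross-process sharing, and no resize on close, all of which the real class is said to handle.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of a bump allocator over a fixed-capacity region.
class BumpOffsets {
  std::atomic<std::uint64_t> Next{0}; // next free offset in the region
  std::uint64_t Capacity;

public:
  explicit BumpOffsets(std::uint64_t Capacity) : Capacity(Capacity) {}

  // Reserves Size bytes and returns their starting offset, or nullopt when
  // the region is exhausted. Safe to call from several threads.
  std::optional<std::uint64_t> allocate(std::uint64_t Size) {
    std::uint64_t Old = Next.load();
    do {
      if (Size > Capacity - Old)
        return std::nullopt;
      // On failure compare_exchange_weak reloads Old, so the check reruns.
    } while (!Next.compare_exchange_weak(Old, Old + Size));
    return Old;
  }

  std::uint64_t used() const { return Next.load(); }
};
```

The value returned by `used()` corresponds to the "current usage" to which a file-backed arena could be truncated once all clients are done.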
9 days	Default DEBUG_TYPE to the current filename for logging (#158494)	Mehdi Amini	1	-23/+97
	This makes it optional to define a debug type and uses the current FileName instead. This both reduces the size of the prefix printed by LDBG() and makes it possible to pass a filename to `--debug-only` to filter on.
9 days	[GlobalISel] Remove GI known bits cache (#157352)	David Green	2	-2/+2
	There is a cache on the known bits computed by global-isel. It only works inside a single query to computeKnownBits, which limits its usefulness, and according to the tests can sometimes limit the effectiveness of known-bits queries. (Although some AMD tests look longer). Keeping the cache valid and clearing it at the correct times can also require being careful about the functions called inside known-bits queries. I measured compile-time of removing it and came up with:
```
7zip       2.06405E+11  2.06436E+11   0.015018992
Bullet     1.01298E+11  1.01186E+11  -0.110236169
ClamAV     57942466667  57848066667  -0.16292023
SPASS      45444466667  45402966667  -0.091320249
consumer   35432466667  35381233333  -0.144594317
kimwitu++  40858833333  40927933333   0.169118877
lencod     70022366667  69950633333  -0.102443457
mafft      38439900000  38413233333  -0.069372362
sqlite3    35822266667  35770033333  -0.145812474
tramp3d    82083133333  82045600000  -0.045726
Average                              -0.068828739
```
	The last column is % difference between with / without the cache. So in total it seems to be costing slightly more to keep the current known-bits cache than if it was removed. (Measured in instruction count, similar to llvm-compile-time-tracker). The hit rate wasn't terrible - higher than I expected. In the llvm-test-suite+external projects it was hit 4791030 times out of 91107008 queries, slightly more than 5%. Note that as globalisel increases in complexity, more known bits calls might be made and the numbers might shift. If that is the case it might be better to have a cache that works across calls, providing it doesn't make effectiveness worse.
9 days	[llvm] Use std::bool_constant (NFC) (#158520)	Kazu Hirata	1	-3/+2
	This patch replaces std::integral_constant<bool, ...> with std::bool_constant for brevity. Note that std::bool_constant was introduced as part of C++17. There are cases where we could replace EXPECT_EQ(false, ...) with EXPECT_FALSE(...), but I'm not doing that in this patch to avoid doing multiple things in one patch.
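The replacement described above is purely a spelling change, since `std::bool_constant<B>` is an alias for `std::integral_constant<bool, B>`. A minimal sketch, using hypothetical traits `FitsInWordOld` and `FitsInWordNew` that are not taken from the patch:

```cpp
#include <cassert>
#include <type_traits>

// Before: the longer spelling.
template <typename T>
struct FitsInWordOld : std::integral_constant<bool, (sizeof(T) <= 4)> {};

// After: the C++17 shorthand with identical meaning.
template <typename T>
struct FitsInWordNew : std::bool_constant<(sizeof(T) <= 4)> {};

// The two spellings name the same type.
static_assert(std::is_same_v<std::bool_constant<true>,
                             std::integral_constant<bool, true>>,
              "bool_constant is an alias of integral_constant<bool, ...>");
```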
9 days	[ADT] Handle uint8_t and uint16_t in countr_zero (#158518)	Kazu Hirata	1	-0/+8
	Without this patch, the uint8_t and uint16_t cases are sent to the fallback route. This patch fixes that by relaxing the "if" condition. While it's hard to test that the correct control path is taken within countr_zero, this patch adds a few tests just to verify the correctness on uint8_t and uint16_t inputs.
10 days	[ADT] Fix the initial size calculation of SmallDenseMap (#158458)	Kazu Hirata	1	-0/+69
	The initial size calculation of SmallDenseMap is strange in several ways: - SmallDenseMap(unsigned) seems to want to take the number of initial buckets as far as I can tell from the variable name NumInitBuckets. In contrast, DenseMap(unsigned) seems to want to take the number of initial entries as far as I can tell from the comment: /// Create a DenseMap with an optional \p InitialReserve that guarantee that /// this number of elements can be inserted in the map without grow() - SmallDenseMap(unsigned) uses llvm::bit_ceil to obtain a power of two. SmallDenseMap(I, E) uses NextPowerOf2 to obtain a power of two. - Presumably, the init() call is to ensure that we won't call grow() while populating the initial elements [I, E). However, NextPowerOf2(std::distance(I, E)) does not ensure that a rehash won't happen. For example, if the number of initial elements is 50, we need 128 buckets, but NextPowerOf2(std::distance(I, E)) would return 64. This patch fixes all these inconsistencies by teaching SmallDenseMap::init to call BaseT::getMinBucketToReserveForEntries just like DenseMap::init. With this patch, all constructors of SmallDenseMap are textually identical to their respective counterparts in DenseMap.
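The 50-element example above can be checked with a little arithmetic. The sketch below assumes that NextPowerOf2 returns the smallest power of two strictly greater than its argument and that the reserve helper sizes the table so that it stays below three-quarters full. Both helpers are local re-implementations written for illustration, not the LLVM functions themselves.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two strictly greater than A (assumed NextPowerOf2 behaviour).
constexpr std::uint64_t nextPowerOf2(std::uint64_t A) {
  A |= A >> 1;
  A |= A >> 2;
  A |= A >> 4;
  A |= A >> 8;
  A |= A >> 16;
  A |= A >> 32;
  return A + 1;
}

// Assumed sizing rule: enough buckets that NumEntries stays under a 3/4 load.
constexpr std::uint64_t minBucketsFor(std::uint64_t NumEntries) {
  if (NumEntries == 0)
    return 0;
  return nextPowerOf2(NumEntries * 4 / 3 + 1);
}
```

Under these assumptions, 50 entries in 64 buckets would exceed a three-quarters load (48 entries), so a grow would be triggered, whereas 128 buckets leave room; this matches the 64 versus 128 figures quoted in the commit message.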
12 days	[DirectX] Updating Root Signature YAML representation to use Enums instead ↵	joaosaffran	1	-21/+21
	of uint (#154827) This PR updates the Root Signature YAML to use enums. This change is required to remove the use of to_underlying from the DirectXContainer binary file. Closes: [#150676](https://github.com/llvm/llvm-project/issues/150676)
12 days	Introduce LDBG_OS() macro as a variant of LDBG() (#158277)	Mehdi Amini	1	-10/+119
	Also, improve LDBG() to accept debug type and level in any order, and add unit-tests for LDBG() and LDBG_OS(). LDBG_OS() is a macro that behaves like LDBG() but instead of directly using it to stream the output, it takes a callback function that will be called with a raw_ostream. This is a re-land with workarounds for older gcc and clang versions. Previous attempts in #157194 and #158260 Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
12 days	Revert "Introduce LDBG_OS() macro as a variant of LDBG() (#157194)" (#158264)	Mehdi Amini	1	-119/+10
	Reverts llvm/llvm-project#158260. The second attempt to land this fixed some bots but left others broken; it needs an extra iteration.
12 days	Introduce LDBG_OS() macro as a variant of LDBG() (#157194) (#158260)	Mehdi Amini	1	-10/+119
	Also, improve LDBG() to accept debug type and level in any order, and add unit-tests for LDBG() and LDBG_OS(). LDBG_OS() is a macro that behaves like LDBG() but instead of directly using it to stream the output, it takes a callback function that will be called with a raw_ostream. Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
12 days	[RISCV][MC] Add MC support of Zibi experimental extension (#127463)	Boyao Wang	1	-0/+1
	This adds the MC support of Zibi v0.1 experimental extension. References: * https://lf-riscv.atlassian.net/wiki/spaces/USXX/pages/599261201/Branch+with+Immediate+Zibi+Ratification+Plan * https://lf-riscv.atlassian.net/browse/RVS-3828 * https://github.com/riscv/zibi/releases/tag/v0.1.0
12 days	[Support] Deprecate one form of support::endian::write (NFC) (#156140)	Kazu Hirata	1	-7/+7
	We have two forms of write: template <typename value_type, std::size_t alignment = unaligned> inline void write(void *memory, value_type value, endianness endian) template <typename value_type, endianness endian, std::size_t alignment> inline void write(void *memory, value_type value) The difference is that endian is a function parameter in the former but a template parameter in the latter. This patch streamlines the code by migrating the use of the latter to the former while deprecating the latter. I'm planning to do the same for byte_swap and read in follow-up patches to keep this patch simple and small.
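The difference between the two forms above (byte order as a function argument versus as a template argument) can be sketched without LLVM. `Endianness` and `writeValue` are hypothetical names, the alignment parameter is omitted, and the helper only handles unsigned integer types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class Endianness { Big, Little }; // hypothetical stand-in

// Form being kept: the byte order is an ordinary function argument.
template <typename T>
inline void writeValue(void *Memory, T Value, Endianness Endian) {
  auto *Bytes = static_cast<unsigned char *>(Memory);
  for (std::size_t I = 0; I != sizeof(T); ++I) {
    // Little endian stores the least significant byte first.
    std::size_t Shift =
        8 * (Endian == Endianness::Little ? I : sizeof(T) - 1 - I);
    Bytes[I] = static_cast<unsigned char>(Value >> Shift);
  }
}

// Form being deprecated: the byte order is a template argument. It can be
// expressed as a thin wrapper over the function-argument form.
template <typename T, Endianness Endian>
inline void writeValue(void *Memory, T Value) {
  writeValue<T>(Memory, Value, Endian);
}
```

Because the second form only forwards to the first, call sites can be migrated mechanically by moving the byte order from the template argument list to the end of the function argument list.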
13 days	[DirectX] Removing dxbc StaticSampler from mcbxdc (#154631)	joaosaffran	1	-2/+2
	MC Static Samplers Representation currently depends on Object structures. This PR removes that dependency in order to facilitate removing to_underlying usage in follow-up PRs.
13 days	[LLVM][Coverage][Unittest] Fix dangling reference in unittest (#147118)	Tomohiro Kashiwada	1	-9/+9
	In the loop of `writeAndReadCoverageRegions`, `OutputFunctions[I].Filenames` refers to the contents of `Filenames` after returning from `readCoverageRegions`, but `Filenames` will be cleared in the next call of `readCoverageRegions`, causing a dangling reference. The lifetime of the contents of `Filenames` must be equal to or longer than that of `OutputFunctions[I]`, thus it has been moved into `OutputFunctions[I]` (typed `OutputFunctionCoverageData`).
13 days	[PGO] Add llvm.loop.estimated_trip_count metadata (#152775)	Joel E. Denny	1	-0/+53
	This patch implements the `llvm.loop.estimated_trip_count` metadata discussed in [[RFC] Fix Loop Transformations to Preserve Block Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785). As the RFC explains, that metadata enables future patches, such as PR #128785, to fix block frequency issues without losing estimated trip counts.