rocket-tools/riscv-gnu-toolchain/llvm.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
2024-02-22	[NVPTX] fixup support for unaligned parameters and returns (#82562)	Alex MacLean	5	-39/+730
	Add support for unaligned parameters and return values. These must be loaded and stored one byte at a time and then bit manipulation is used to assemble the correct final result.
2024-02-22	[RISCV] Vector sub (zext, zext) -> sext (sub (zext, zext)) (#82455)	Philip Reames	2	-17/+40
	This is legal as long as the inner zext retains at least one bit of increase so that the sub overflow case (0 - UINT_MAX) can be represented. Alive2 proof: https://alive2.llvm.org/ce/z/BKeV3W For RVV, restrict this to power of two sizes with the operation type being at least e8 to stick to legal extends. We could arguably handle i1 source types with some care if we wanted to. This is likely profitable because it may allow us to perform the sub instruction in a narrow LMUL (equivalently, in fewer DLEN-sized pieces) before widening for the user. We could arguably avoid narrowing below DLEN, but the transform should at worst introduce one extra extend and one extra vsetvli toggle if the source could previously be handled via loads explicit w/EEW.
2024-02-22	[AMDGPU][NFC] Refactor SIInsertWaitcnts zero waitcnt generation (#82575)	vangthao95	2	-15/+22
	Move the allZero* waitcnt generation methods into WaitcntGenerator class.
2024-02-22	[gn build] Port aaf2d078b622	LLVM GN Syncbot	1	-0/+1

2024-02-22	[Hexagon] Clean up redundant transfer instructions. (#82663)	Sumanth Gundapaneni	7	-7/+366
	This patch adds a Hexagon specific backend pass that cleans up redundant transfers after register allocation.
2024-02-23	[mlir][test] Fix -Wunused-variable in PassBuilderCallbacksTest.cpp (NFC)	Jie Fu	1	-1/+1
	llvm-project/llvm/unittests/MIR/PassBuilderCallbacksTest.cpp:333:10: error: unused variable 'Ret' [-Werror,-Wunused-variable] bool Ret = MIR->parseMachineFunctions(*Mod, MMI); ^ 1 error generated.
2024-02-22	[CodeGen][MIR][UnitTests] Fix shared build. NFC	Michael Liao	1	-0/+1

2024-02-22	[LSR] Clear SCEVExpander before calling DeleteDeadPHIs	Nikita Popov	2	-3/+44
	To avoid an assertion failure when an AssertingVH is removed, as reported in: https://github.com/llvm/llvm-project/pull/82362#issuecomment-1960067147 Also remove an unnecessary use of SCEVExpanderCleaner.
2024-02-22	[libc] Rework the GPU build to be a regular target (#81921)	Joseph Huber	3	-4/+18
	Summary: This is a massive patch because it reworks the entire build and everything that depends on it. This is not split up because various bots would fail otherwise. I will attempt to describe the necessary changes here. This patch completely reworks how the GPU build is built and targeted. Previously, we used a standard runtimes build and handled both NVPTX and AMDGPU in a single build via multi-targeting. This added a lot of divergence in the build system and prevented us from doing various things like building for the CPU / GPU at the same time, or exporting the startup libraries or running tests without a full rebuild. The new appraoch is to handle the GPU builds as strict cross-compiling runtimes. The first step required https://github.com/llvm/llvm-project/pull/81557 to allow the `LIBC` target to build for the GPU without touching the other targets. This means that the GPU uses all the same handling as the other builds in `libc`. The new expected way to build the GPU libc is with `LLVM_LIBC_RUNTIME_TARGETS=amdgcn-amd-amdhsa;nvptx64-nvidia-cuda`. The second step was reworking how we generated the embedded GPU library by moving it into the library install step. Where we previously had one `libcgpu.a` we now have `libcgpu-amdgpu.a` and `libcgpu-nvptx.a`. This patch includes the necessary clang / OpenMP changes to make that not break the bots when this lands. We unfortunately still require that the NVPTX target has an `internal` target for tests. This is because the NVPTX target needs to do LTO for the provided version (The offloading toolchain can handle it) but cannot use it for the native toolchain which is used for making tests. This approach is vastly superior in every way, allowing us to treat the GPU as a standard cross-compiling target. We can now install the GPU utilities to do things like use the offload tests and other fun things. Some certain utilities need to be built with `--target=${LLVM_HOST_TRIPLE}` as well. I think this is a fine workaround as we will always assume that the GPU `libc` is a cross-build with a functioning host. Depends on https://github.com/llvm/llvm-project/pull/81557
2024-02-22	[gn build] Port df6f756a1927	LLVM GN Syncbot	1	-0/+1

2024-02-22	Extend GCC workaround to GCC < 8.4 for llvm::iterator_range ctor (#82643)	Thomas Preud'homme	1	-2/+2
	GCC SFINAE error with decltype was fixed in commit ac5e28911abdfb8d9bf6bea980223e199bbcf28d which made it into GCC 8.4. Therefore adjust GCC version test accordingly.
2024-02-22	[NewPM/CodeGen] Rewrite pass manager nesting (#81068)	Arthur Eubanks	13	-628/+681
	Currently the new PM infra for codegen puts everything into a MachineFunctionPassManager. The MachineFunctionPassManager owns both Module passes and MachineFunction passes, and batches adjacent MachineFunction passes like a typical PassManager. The current MachineFunctionAnalysisManager also directly references a module and function analysis manager to get results. The initial argument was that the codegen pipeline is relatively "flat", meaning it's mostly machine function passes with a couple of module passes here and there. However, there are a couple of issues with this as compared to a more structured nesting more like the optimization pipeline. For example, it doesn't allow running function passes then machine function passes on a function and its machine function all at once. It also currently requires the caller to split out the IR passes into one pass manager and the MIR passes into another pass manager. This patch rewrites the new pass manager infra for the codegen pipeline to be more similar to the nesting in the optimization pipeline. Basically, a Function contains a MachineFunction. So we can have Module -> Function -> MachineFunction adaptors. It also rewrites the analysis managers to have inner/outer proxies like the ones in the optimization pipeline. The new pass managers/adaptors/analysis managers can be seen in use in PassManagerTest.cpp. This allows us to consolidate to just having to add to one ModulePassManager when using the codegen pipeline. I haven't added the Function -> MachineFunction adaptor in this patch, but it should be added when we merge AddIRPass/AddMachinePass so that we can run IR and MIR passes on a function before proceeding to the next function. The MachineFunctionProperties infra for MIR verification is still WIP.
2024-02-22	LoopVectorize/test: guard pr72969 with asserts (#82653)	Ramkumar Ramachandra	1	-0/+1
	Follow up on 695a9d8 (LoopVectorize: add test for crash in #72969) to guard pr72969.ll with REQUIRES: asserts, in order to be reasonably confident that it will crash reliably.
2024-02-23	[InstCombine] Add support for cast instructions in `getFreelyInvertedImpl` ↵	Yingwei Zheng	2	-0/+103
	(#82451) This patch adds support for cast instructions in `getFreelyInvertedImpl` to enable more optimizations. Alive2: https://alive2.llvm.org/ce/z/F6maEE
2024-02-22	[SLP]Improve findReusedOrderedScalars and graph rotation.	Alexey Bataev	13	-175/+447
	Patch syncs the code in findReusedOrderedScalars with cost estimation/codegen. It tries to use similar logic to better determine best order. Before, it just tried to find previously vectorized node without checking if it is possible to use the vectorized value in the shuffle. Now it relies on the more generalized version. If it determines, that a single vector must be reordered (using same mechanism, as codegen and cost estimation), it generates better order. The comparison between new/ref ordering: Metric: SLP.NumVectorInstructions Program SLP.NumVectorInstructions results results0 diff test-suite :: MultiSource/Benchmarks/nbench/nbench.test 139.00 140.00 0.7% test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test 344.00 346.00 0.6% test-suite :: MultiSource/Benchmarks/FreeBench/pifft/pifft.test 1293.00 1292.00 -0.1% test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test 5176.00 5170.00 -0.1% test-suite :: External/SPEC/CFP2006/453.povray/453.povray.test 5173.00 5167.00 -0.1% test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test 11692.00 11660.00 -0.3% test-suite :: External/SPEC/CINT2006/464.h264ref/464.h264ref.test 1621.00 1615.00 -0.4% test-suite :: External/SPEC/CINT2006/403.gcc/403.gcc.test 795.00 792.00 -0.4% test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 26499.00 26338.00 -0.6% test-suite :: MultiSource/Benchmarks/Bullet/bullet.test 7343.00 7281.00 -0.8% test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test 1104.00 1094.00 -0.9% test-suite :: MultiSource/Applications/JM/lencod/lencod.test 2216.00 2180.00 -1.6% test-suite :: External/SPEC/CFP2006/433.milc/433.milc.test 787.00 637.00 -19.1% Less 0% is better. Most of the benchmarks see more vectorized code. The first ones just have shuffles removed. The ordering analysis still may require some improvements (e.g. for alternate nodes), but this one should be produce better results. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/77529
2024-02-22	LoopVectorize: Mark crash test as requiring assertions	Benjamin Kramer	1	-0/+1

2024-02-22	[AArch64][CodeGen] Fix crash when fptrunc returns fp16 with +nofp attr (#81724)	Nashe Mncube	3	-22/+138
	When performing lowering of the fptrunc opcode returning fp16 with the +nofp flag enabled we could trigger a compiler crash. This is because we had no custom lowering implemented. This patch the case in which we need to promote an fp16 return type for fptrunc when the +nofp attr is enabled.
2024-02-22	Enable JumpTableToSwitch pass by default (#82546)	Alexander Shaposhnikov	8	-6/+8
	Enable JumpTableToSwitch pass by default. Test plan: ninja check-all
2024-02-23	[CVP] Canonicalize signed minmax into unsigned (#82478)	Yingwei Zheng	2	-12/+86
	This patch turns signed minmax to unsigned to match the behavior for signed icmps. Alive2: https://alive2.llvm.org/ce/z/UAAM42
2024-02-22	GSym aggregated output to JSON file (#81763)	Kevin Frei	2	-0/+32
	In order to make tooling around dwarf health easier, I've added an `--json-summary-file` option to `llvm-gsymutil` that will spit out error summary data with counts to a JSON file. I've added the same capability to `llvm-dwarfdump` in a [different PR.](https://github.com/llvm/llvm-project/pull/81762) The format of the json is: ```JSON { "error-categories": { "<first category description>": {"count": 1234}, "<next category description>": {"count":4321} }, "error-count": 5555 } ``` for a clean run: ```JSON { "error-categories": {}, "error-count": 0 } ``` --------- Co-authored-by: Kevin Frei <freik@meta.com>
2024-02-22	[lldb][llvm] Return an error instead of crashing when parsing a line table ↵	Greg Clayton	2	-4/+22
	prologue. (#80769) We recently ran into some bad DWARF where the `DW_AT_stmt_list` of many compile units was randomly set to invalid values and was causing LLDB to crash due to an assertion about address sizes not matching. Instead of asserting, we should return an appropriate recoverable `llvm::Error`.
2024-02-22	[DirectX][NFC] Use LLVM Types in DXIL Operation specifications in DXIL.td ↵	S. Bharadwaj Yadavalli	2	-90/+69
	(#81692) This change uniformly uses LLVM Types in the specification of parameter types and overload types of DXIL operation. Updated (a) parameter types accordingly in the specification of existing DXILOperations and (b) DXILEmitter.
2024-02-23	[LTO] Remove Config.UseDefaultPipeline (#82587)	Igor Kudrin	2	-5/+0
	This option is not used. It was added in [D122133](https://reviews.llvm.org/D122133), 5856f30b, with the only usage in `ClangLinkerWrapper.cpp`, which was later updated in a1d57fc2, and then finally removed in [D142650](https://reviews.llvm.org/D142650), 6185246f.
2024-02-22	[HCS] Fix unused variable warnings. NFCI.	Benjamin Kramer	1	-4/+5

2024-02-23	[Loads] Fix crash in isSafeToLoadUnconditionally with scalable accessed type ↵	Luke Lau	2	-3/+22
	(#82650) This fixes #82606 by updating isSafeToLoadUnconditionally to handle fixed sized loads from a scalable accessed type.
2024-02-22	[HEXAGON] Fix bit boundary for isub_hi in HexagonBitSimplify (#82336)	yandalur	2	-1/+23
	Use bit boundary of 32 for high subregisters in HexagonBitSimplify. This fixes the subregister used in an upper half register store.
2024-02-22	[CodeGen] Clean up MachineFunctionSplitter MBB safety checking (NFC)	Daniel Hoekwater	1	-8/+3
	Move the "is MBB safe to split" check out of `isColdBlock` and update the comment since we're no longer using a temporary hack.
2024-02-22	[llvm-readobj,ELF] Support --decompress/-z (#82594)	Fangrui Song	10	-6/+209
	When a section has the SHF_COMPRESSED flag, -p/-x dump the compressed content by default. In GNU readelf, if --decompress/-z is specified, -p/-x will dump the decompressed content. This patch implements the option. Close #82507
2024-02-23	[ConstraintElim] Decompose sext-like insts for signed predicates (#82344)	Yingwei Zheng	3	-35/+71
	Alive2: https://alive2.llvm.org/ce/z/A8dtGp Fixes #82271.
2024-02-22	[RISCV] Enable -riscv-enable-sink-fold by default. (#82026)	Craig Topper	6	-19/+19
	AArch64 has had it enabled since late November, so hopefully the main issues have been resolved. I see a small reduction in dynamic instruction count on every benchmark in specint2017. The best improvement was 0.3% so nothing amazing.
2024-02-22	[DAGCombiner][RISCV] CSE zext nneg and sext. (#82597)	Craig Topper	2	-46/+30
	If we have a sext and a zext nneg with the same types and operand we should combine them into the sext. We can't go the other way because the nneg flag may only be valid in the context of the uses of the zext nneg.
2024-02-22	[HCS] Externd to outline overlapping sub/super cold regions (#80732)	Shimin Cui	7	-82/+212
	Currently, with hot cold splitting, when a cold region is identified, it is added to the region list of ColdBlocks. Then when another cold region (B) identified overlaps with a ColdBlocks region (A) already added to the list, the region B is not added to the list because of the overlapping with region A. The splitting analysis is performed, and the region A may not get split, for example, if it’s considered too expansive. This is to improve the handling the overlapping case when the region A is not considered good for splitting, while the region B is good for splitting. The change is to move the cold region splitting analysis earlier to allow more cold region splitting. If an identified region cannot be split, it will not be added to the candidate list of ColdBlocks for overlapping check.
2024-02-22	[RISCV] Add test case showing missed opportunity to form sextload when sext ↵	Craig Topper	1	-0/+84
	and zext nneg are both present. NFC
2024-02-22	[RISCV][LV] Add additional small trip count loop coverage	Philip Reames	1	-2/+366

2024-02-22	[LLVM][DebugInfo] Refactor some code for easier sharing. (#82153)	cmtice	2	-34/+65
	Refactor the code that calculates the offsets for the various pieces of the DWARF .debug_names index section, to make it easier to share the code with other tools, such as LLD.
2024-02-23	[RISCV][SDAG] Improve codegen of select with constants if zicond is ↵	Yingwei Zheng	2	-10/+262
	available (#82456) This patch uses `add + czero.eqz/nez` to lower select with constants if zicond is available. ``` (select c, c1, c2) -> (add (czero_nez c2 - c1, c), c1) (select c, c1, c2) -> (add (czero_eqz c1 - c2, c), c2) ``` The above code sequence is suggested by [RISCV Optimization Guide](https://riscv-optimization-guide-riseproject-c94355ae3e6872252baa952524.gitlab.io/riscv-optimization-guide.html#_avoid_branches_using_conditional_moves).
2024-02-22	[RISCV][AArch64] Add vscale_range attribute to tests per architecture minimums	Philip Reames	2	-51/+41
	Spent a bunch of time tracing down an odd issue "in SCEV" which turned out to be the fact that SCEV doesn't have access to TTI. As a result, the only way for it to get range facts on vscales (to avoid collapsing ranges of element counts and type sizes to trivial ranges on multiplies) is to look at the vscale_range attribute. Since vscale_range is set by clang by default, manually setting it in the tests shouldn't interfere with the test intent.
2024-02-22	LoopVectorize: add test for crash in #72969 (#74111)	Ramkumar Ramachandra	1	-0/+25

2024-02-22	[RemoveDIs][NFC] Add DPLabel class [2/3] (#82376)	Orlando Cazalet-Hyams	3	-13/+85
	Patch 2 of 3 to add llvm.dbg.label support to the RemoveDIs project. The patch stack adds the DPLabel class, which is the RemoveDIs llvm.dbg.label equivalent. 1. Add DbgRecord base class for DPValue and the not-yet-added DPLabel class. -> 2. Add the DPLabel class. 3. Enable dbg.label conversion and add support to passes. This will be used (and tested) in the final patch(es), coming next.
2024-02-22	[Flang][LLVM][OpenMP] Relax target data restrictions to be more inline with ↵	agozillon	1	-5/+3
	the specification (#82537) Currently we emit errors whenever a map is not provided on a target data directive, however, I believe that's incorrect behavior, the specification states: "At least one map, use_device_addr or use_device_ptr clause must appear on the directive" So provided one is present, the directive is legal in this case. Slightly different to its siblings (enter/exit/update) which don't have use_device_addr/use_device_ptr.
2024-02-22	[build] Check RUNTIMES_${target}_LLVM_ENABLE_RUNTIMES for libc also (#82561)	Petr Hosek	1	-1/+13
	When checking whether we need to build libc-hdrgen, we need to check LLVM_ENABLE_RUNTIMES and RUNTIMES_${target}_LLVM_ENABLE_RUNTIMES, just the former is not sufficient since libc may be enabled only for certain targets.
2024-02-22	[InstCombine] Pick bfloat over half when shrinking ops that started with an ↵	Benjamin Kramer	3	-9/+26
	fpext from bfloat (#82493) This fixes the case where we would shrink an frem to half and then bitcast to bfloat, producing invalid results. The transformation was written under the assumption that there is only one type with a given bit width. Also add a strategic assert to CastInst::CreateFPCast to turn this miscompilation into a crash.
2024-02-22	[OpenMP][FIX] Remove unsound omp_get_thread_limit deduplication (#79524)	Matt	2	-1/+59
	The deduplication of the calls to `omp_get_thread_limit` used to be legal when originally added in <https://github.com/llvm/llvm-project/commit/e28936f6137c5a9c4f7673e248c192a9811543b6#diff-de101c82aff66b2bda2d1f53fde3dde7b0d370f14f1ff37b7919ce38531230dfR123>, as the result (thread_limit) was immutable. However, now that we have `thread_limit` clause, we no longer have immutability; therefore `omp_get_thread_limit()` is not a deduplicable runtime call. Thus, removing `omp_get_thread_limit` from the `DeduplicableRuntimeCallIDs` array. Here's a simple example: ``` #include <omp.h> #include <stdio.h> int main() { #pragma omp target thread_limit(4) { printf("\n1:target thread_limit: %d\n", omp_get_thread_limit()); } #pragma omp target thread_limit(3) { printf("\n2:target thread_limit: %d\n", omp_get_thread_limit()); } return 0; } ``` GCC-compiled binary execution: https://gcc.godbolt.org/z/Pjv3TWoTq ``` 1:target thread_limit: 4 2:target thread_limit: 3 ``` Clang/LLVM-compiled binary execution: https://clang.godbolt.org/z/zdPbrdMPn ``` 1:target thread_limit: 4 2:target thread_limit: 4 ``` By my reading of the OpenMP spec GCC does the right thing here; cf. <https://www.openmp.org/spec-html/5.2/openmpse12.html#x34-330002.4>: > If a target construct with a thread_limit clause is encountered, the thread-limit-var ICV from the data environment of the generated initial task is instead set to an implementation deﬁned value between one and the value speciﬁed in the clause. The common subexpression elimination (CSE) of the second call to `omp_get_thread_limit` by LLVM does not seem to be correct, as it's not an available expression at any program point(s) (in the scope of the clause in question) after the second target construct with a `thread_limit` clause is encountered. Compiling with `-Rpass=openmp-opt -Rpass-analysis=openmp-opt -Rpass-missed=openmp-opt` we have: https://clang.godbolt.org/z/G7dfhP7jh ``` <source>:8:42: remark: OpenMP runtime call omp_get_thread_limit deduplicated. [OMP170] [-Rpass=openmp-opt] 8 \| printf("\n1:target thread_limit: %d\n",omp_get_thread_limit()); \| ^ ``` OMP170 has the following explanation: https://openmp.llvm.org/remarks/OMP170.html > This optimization remark indicates that a call to an OpenMP runtime call was replaced with the result of an existing one. This occurs when the compiler knows that the result of a runtime call is immutable. Removing duplicate calls is done by replacing all calls to that function with the result of the first call. This cannot be done automatically by the compiler because the implementations of the OpenMP runtime calls live in a separate library the compiler cannot see. This optimization will trigger for known OpenMP runtime calls whose return value will not change. At the same time I do not believe we have an analysis checking whether this precondition holds here: "This occurs when the compiler knows that the result of a runtime call is immutable." AFAICT, such analysis doesn't appear to exist in the original patch introducing deduplication, either: - https://github.com/llvm/llvm-project/commit/9548b74a831ea005649465797f359e0521f3b8a9 - https://reviews.llvm.org/D69930 The fix is to remove it from `DeduplicableRuntimeCallIDs`, effectively reverting the addition in this commit (noting that `omp_get_max_threads` is not present in `DeduplicableRuntimeCallIDs`, so it's possible this addition was incorrect in the first place): - [OpenMP][Opt] Annotate known runtime functions and deduplicate more, - https://github.com/llvm/llvm-project/commit/e28936f6137c5a9c4f7673e248c192a9811543b6#diff-de101c82aff66b2bda2d1f53fde3dde7b0d370f14f1ff37b7919ce38531230dfR123 As a result, we're no longer unsoundly deduplicating the OpenMP runtime call `omp_get_thread_limit` as illustrated by the test case: Note the (correctly) repeated `call i32 @omp_get_thread_limit()`. --------- Co-authored-by: Joseph Huber <huberjn@outlook.com>
2024-02-22	[LLVM][IR] Add native vector support to ConstantInt & ConstantFP. (#74502)	Paul Walker	8	-39/+243
	NOTE: For brevity the following talks about ConstantInt but everything extends to cover ConstantFP as well. Whilst ConstantInt::get() supports the creation of vectors whereby each lane has the same value, it achieves this via other constants: * ConstantVector for fixed-length vectors * ConstantExprs for scalable vectors However, ConstantExprs are being deprecated and ConstantVector is not space efficient for larger vector types. By extending ConstantInt we can represent vector splats by only storing the underlying scalar value. More specifically: * ConstantInt gains an ElementCount variant of get(). * LLVMContext is extended to map <EC,APInt>->ConstantInt. * BitcodeReader/Writer support is extended to allow vector types. Whilst this patch adds the base support, more work is required before it's production ready. For example, there's likely to be many places where isa<ConstantInt> assumes a scalar type. Accordingly the default behaviour of ConstantInt::get() remains unchanged but a set of flags are added to allow wider testing and thus help with the migration: --use-constant-int-for-fixed-length-splat --use-constant-fp-for-fixed-length-splat --use-constant-int-for-scalable-splat --use-constant-fp-for-scalable-splat NOTE: No change is required to the bitcode format because types and values are handled separately. NOTE: For similar reasons as above, code generation doesn't work out-the-box.
2024-02-22	[AIX] Lower intrinsic __builtin_cpu_is into AIX platform-specific code. (#80069)	zhijian lin	1	-0/+57
	On AIX OS, __builtin_cpu_is() references the runtime external variable _system_configuration from /usr/include/sys/systemcfg.h. ref issue: https://github.com/llvm/llvm-project/issues/80042
2024-02-22	[LangRef] Document string literals in LLVM's format (#82529)	Ian Hickson	1	-1/+32

2024-02-22	[InferAddrSpaces] Correctly replace identical operands of insts (#82610)	Pierre van Houtryve	2	-5/+77
	It's important for PHI nodes because if a PHI node has multiple edges coming from the same block, we can have the same incoming value multiple times in the list of incoming values. All of those need to be consistent (exact same Value*) otherwise verifier complains. Fixes SWDEV-445797
2024-02-22	[CVP] Refactor `processMinMaxIntrinsic` to check non-strict predicate in ↵	Yingwei Zheng	2	-13/+76
	both directions (#82596) This patch uses `getConstantRangeAtUse` in `processMinMaxIntrinsic` to address the comment https://github.com/llvm/llvm-project/pull/82478#discussion_r1497300920. After this patch we can reuse the range result in https://github.com/llvm/llvm-project/pull/82478.
2024-02-22	[GlobalISel] Constant-fold G_PTR_ADD with different type sizes (#81473)	Pierre van Houtryve	2	-1/+44
	All other opcodes in the list are constrained to have the same type on both operands, but not G_PTR_ADD. Fixes #81464
2024-02-22	[AArch64] Remove unused ReverseCSRRestoreSeq option. (#82326)	Sander de Smalen	2	-146/+21
	This patch removes the `-reverse-csr-restore-seq` option from AArch64FrameLowering, since this is no longer used. This patch was reverted because of a crash in PR#79623. Merging it back as it was fixed in PR#82492.