path: root/llvm/lib/CodeGen/AtomicExpandPass.cpp
Age | Commit message | Author | Files | Lines
2022-11-18 | [X86] Use lock add/sub for cases that we only care about the EFLAGS | Phoebe Wang | 1 | -0/+4
This fixes #36373 and #36905, and partially fixes #58685. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D137711
2022-11-10 | AtomicExpand: Support cmpxchg expansion for small FP types | Matt Arsenault | 1 | -11/+18
Handles f16 atomics for AMDGPU.
2022-11-04 | [LLVM][AMDGPU] Specialize 32-bit atomic fadd instruction for generic address space | Shilei Tian | 1 | -0/+3
The 32-bit floating-point atomic add instructions on AMDGPUs do not support a "flat" or "generic" address space. So, if the address space cannot be determined statically, the AMDGPU backend will fall back to a CAS loop (which does support "flat" addressing). Instead, this patch emits runtime address-space checks to allow native FP atomic add instructions for global and LDS memory (and non-atomic FP add instructions for private/scratch memory). In order to do that, this patch introduces a new interface function `emitExpandAtomicRMW`. It is expected to be called when a common atomic expand doesn't work for a specific target, such as the case we discussed here. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D129690
2022-10-31 | AtomicExpand: Use InstSimplifyFolder | Matt Arsenault | 1 | -49/+54
Automatically clean up operations if we know the atomic has higher alignment.
2022-10-31 | AtomicExpand: Don't create unused instructions for some atomicrmw | Matt Arsenault | 1 | -3/+9
This wasn't used by every atomicrmw expansion.
2022-10-13 | AtomicExpand: Avoid some operations if the atomic is overaligned | Matt Arsenault | 1 | -15/+22
Let some of the pointer bithacking fold away if we know the LSBs are 0.
2022-09-28 | AtomicExpand: Use llvm.ptrmask instead of ptrtoint | Matt Arsenault | 1 | -5/+13
This removes the ptrtoint from the load's pointer operand, although we can't entirely eliminate these to get the LSB shift. In a future patch, this will avoid ptrtoint in the case where the atomic is overaligned to the word size.
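The address arithmetic behind this and the neighbouring part-word commits can be sketched in plain C++. This is only an illustration under stated assumptions: a 4-byte word, little-endian layout, and the hypothetical names `PartWordAddress` and `computePartWord`; it is not the code in AtomicExpandPass.cpp. Masking the low bits gives the aligned word address (the role `llvm.ptrmask` plays on the pointer), while the low bits themselves still have to be read as an integer to get the shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: where a narrow value lives inside its word.
struct PartWordAddress {
  std::uintptr_t aligned_addr; // address of the containing 4-byte word
  unsigned shift_bits;         // bit offset of the value (little-endian assumed)
  std::uint32_t mask;          // bits of the word occupied by the value
};

// Hypothetical helper, assuming a 4-byte word and value_bytes in {1, 2, 4}.
PartWordAddress computePartWord(std::uintptr_t addr, unsigned value_bytes) {
  assert(value_bytes == 1 || value_bytes == 2 || value_bytes == 4);
  constexpr std::uintptr_t kWordBytes = 4;
  // Clearing the low bits yields the aligned address used for the wide access.
  std::uintptr_t aligned = addr & ~(kWordBytes - 1);
  // The low bits select the byte within the word; this part still needs the
  // integer value of the pointer. It is zero when the access is overaligned.
  unsigned shift = static_cast<unsigned>(addr & (kWordBytes - 1)) * 8;
  std::uint32_t value_mask =
      value_bytes == 4 ? 0xFFFFFFFFu : ((1u << (value_bytes * 8)) - 1u);
  return {aligned, shift, value_mask << shift};
}
```

When the address is already word-aligned the shift is zero and the mask sits in the low bits, which is the case the later "overaligned" commits let fold away.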
2022-09-20 | AtomicExpand: Use correct pointer size for integer | Matt Arsenault | 1 | -3/+4
This was using the default address space.
2022-09-07 | [AtomicExpandPass] Always copy pcsections Metadata to expanded atomics | Marco Elver | 1 | -17/+24
When expanding IR atomics to target-specific atomics, copy all !pcsections Metadata to expanded atomics automatically. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130885
2022-08-31 | [AtomicExpand] Make floating point conversion happens before fence insertion | Kai Luo | 1 | -48/+30
IIUC, the conversion part is not part of atomic operations and fences should be put around converted atomic operations. This also fixes atomic load of floating point values which requires fence on PowerPC. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D127609
2022-07-25 | [IRBuilder] Add assert for AtomicRMW ordering | Alexander Shaposhnikov | 1 | -1/+6
Add assert for AtomicRMW: Ordering != AtomicOrdering::Unordered (https://github.com/llvm/llvm-project/blob/main/llvm/lib/IR/Verifier.cpp#L3944) and adjust expandAtomicStore accordingly. Test plan: 1/ ninja check-llvm check-clang check-lld 2/ Bootstrapped LLVM/Clang pass tests Differential revision: https://reviews.llvm.org/D130457
2022-07-17 | [CodeGen] Qualify auto variables in for loops (NFC) | Kazu Hirata | 1 | -3/+3
2022-07-06 | [LLVM] Add the support for fmax and fmin in atomicrmw instruction | Shilei Tian | 1 | -0/+2
This patch adds the support for `fmax` and `fmin` operations in `atomicrmw` instruction. For now (at least in this patch), the instruction will be expanded to CAS loop. There are already a couple of targets supporting the feature. I'll create another patch(es) to enable them accordingly. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D127041
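As a rough picture of what "expanded to CAS loop" means for `fmax`, here is a source-level C++ analogue. It is a sketch only: `atomic_fetch_fmax` is a hypothetical helper name, and the orderings chosen are assumptions; the pass itself rewrites LLVM IR rather than C++.

```cpp
#include <atomic>
#include <cmath>

// Hypothetical helper: atomically replace *target with fmax(*target, operand)
// and return the previous value, using a compare-and-swap retry loop.
float atomic_fetch_fmax(std::atomic<float> &target, float operand) {
  float old_value = target.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak writes the freshly observed value into
  // old_value, so the next iteration recomputes fmax from up-to-date data.
  while (!target.compare_exchange_weak(old_value,
                                       std::fmax(old_value, operand),
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
  }
  return old_value;
}
```

The same loop shape works for `fmin` by swapping `std::fmax` for `std::fmin`.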
2022-05-25 | Allow pointer types for atomicrmw xchg | Takafumi Arakaki | 1 | -2/+7
This adds support for pointer types for `atomicrmw xchg` and lets us write instructions such as `atomicrmw xchg i64** %0, i64* %1 seq_cst`. This is similar to the patch for allowing atomicrmw xchg on floating point types: https://reviews.llvm.org/D52416. Differential Revision: https://reviews.llvm.org/D124728
2022-05-20 | [LLVM] Add a check if should cast atomic operations to integer type | Shilei Tian | 1 | -4/+6
Currently, for atomic load, store, and rmw instructions, as long as the operand is a floating-point value, it is cast to integer. Nowadays many targets can actually support some atomic operations with floating-point operands. For example, NVPTX supports atomic load and store of floating-point values. This patch adds a series of interface functions `shouldCastAtomicXXXInIR`, and the default implementations are the same as what we currently do. Targets can later provide their own specializations. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D125652
2022-04-08 | Transforms: Fix code duplication between LowerAtomic and AtomicExpand | Matt Arsenault | 1 | -47/+6
2022-04-06 | AtomicExpand: Add NotAtomic lowering strategy | Matt Arsenault | 1 | -0/+11
Currently LowerAtomics exists as a separate pass which blindly replaces all atomics. Add a new lowering strategy option to eliminate the atomics which the target can control on a per-instruction level.
2022-04-06 | AtomicExpand: Change return type for shouldExpandAtomicStoreInIR | Matt Arsenault | 1 | -3/+14
Use the same enum as the other atomic instructions for consistency, in preparation for addition of another strategy. Introduce a new "Expand" option, since the store expansion does not use cmpxchg. Alternatively, the existing CmpXChg strategy could be renamed to Expand.
2022-03-18 | Fix computation of MadeChange bit in AtomicExpandPass. | Eli Friedman | 1 | -5/+7
Fixes llvm-clang-x86_64-expensive-checks-debian failure with 2f497ec3. expandAtomicStore always modifies the function, so make sure we set MadeChange unconditionally. Not sure how nobody else has stumbled over this before.
2022-03-18 | [AtomicExpand][PowerPC] Fix all-one mask value | Kai Luo | 1 | -1/+1
When generating an all-one mask value whose bitwidth is larger than 64, sign extension should be used rather than zero extension. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D120865
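The difference is easy to see with a small model of a 128-bit value built from two 64-bit halves. The sketch below uses hypothetical names (`Wide128`, `zeroExtend`, `signExtend`) and is not the pass's code, which works on LLVM integer constants; it only illustrates why widening a 64-bit all-ones value by zero extension leaves the upper half clear.

```cpp
#include <cstdint>

// Hypothetical 128-bit value made of two 64-bit halves.
struct Wide128 {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Zero extension: the upper half is always zero, so an all-ones 64-bit
// input does NOT become an all-ones 128-bit mask.
Wide128 zeroExtend(std::uint64_t v) { return {v, 0}; }

// Sign extension: the upper half replicates the sign bit, so -1 widens
// to a value with every bit set.
Wide128 signExtend(std::int64_t v) {
  std::uint64_t lo = static_cast<std::uint64_t>(v);
  std::uint64_t hi = v < 0 ? ~std::uint64_t{0} : 0;
  return {lo, hi};
}
```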
2022-03-17 | [AtomicExpandPass][NFC] Reformat with clang-format | Marco Elver | 1 | -108/+112
NFCI.
2022-03-16 | Cleanup codegen includes | serge-sans-paille | 1 | -1/+1
This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
2022-03-10 | Revert "Cleanup codegen includes" | Nico Weber | 1 | -1/+1
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 | Cleanup codegen includes | serge-sans-paille | 1 | -1/+1
after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
2022-03-01 | [X86] Use bit test instructions to optimize some logic atomic operations | Phoebe Wang | 1 | -0/+4
This is to match GCC's optimizations: https://gcc.godbolt.org/z/3odh9e7WE Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120199
2021-11-14 | [llvm] Use range-based for loops with instructions (NFC) | Kazu Hirata | 1 | -5/+3
2021-08-19 | [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions | Anshil Gandhi | 1 | -1/+1
Produce remarks when atomic instructions are expanded into hardware instructions in SIISelLowering.cpp. Currently, these remarks are only emitted for atomic fadd instructions. Differential Revision: https://reviews.llvm.org/D108150
2021-08-17 | [NFC] Cleanup more AttributeList::addAttribute() | Arthur Eubanks | 1 | -1/+1
2021-08-16 | [Remarks] Emit optimization remarks for atomics generating CAS loop | Anshil Gandhi | 1 | -1/+16
Implements ORE in AtomicExpand pass to report atomics generating a compare and swap loop. Differential Revision: https://reviews.llvm.org/D106891
2021-08-15 | Revert "[Remarks] Emit optimization remarks for atomics generating CAS loop" | Dávid Bolvanský | 1 | -22/+1
This reverts commit 435785214f73ff0c92e97f2ade6356e3ba3bf661. Still the same compile-time issues for -O0 -g, e.g. +1.3% for sqlite3.
2021-08-14 | [Remarks] Emit optimization remarks for atomics generating CAS loop | Anshil Gandhi | 1 | -1/+22
Implements ORE in AtomicExpand pass to report atomics generating a compare and swap loop. Differential Revision: https://reviews.llvm.org/D106891
2021-08-13 | Revert "[Remarks] Emit optimization remarks for atomics generating CAS loop" | Anshil Gandhi | 1 | -22/+1
This reverts commit c4e5425aa579d21530ef1766d7144b38a347f247.
2021-08-13 | [Remarks] Emit optimization remarks for atomics generating CAS loop | Anshil Gandhi | 1 | -1/+22
Implements ORE in AtomicExpandPass to report atomics generating a compare and swap loop. Differential Revision: https://reviews.llvm.org/D106891
2021-07-15 | [PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand | Kai Luo | 1 | -0/+2
This patch uses AtomicExpandPass to implement quadword lock-free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expands atomic operations post RA to avoid spilling that might prevent LL/SC progress. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D103614
2021-07-02 | [OpaquePtr] Add type parameter to emitLoadLinked | Krzysztof Parzyszek | 1 | -5/+6
Differential Revision: https://reviews.llvm.org/D105353
2021-06-03 | [AtomicExpand] Merge cmpxchg success and failure ordering when appropriate. | Eli Friedman | 1 | -5/+7
If we're not emitting separate fences for the success/failure cases, we need to pass the merged ordering to the target so it can emit the correct instructions. For the PowerPC testcase, we end up with extra fences, but that seems like an improvement over missing fences. If someone wants to improve that, the PowerPC backend could be taught to emit the fences after isel, instead of depending on fences emitted by AtomicExpand. Fixes https://bugs.llvm.org/show_bug.cgi?id=33332 . Differential Revision: https://reviews.llvm.org/D103342
2021-05-31 | [OpaquePtr] Remove some uses of PointerType::getElementType() | Arthur Eubanks | 1 | -1/+1
2021-05-29 | [AtomicExpandPass][AArch64] Promote xchg with floating-point types to integer ones | LemonBoy | 1 | -1/+37
Follow the same strategy used for atomic loads/stores by converting the operands to equally-sized integer types. This change prevents the atomic expansion pass from generating illegal LL/SC pairs when targeting AArch64: `expand-atomicrmw-xchg-fp.ll` would previously instantiate intrinsics such as `llvm.aarch64.ldaxr.p0f32` that cannot be lowered. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D103232
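The strategy of exchanging a floating-point value through an equally sized integer can be mimicked at the source level. The following is a hedged sketch, not the pass's implementation: `exchangeFloatViaInt` is a hypothetical helper, and it assumes `float` is 32 bits wide. The bit pattern is copied into an integer, the integer is exchanged atomically, and the old bits are copied back into a `float`.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical helper: atomically swap a float stored as raw 32-bit integer
// bits, returning the previous value reinterpreted as a float.
float exchangeFloatViaInt(std::atomic<std::uint32_t> &storage,
                          float new_value) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  std::uint32_t new_bits;
  std::memcpy(&new_bits, &new_value, sizeof new_bits); // float -> integer bits
  std::uint32_t old_bits =
      storage.exchange(new_bits, std::memory_order_seq_cst);
  float old_value;
  std::memcpy(&old_value, &old_bits, sizeof old_value); // integer bits -> float
  return old_value;
}
```

Only the integer exchange is atomic; the two copies are ordinary conversions around it, which mirrors the idea of converting operands before and after the atomic operation.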
2021-04-05 | Copy syncscope when expanding atomicrmw into cmpxchg loop | Stanislav Mekhanoshin | 1 | -10/+12
Fixes: SWDEV-280070 Differential Revision: https://reviews.llvm.org/D99902
2021-02-26 | Fix assert to use getTypeStoreSize instead of getPrimitiveSizeInBits, | James Y Knight | 1 | -1/+2
per comment on D97223.
2021-02-25 | Add Alignment argument to IRBuilder CreateAtomicRMW and CreateAtomicCmpXchg. | James Y Knight | 1 | -60/+70
And then push those changes throughout LLVM. Keep the old signature in Clang's CGBuilder for now -- that will be updated in a follow-on patch (D97224). The MLIR LLVM-IR dialect is not updated to support the new alignment attribute, but preserves its existing behavior. Differential Revision: https://reviews.llvm.org/D97223
2020-11-02 | [AtomicExpand] Avoid creating an unnamed libcall | Alex Richardson | 1 | -6/+11
I recently modified this pass to better support CHERI-RISC-V and while doing so I noticed that this pass was calling M->getOrInsertFunction() with the result of TLI->getLibcallName(RTLibType). However, AMDGPU fills the libcalls array with nullptr, so this creates an anonymous function instead. This patch changes expandAtomicOpToLibcall to return false in case the libcall does not exist and changes the assert() in the callees to a report_fatal_error() instead. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88800
2020-07-30 | Align store conditional address | Brendon Cahoon | 1 | -1/+2
In cases where the alignment of the datatype is smaller than expected by the instruction, the address is aligned. The aligned address is used for the load, but wasn't used for the store conditional, which resulted in a run-time alignment exception.
2020-07-09 | Fix return status of AtomicExpandPass | serge-sans-paille | 1 | -3/+4
Correctly reflect change in the return status. Differential Revision: https://reviews.llvm.org/D83457
2020-06-30 | [Alignment][NFC] Migrate AtomicExpandPass to Align | Guillaume Chatelet | 1 | -51/+15
This is a followup on D78403. I'm unsure about `getAtomicOpAlign` overloads that take `AtomicRMWInst` and `AtomicCmpXchgInst`, shouldn't `getAlign` provide the correct answer already? Differential Revision: https://reviews.llvm.org/D81369
2020-04-28 | Handle part-word LL/SC in atomic expansion pass | Krzysztof Parzyszek | 1 | -98/+162
Differential Revision: https://reviews.llvm.org/D77213
2020-04-06 | [NFC] Modernize misc. uses of Align/MaybeAlign APIs. | Eli Friedman | 1 | -3/+3
Use the current getAlign() APIs where it makes sense, and use Align instead of MaybeAlign when we know the value is non-zero.
2020-01-23 | [Alignement][NFC] Deprecate untyped CreateAlignedLoad | Guillaume Chatelet | 1 | -4/+4
Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73260
2019-11-13 | Sink all InitializePasses.h includes | Reid Kleckner | 1 | -0/+1
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it causes lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout:

recompiles  touches  affected_files  header
342380      95       3604            llvm/include/llvm/ADT/STLExtras.h
314730      234      1345            llvm/include/llvm/InitializePasses.h
307036      118      2602            llvm/include/llvm/ADT/APInt.h
213049      59       3611            llvm/include/llvm/Support/MathExtras.h
170422      47       3626            llvm/include/llvm/Support/Compiler.h
162225      45       3605            llvm/include/llvm/ADT/Optional.h
158319      63       2513            llvm/include/llvm/ADT/Triple.h
140322      39       3598            llvm/include/llvm/ADT/StringRef.h
137647      59       2333            llvm/include/llvm/Support/Error.h
131619      73       1803            llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211
2019-11-05 | [AtomicExpandPass] Silence static analyzer warnings about operator priority. NFCI. | Dávid Bolvanský | 1 | -1/+1