path: root/llvm/lib/IR/Function.cpp
Age | Commit message | Author | Files | Lines
2021-02-25 | Option to ignore llvm[.compiler].used uses in hasAddressTaken() | Stanislav Mekhanoshin | 1 | -2/+19
Differential Revision: https://reviews.llvm.org/D96087
2021-02-25 | Option to ignore assume like intrinsic uses in hasAddressTaken() | Stanislav Mekhanoshin | 1 | -2/+15
Differential Revision: https://reviews.llvm.org/D96081
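Together with the previous entry, this extends Function::hasAddressTaken() with opt-in flags. A minimal usage sketch, assuming this era's parameter order (offender pointer first, then the three ignore flags):

```
#include "llvm/IR/Function.h"
using namespace llvm;

// Does F's address really escape, ignoring "benign" uses?
bool addressEscapes(const Function &F) {
  const User *Offender = nullptr;
  return F.hasAddressTaken(&Offender,
                           /*IgnoreCallbackUses=*/true,
                           /*IgnoreAssumeLikeCalls=*/true,
                           /*IgnoreLLVMUsed=*/true);
}
```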
2021-01-20 | Allow nonnull/align attribute to accept poison | Juneyoung Lee | 1 | -2/+4
Currently LLVM relies on ValueTracking's `isKnownNonZero` to attach `nonnull`, which can return true when the value is poison. To make the semantics of `nonnull` consistent with the behavior of `isKnownNonZero`, this changes the semantics of `nonnull` to accept poison, and to yield poison if the input pointer is null. This makes many transformations like the one below legal:

```
%p = gep inbounds %x, 1  ; %p is a non-null pointer or poison
call void @f(%p)         ; instcombine converts this to call void @f(nonnull %p)
```

On the other hand, these semantics make propagation of `nonnull` to the caller illegal. The reason is that passing poison to `nonnull` no longer immediately raises UB, so such a program is still well defined if the callee does not use the argument. Having a `noundef` attribute there re-allows the propagation:

```
define void @f(i8* %p) {
  ; FunctionAttrs cannot mark %p nonnull here anymore
  call void @g(i8* nonnull %p) ; ... because @g never raises UB if it never uses %p.
  ret void
}
```

Another attribute that needs to be updated is `align`. This patch updates the semantics of `align` to accept poison as well.

Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90529
2020-12-30 | [X86] Add x86_amx type for intel AMX. | Luo, Yuanke | 1 | -1/+8
The x86_amx type is used for AMX intrinsics. <256 x i32> is bitcast to x86_amx when it is used by AMX intrinsics, and x86_amx is bitcast to <256 x i32> when it is used by a load/store instruction, so AMX intrinsics only operate on the x86_amx type. This helps separate AMX intrinsics from LLVM IR instructions (+-*/). Thanks to Craig for the idea. This patch depends on https://reviews.llvm.org/D87981.

Differential Revision: https://reviews.llvm.org/D91927
2020-12-02 | Small improvements to Intrinsic::getName | Xun Li | 1 | -1/+3
While adding a new (non-overloaded) intrinsic instruction, I accidentally used CreateUnaryIntrinsic to create the intrinsic, which passes the type list through to getName and ended up naming the intrinsic function with a type suffix, leading to weird bugs later on that took a long time to debug. It seems a good idea to add an assertion in getName so that it fails if types are passed but the function is not overloaded. Also, the overloaded version of getName is less efficient because it creates a std::string; we should avoid calling it when we know no types are provided.

Differential Revision: https://reviews.llvm.org/D92523
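A minimal sketch of the distinction getName now enforces, assuming this era's C++ API (a cheap StringRef overload for non-overloaded intrinsics, a std::string-building overload for overloaded ones):

```
#include "llvm/ADT/ArrayRef.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Type.h"
using namespace llvm;

StringRef simpleName(Intrinsic::ID ID) {
  // Non-overloaded intrinsic: no type suffix, no std::string allocation.
  // After this change, passing types for such an ID would assert.
  return Intrinsic::getName(ID);
}

std::string overloadedName(Intrinsic::ID ID, ArrayRef<Type *> Tys) {
  // Overloaded intrinsic: the type list builds the mangled suffix
  // (e.g. "llvm.masked.gather.v4f32"-style names); this allocates.
  return Intrinsic::getName(ID, Tys);
}
```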
2020-12-02 | [Inline] prevent inlining on stack protector mismatch | Nick Desaulniers | 1 | -0/+6
Code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) commonly wants to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an attribute((no_stack_protector)) caller, which generally breaks the caller's assumption that it has no stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions together in one translation unit and compiling it with -fno-stack-protector, that is far less ergonomic than a function attribute and still doesn't work for LTO. See also:
https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/
https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u

SSP attributes can be ordered by strength. Weakest to strongest, they are: ssp, sspstrong, sspreq. Callees with differing SSP attributes may be inlined into each other, and the strongest attribute will be applied to the caller. (No change.)

After this change:
* A callee with no SSP attributes will no longer be inlined into a caller with SSP attributes.
* The reverse is also true: a callee with an SSP attribute will not be inlined into a caller with no SSP attributes.
* The alwaysinline attribute overrides these rules.

Functions that get synthesized by the compiler may consequently not get inlined if they are not created with the same stack protector function attribute as their callers.

Alternative approach to https://reviews.llvm.org/D87956. Fixes pr/47479.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed By: rnk, MaskRay
Differential Revision: https://reviews.llvm.org/D91816
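The Function.cpp side of this change adds a small query used by the inliner. A plausible sketch of that helper, assuming it simply checks the three SSP attributes listed above (the exact name and placement are assumptions from the commit description):

```
// Sketch: true if this function carries any stack protector attribute
// (ssp, sspstrong, or sspreq), weakest to strongest.
bool Function::hasStackProtectorFnAttr() const {
  return hasFnAttribute(Attribute::StackProtect) ||
         hasFnAttribute(Attribute::StackProtectStrong) ||
         hasFnAttribute(Attribute::StackProtectReq);
}
```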
2020-11-13 | [VE] Support vld intrinsics | Kazushi (Jam) Marukawa | 1 | -0/+1
Add intrinsics for vector load instructions. Add a regression test also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D91332
2020-11-03 | [CostModel] Make target intrinsics cheap by default | David Green | 1 | -1/+5
This patch changes the intrinsics cost model to assume that by default target intrinsics are cheap. This didn't seem to be the case for all intrinsics, and is potentially an MVE problem due to our scalarization overheads. Cheap seems to be a good default in general though. Differential Revision: https://reviews.llvm.org/D90597
2020-10-13 | [PowerPC] Add assemble disassemble intrinsics for MMA | Ahsan Saghir | 1 | -1/+6
This patch adds support for assemble disassemble intrinsics for MMA. Reviewed By: bsaleil, #powerpc Differential Revision: https://reviews.llvm.org/D88739
2020-09-30 | [X86] Support Intel Key Locker | Xiang1 Zhang | 1 | -1/+3
Key Locker provides a mechanism to encrypt and decrypt data with an AES key without having access to the raw key value by converting AES keys into “handles”. These handles can be used to perform the same encryption and decryption operations as the original AES keys, but they only work on the current system and only until they are revoked. If software revokes Key Locker handles (e.g., on a reboot), then any previous handles can no longer be used. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D88398
2020-09-24 | OpaquePtr: Add helpers for sret to mirror byval | Matt Arsenault | 1 | -3/+10
Sret should really have a type parameter like byval does.
2020-09-18 | IR: Move denormal mode parsing from MachineFunction to Function | Matt Arsenault | 1 | -0/+15
This was just inspecting the IR to begin with, and is useful to check in some places in the IR.
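A short usage sketch, assuming this era's API shape (the denormal mode can now be queried per float semantics directly on the IR Function):

```
#include "llvm/ADT/APFloat.h"
#include "llvm/IR/Function.h"
using namespace llvm;

// True if F's f32 denormal outputs are not kept as IEEE values.
bool flushesF32DenormalOutputs(const Function &F) {
  DenormalMode Mode = F.getDenormalMode(APFloat::IEEEsingle());
  return Mode.Output != DenormalMode::IEEE;
}
```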
2020-09-16 | [llvm][CodeGen] Do not scalarize `llvm.masked.[gather|scatter]` operating on scalable vectors. | Francesco Petrogalli | 1 | -2/+1
This patch prevents the `llvm.masked.gather` and `llvm.masked.scatter` intrinsics from being scalarized when invoked on scalable vectors. The change in `Function.cpp` is needed to prevent the warning that is raised when `getNumElements` is used in place of `getElementCount` on `VectorType` instances. The tests guard against regressions on this change and make sure that calls to `llvm.masked.[gather|scatter]` are still scalarized when:

1. the intrinsics are operating on fixed-size vectors, and
2. the compiler is not targeting fixed-length SVE code generation.

Reviewed By: efriedma, sdesmalen
Differential Revision: https://reviews.llvm.org/D86249
2020-08-28 | [SVE] Make ElementCount members private | David Sherwood | 1 | -2/+3
This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue(). Differential Revision: https://reviews.llvm.org/D86065
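A minimal before/after sketch of the migration this implies for client code (the variable names are illustrative):

```
#include "llvm/Support/TypeSize.h"
using namespace llvm;

void describe(ElementCount EC) {
  // Before this change, EC.Min and EC.Scalable were public members.
  unsigned MinElts = EC.getKnownMinValue(); // was: EC.Min
  bool Scalable = EC.isScalable();          // was: EC.Scalable
  (void)MinElts;
  (void)Scalable;
}
```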
2020-08-27 | [SVE] Remove calls to VectorType::getNumElements from IR | Christopher Tetreault | 1 | -3/+4
Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D81500
2020-07-22 | [InstCombine] Move target-specific inst combining | Sebastian Neubauer | 1 | -0/+4
For a long time, the InstCombine pass handled target-specific intrinsics. Having target-specific code in general passes was noted as an area for improvement for a long time. D81728 moves most target-specific code out of the InstCombine pass.

Applying the target-specific combinations in an extra pass would probably result in inferior optimizations compared to the current fixed-point iteration, therefore the InstCombine pass resorts to newly introduced functions in the TargetTransformInfo when it encounters unknown intrinsics. The patch should not have any effect on generated code (under the assumption that code never uses intrinsics from a foreign target).

This introduces three new functions:
TargetTransformInfo::instCombineIntrinsic
TargetTransformInfo::simplifyDemandedUseBitsIntrinsic
TargetTransformInfo::simplifyDemandedVectorEltsIntrinsic

A few target-specific parts are left in the InstCombine folder, where it makes sense to share code. The largest leftover part in InstCombineCalls.cpp is the code shared between ARM and AArch64. This allows moving about 3000 lines out of InstCombine into the targets.

Differential Revision: https://reviews.llvm.org/D81728
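A hedged sketch of what a target's implementation of the first hook might look like; the folded pattern and helper name are placeholders, not real LLVM target code:

```
#include "llvm/ADT/Optional.h"
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/Transforms/InstCombine/InstCombiner.h"
using namespace llvm;

// Sketch: fold a hypothetical idempotent unary intrinsic,
// op(op(x)) -> op(x), from a target's instCombineIntrinsic hook.
static Optional<Instruction *> combineMyIntrinsic(InstCombiner &IC,
                                                  IntrinsicInst &II) {
  if (II.getNumArgOperands() == 1)
    if (auto *Inner = dyn_cast<IntrinsicInst>(II.getArgOperand(0)))
      if (Inner->getIntrinsicID() == II.getIntrinsicID())
        return IC.replaceInstUsesWith(II, Inner);
  return None; // no combine applies; InstCombine leaves the call alone
}
```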
2020-07-20 | IR: Define byref parameter attribute | Matt Arsenault | 1 | -6/+42
This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped-down version of byval to remove some of the stack-copy implications in its definition. This includes the base IR changes, and some tests for places where it should be treated similarly to byval. Codegen support will be in a future patch.

My original attempt at solving some of these problems was to repurpose byval with a different address space from the stack. However, it is technically permitted for the callee to introduce a write to the argument, although nothing does this in reality. There is also talk of removing and replacing the byval attribute, so a new attribute would need to take its place anyway.

This is intended to avoid some optimization issues with the current handling of aggregate arguments, as well as to fix inflexibility in how frontends can specify the kernel ABI. The most honest representation of the amdgpu_kernel convention is to expose all kernel arguments as loads from constant memory. Today, these are raw, SSA Argument values and codegen is responsible for turning these into loads.

Background: There currently isn't a satisfactory way to represent how arguments for the amdgpu_kernel calling convention are passed. In reality, arguments are passed in a single, flat, constant memory buffer implicitly passed to the function. It is also illegal to call this function in the IR, and it is only ever invoked by a driver of some kind. It does not make sense to have a stack-passed parameter in this context, as is implied by byval. It is never valid to write to the kernel arguments, as this would corrupt the inputs seen by other dispatches of the kernel. These arguments are also not in the same address space as the stack, so a copy is needed to an alloca. From a source C-like language, the kernel parameters are invisible. Semantically, a copy is always required from the constant argument memory to a mutable variable.

The current clang calling convention lowering emits raw values, including aggregates, into the function argument list, since using byval would not make sense. This has some unfortunate consequences for the optimizer. In the aggregate case, we end up with an aggregate store to an alloca, which both SROA and instcombine turn into a store of each aggregate field. The optimizer never pieces this back together to see that this is really just a copy from constant memory, so we end up stuck with expensive stack usage.

This also means the backend dictates the alignment of arguments, and arbitrarily picks the LLVM IR ABI type alignment. By allowing an explicit alignment, frontends can make better decisions. For example, there's really no advantage to an alignment higher than 4, so a frontend could choose to compact the argument layout. Similarly, there is a high penalty to using an alignment lower than 4, so a frontend could opt into more padding for small arguments.

Another design consideration is when it is appropriate to expose the fact that these arguments are all really passed in adjacent memory. Currently we have a late IR optimization pass in codegen to rewrite the kernel argument values into explicit loads to enable vectorization. In most programs, unrelated argument loads can be merged together. However, exposing this property directly from the frontend has some disadvantages. We still need a way to track the original argument sizes and alignments to report to the driver. I find using some side-channel metadata mechanism to track this unappealing. If the kernel arguments were exposed as a single buffer to begin with, alias analysis would be unaware that the padding bits between arguments are meaningless. Another family of problems is that there are still some gaps in replacing all of the available parameter attributes with metadata equivalents once lowered to loads.

The immediate plan is to start using this new attribute to handle all aggregate arguments for kernels. Long term, it makes sense to migrate all kernel arguments, including scalars, to be passed indirectly in the same manner. Additional context is in D79744.
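A small sketch of how ABI lowering code might consume the new attribute, assuming the accessors mirror byval's (the helper below is illustrative, not LLVM code):

```
#include "llvm/IR/Argument.h"
#include "llvm/IR/DataLayout.h"
using namespace llvm;

// Size of the in-memory value the ABI must describe for a byref argument.
uint64_t byRefStoreSize(const Argument &A, const DataLayout &DL) {
  if (A.hasByRefAttr()) // assumed accessor, mirroring hasByValAttr()
    return DL.getTypeAllocSize(A.getParamByRefType());
  return 0;
}
```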
2020-07-16 | IR: Rename Argument::hasPassPointeeByValueAttr to prepare for byref | Matt Arsenault | 1 | -1/+1
When the byref attribute is added, there will need to be two similar functions for the existing cases which have an associate value copy, and byref which does not. Most, but not all of the existing uses will use the existing version. The associated size function added by D82679 also needs to contextually differ, and will help eliminate a few places still relying on pointee element types.
2020-07-14 | [CallGraph] Ignore callback uses | Giorgis Georgakoudis | 1 | -2/+12
Summary: Ignore callback uses when adding a callback function to the CallGraph. Callback functions are typically created when outlining, e.g. for OpenMP, so they have internal scope and linkage. They should not be added to the ExternalCallingNode since they are only callable by the specified caller function at creation time. A CGSCC pass, such as OpenMPOpt, may need to update the CallGraph by adding a new outlined callback function. Without ignoring callback uses, the addition breaks CGSCC pass restrictions and results in a broken CallGraph.

Reviewers: jdoerfert
Subscribers: hiraditya, sstefan1, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83370
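A simplified sketch of the filtering idea described above (not the exact CallGraph code):

```
#include "llvm/IR/AbstractCallSite.h"
#include "llvm/IR/Function.h"
using namespace llvm;

// True if some use of F is not a callback call site; callback uses are
// skipped because such functions are only callable via their broker.
bool hasNonCallbackUse(Function &F) {
  for (Use &U : F.uses()) {
    AbstractCallSite ACS(&U);
    if (ACS && ACS.isCallbackCall())
      continue;
    return true;
  }
  return false;
}
```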
2020-07-10 | Revert "[CallGraph] Ignore callback uses" | Roman Lebedev | 1 | -9/+2
This likely has broken test/Transforms/Attributor/IPConstantProp/ tests. http://45.33.8.238/linux/22502/step_12.txt This reverts commit 205dc0922d5f7305226f7457fcbcb4224c92530c.
2020-07-09 | [CallGraph] Ignore callback uses | Giorgis Georgakoudis | 1 | -2/+9
Summary: Ignore callback uses when adding a callback function to the CallGraph. Callback functions are typically created when outlining, e.g. for OpenMP, so they have internal scope and linkage. They should not be added to the ExternalCallingNode since they are only callable by the specified caller function at creation time. A CGSCC pass, such as OpenMPOpt, may need to update the CallGraph by adding a new outlined callback function. Without ignoring callback uses, the addition breaks CGSCC pass restrictions and results in a broken CallGraph.

Reviewers: jdoerfert
Subscribers: hiraditya, sstefan1, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83370
2020-07-09 | OpaquePtr: Don't check pointee type for byval/preallocated | Matt Arsenault | 1 | -0/+21
Since none of these users really care about the actual type, hide the type under a new size-getting attribute to go along with hasPassPointeeByValueAttr. This will work better for the future byref attribute, which may end up only tracking the byte size and not the IR type. We currently have 3 parameter attributes that should carry the type (technically inalloca does not yet). The APIs are somewhat awkward since preallocated/inalloca piggyback on byval in some places, but in others are treated as distinct attributes. Since these are all mutually exclusive, we should probably just merge all the attribute infrastructure treating these as totally distinct attributes.
2020-06-29 | Silence unused var warning in NDEBUG build | Reid Kleckner | 1 | -2/+2
2020-06-29 | Add intrinsic helper function | Sebastian Neubauer | 1 | -14/+23
It simplifies getting generic argument types from intrinsics. Differential Revision: https://reviews.llvm.org/D81084
2020-06-04 | [SVE] Fix ubsan issues in DecodeIITType | David Sherwood | 1 | -21/+18
In an earlier patch I removed the need for IITDescriptor::ScalableVecArgument, which involved changing DecodeIITType to pull out the last IIT_Info from the list. However, it turns out this is unsafe and causes ubsan failures. I've tried to fix this a different way by simply passing the last IIT_Info as an additional argument to DecodeIITType. Differential Revision: https://reviews.llvm.org/D81057
2020-05-27 | [IR][BFloat] add BFloat IR intrinsics support | Ties Stuij | 1 | -1/+7
Summary: This patch is part of a series that adds support for the Bfloat16 extension of the Armv8.6-a architecture, as detailed here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
The bfloat type and its properties are specified in the Arm Architecture Reference Manual:
https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile

Reviewers: scanon, fpetrogalli, sdesmalen, craig.topper, LukeGeeson
Reviewed By: fpetrogalli
Subscribers: LukeGeeson, pbarrio, kristof.beyls, hiraditya, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79707
2020-05-21 | [SVE] Remove IITDescriptor::ScalableVecArgument | David Sherwood | 1 | -31/+20
I have refactored the code so that we no longer need the ScalableVecArgument descriptor - the scalable property of vectors is now encoded using the ElementCount class in IITDescriptor. This means that when matching intrinsics we know precisely how to match the arguments and return values. Differential Revision: https://reviews.llvm.org/D80107
2020-05-20 | Reland [X86] Codegen for preallocated | Arthur Eubanks | 1 | -0/+6
See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target-independent ISDOpcodes and two new target-dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target-dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer for the arg. It is lowered to a target-dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame pointer.

Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). Aside from the tests added here, I checked that this codegen produces correct code for something like

```
struct A {
  A();
  A(A&&);
  ~A();
};

void bar() {
  foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation.

Reverted due to unexpectedly passing tests; added REQUIRES: asserts for the reland.

Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77689
2020-05-20 | Revert "[X86] Codegen for preallocated" | Arthur Eubanks | 1 | -6/+0
This reverts commit 810567dc691a57c8c13fef06368d7549f7d9c064. Some tests are unexpectedly passing
2020-05-20 | [X86] Codegen for preallocated | Arthur Eubanks | 1 | -0/+6
See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target-independent ISDOpcodes and two new target-dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target-dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer for the arg. It is lowered to a target-dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame pointer.

Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). Aside from the tests added here, I checked that this codegen produces correct code for something like

```
struct A {
  A();
  A(A&&);
  ~A();
};

void bar() {
  foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation.

Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77689
2020-05-15 | [IR] Convert null-pointer-is-valid into an enum attribute | Nikita Popov | 1 | -3/+1
The "null-pointer-is-valid" attribute needs to be checked by many pointer-related combines. To make the check more efficient, convert it from a string into an enum attribute. In the future, this attribute may be replaced with data layout properties. Differential Revision: https://reviews.llvm.org/D78862
2020-05-15 | [IR][BFloat] Add BFloat IR type | Ties Stuij | 1 | -0/+1
Summary: The BFloat IR type is introduced to provide support for, initially, the BFloat16 datatype introduced with the Armv8.6 architecture (optional from Armv8.2 onwards). It has an 8-bit exponent and a 7-bit mantissa and behaves like an IEEE 754 floating point IR type. This is part of a patch series upstreaming Armv8.6 features. Subsequent patches will upstream intrinsics support and C-lang support for BFloat. Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, sdesmalen, deadalnix, ctetreau Subscribers: hiraditya, llvm-commits, danielkiss, arphaman, kristof.beyls, dexonsmith Tags: #llvm Differential Revision: https://reviews.llvm.org/D78190
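A usage sketch of the new type through the usual Type factories:

```
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Type.h"
using namespace llvm;

void demo(LLVMContext &Ctx) {
  Type *BF = Type::getBFloatTy(Ctx); // 8-bit exponent, 7-bit mantissa
  bool IsBF = BF->isBFloatTy();      // predicate added alongside the type
  (void)IsBF;
}
```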
2020-05-15 | [SVE] Fix wrong usage of getNumElements() in matchIntrinsicType | David Sherwood | 1 | -2/+8
I have changed the ScalableVecArgument case in matchIntrinsicType to create a new FixedVectorType. This means that the next case we hit (Vector) will not assert when calling getNumElements(), since we know that it's always a FixedVectorType. This is a temporary measure for now, and it will be fixed properly in another patch that refactors this code. The changes are covered by this existing test: CodeGen/AArch64/sve-intrinsics-fp-converts.ll In addition, I have added a new test to ensure that we correctly reject SVE intrinsics when called with fixed length vector types. Differential Revision: https://reviews.llvm.org/D79416
2020-04-30 | [NFC] Rename *ByValOrInalloca* to *PassPointeeByValue* | Arthur Eubanks | 1 | -2/+3
Summary: In preparation for preallocated. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79152
2020-04-28 | [SVE] Remove invalid usage of VectorType::getNumElements in Function | Christopher Tetreault | 1 | -2/+2
Summary: Removes usage of VectorType::getNumElements identified by test located at CodeGen/aarch64-sve-intrinsics/acle_sve_dot.c. This code explicitly converts a potentially fixed length vector to scalable vector by constructing the ElementCount = {getNumElements(), true} Reviewers: rengolin, efriedma, kmclaughlin, c-rhodes, sdesmalen Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78967
2020-04-23 | [SVE] Remove calls to isScalable from IR | Christopher Tetreault | 1 | -3/+2
Reviewers: efriedma, sdesmalen, dexonsmith, dblaikie Reviewed By: sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77691
2020-04-23 | [SVE] Make VectorType::getNumElements() complain for scalable vectors | Christopher Tetreault | 1 | -3/+3
Summary: Piggy-back off of TypeSize's STRICT_FIXED_SIZE_VECTORS flag and:
- if it is defined, assert that the vector is not scalable
- if it is not defined, complain if the vector is scalable

Reviewers: efriedma, sdesmalen, c-rhodes
Reviewed By: sdesmalen
Subscribers: hiraditya, mgorny, tschuett, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78576
2020-04-13 | [SVE] Change return type of getNumElements to unsigned | Christopher Tetreault | 1 | -2/+1
Reviewers: efriedma, sdesmalen, craig.topper, dexonsmith Reviewed By: efriedma, sdesmalen Subscribers: tschuett, hiraditya, rkruppe, psnobl, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, grosul1, frgossen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77763
2020-04-10 | Clean up usages of asserting vector getters in Type | Christopher Tetreault | 1 | -12/+10
Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: dexonsmith, sdesmalen, efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77276
2020-04-10 | [FPEnv][AArch64] Platform-specific builtin constrained FP enablement | Kevin P. Neal | 1 | -0/+12
When constrained floating point is enabled the AArch64-specific builtins don't use constrained intrinsics in some cases. Fix that. Neon is part of this patch, so ARM is affected as well. Differential Revision: https://reviews.llvm.org/D77074
2020-02-19 | Add <128 x i1> as an intrinsic type | Krzysztof Parzyszek | 1 | -1/+6
2019-12-17 | Resubmit "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" | Guillaume Chatelet | 1 | -0/+5
Summary: This is a resubmit of D71473. This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out-of-tree clients. Functions will be deprecated one by one as in-tree code is cleaned up.

This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: aaron.ballman, courbet
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71547
2019-12-16 | Revert "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" | Guillaume Chatelet | 1 | -5/+0
This reverts commit 181ab91efc9fb08dedda10a2fbc5fccb83ce8799.
2019-12-16 | [Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove | Guillaume Chatelet | 1 | -0/+5
Summary: This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out-of-tree clients. Functions will be deprecated one by one as in-tree code is cleaned up.

This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet
Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71473
2019-12-11 | [IR] Split out target specific intrinsic enums into separate headers | Reid Kleckner | 1 | -3/+17
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7 MB of object file size.
- Incremental step towards decoupling target intrinsics.

The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work.

Part of PR34259

Reviewers: efriedma, echristo, MaskRay
Reviewed By: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D71320
2019-11-06 | Keep import function list for inlinee profile update | Wenlei He | 1 | -0/+5
Summary: When adjusting function entry counts after inlining, Function::setEntryCount is called without providing an import function list. The side effect is that the previously set import function list is dropped. The import function list is used by ThinLTO to help import hot cross-module callees for LTO inlining, so dropping it during the ThinLTO pre-link may adversely affect LTO inlining. The fix is to keep the list while updating entry counts for inlining.

Reviewers: wmi, davidxl, tejohnson
Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69736
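A sketch of the fix's shape, assuming this era's setEntryCount signature (count, count type, optional import GUID set):

```
#include "llvm/ADT/DenseSet.h"
#include "llvm/IR/Function.h"
using namespace llvm;

// Update the entry count without dropping the ThinLTO import list.
void updateEntryCount(Function &F, uint64_t NewCount) {
  DenseSet<GlobalValue::GUID> Imports = F.getImportGUIDs();
  F.setEntryCount(NewCount, Function::PCT_Real,
                  Imports.empty() ? nullptr : &Imports);
}
```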
2019-10-02 | [IntrinsicEmitter] Add overloaded type VecOfBitcastsToInt for SVE intrinsics | Kerry McLaughlin | 1 | -1/+23
Summary: This allows intrinsics such as the following to be defined:
- declare <n x 4 x i32> @llvm.something.nxv4f32(<n x 4 x i32>, <n x 4 x i1>, <n x 4 x float>)
...where <n x 4 x i32> is derived from <n x 4 x float>, but the element needs bitcasting to int.

Reviewers: c-rhodes, sdesmalen, rovka
Reviewed By: c-rhodes
Subscribers: tschuett, hiraditya, jdoerfert, llvm-commits, cfe-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68021

llvm-svn: 373437
2019-09-30 | [AArch64][SVE] Implement punpk[hi|lo] intrinsics | Kerry McLaughlin | 1 | -2/+3
Summary: Adds the following two intrinsics:
- int_aarch64_sve_punpkhi
- int_aarch64_sve_punpklo

This patch also contains a fix which allows LLVMHalfElementsVectorType to forward reference overloadable arguments.

Reviewers: sdesmalen, rovka, rengolin
Reviewed By: sdesmalen
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, greened, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67830

llvm-svn: 373232
2019-09-23 | Function::BuildLazyArguments() - fix "variable used but never read" analyzer warning. NFCI. | Simon Pilgrim | 1 | -1/+2
Simplify the code by separating the masking of the SDC variable from using it.

llvm-svn: 372598
2019-09-20 | [IntrinsicEmitter] Add overloaded types for SVE intrinsics (Subdivide2 & Subdivide4) | Kerry McLaughlin | 1 | -1/+37
Summary: Both match the type of another intrinsic parameter of a vector type, but where each element is subdivided to form a vector with more elements of a smaller type.

Subdivide2Argument allows intrinsics such as the following to be defined:
- declare <vscale x 4 x i32> @llvm.something.nxv4i32(<vscale x 8 x i16>)

Subdivide4Argument allows intrinsics such as:
- declare <vscale x 4 x i32> @llvm.something.nxv4i32(<vscale x 16 x i8>)

Tests are included in follow up patches which add intrinsics using these types.

Reviewers: sdesmalen, SjoerdMeijer, greened, rovka
Reviewed By: sdesmalen
Subscribers: rovka, tschuett, jdoerfert, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67549

llvm-svn: 372380