path: root/llvm/lib
Age | Commit message | Author | Files | Lines
2022-01-13 | Don't override __attribute__((no_stack_protector)) by inlining (PR52886) | Hans Wennborg | 2 files, -9/+6

Since 26c6a3e736d3, LLVM's inliner will "upgrade" the caller's stack protector attribute based on the callee. This led to surprising results with Clang's no_stack_protector attribute added in 4fbf84c1732f (D46300). Consider the following code compiled with clang -fstack-protector-strong -Os (https://godbolt.org/z/7s3rW7a1q):

  extern void h(int* p);
  inline __attribute__((always_inline)) int g() { return 0; }
  int __attribute__((__no_stack_protector__)) f() {
    int a[1];
    h(a);
    return g();
  }

LLVM will inline g() into f(), and f() would get a stack protector, against the user's explicit wishes, potentially breaking the program e.g. if h() changes the value of the stack cookie. That's a miscompile.

More recently, bc044a88ee3c (D91816) addressed this problem by preventing inlining when the stack protector is disabled in the caller and enabled in the callee or vice versa. However, the problem remained if the callee is marked always_inline, as in the example above. This affected users, see e.g. http://crbug.com/1274129 and http://llvm.org/pr52886.

One way to fix this would be to prevent inlining also in the always_inline case. Despite the name, always_inline does not guarantee inlining, so this would be legal but potentially surprising to users. However, I think the better fix is to not enable the stack protector in a caller based on the callee. The motivation for the old behaviour is unclear, it seems counter-intuitive, and causes real problems as we've seen. This commit implements that fix, which means in the example above, g() gets inlined into f() (also without always_inline), and f() is emitted without a stack protector. I think that matches most developers' expectations, and that's also what GCC does.

Another effect of this change is that a no_stack_protector function can now be inlined into a stack-protected function, e.g. (https://godbolt.org/z/hafP6W856):

  extern void h(int* p);
  inline int __attribute__((__no_stack_protector__))
  __attribute__((always_inline)) g() { return 0; }
  int f() {
    int a[1];
    h(a);
    return g();
  }

I think that's fine. Such code would be unusual since no_stack_protector is normally applied to a program entry point which sets up the stack canary. And even if such code exists, inlining doesn't change the semantics: there is still no stack cookie setup/check around entry/exit of the g() code region, but there may be in the surrounding context, as there was before inlining. This also matches GCC.

See also the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722

Differential revision: https://reviews.llvm.org/D116589
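The change in merging behaviour can be sketched as follows — a hypothetical Python model of the attribute-merge rule, not LLVM's actual code; the level names mirror clang's -fstack-protector flag variants:

```python
# Hypothetical model of how the inliner merges the caller's and callee's
# stack-protector attributes; illustrative only, not LLVM's implementation.
LEVELS = {"none": 0, "ssp": 1, "sspstrong": 2, "sspreq": 3}

def old_merge(caller, callee):
    # Pre-fix behaviour: the caller was "upgraded" to the stronger setting.
    return caller if LEVELS[caller] >= LEVELS[callee] else callee

def new_merge(caller, callee):
    # Post-fix behaviour: the caller's own attribute always wins.
    return caller

# A no_stack_protector caller inlining a stack-protected callee:
assert old_merge("none", "sspstrong") == "sspstrong"  # surprising: cookie added
assert new_merge("none", "sspstrong") == "none"       # user's wish respected
```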
2022-01-13 | [Docs] Fix IR and TableGen grammar inconsistencies | Sebastian Neubauer | 1 file, -3/+3

IR:
- globals (and functions, ifuncs, aliases) can have a partition
- catchret has a `to` before the label
- the sint/int types do not exist
- signext comes after the type
- a variable was missing its type

TableGen:
- The second value after a `#` concatenation is optional. See e.g. llvm/lib/Target/X86/X86InstrAVX512.td:L3351
- IncludeDirective and PreprocessorDirective were never referenced in the grammar
- Add some missing `;`
- Parent classes of multiclasses can have generic arguments. Reuse the `ParentClassList` that is already used in other places.

MIR:
- liveins only allows physical registers, which start with a $

Differential Revision: https://reviews.llvm.org/D116674
2022-01-13 | [ARM] Fix a bug causing shrink-wrapping to not always be disabled when using PAC | Ties Stuij | 1 file, -1/+1

If you want to check for all uses of PAC, the SpillsLR argument to shouldSignReturnAddress should be true instead of false, as that value will be returned from the function if the other checks fall through.

Reviewed By: miyuki
Differential Revision: https://reviews.llvm.org/D116213
2022-01-13 | [GlobalOpt] Fix global to select transform under opaque pointers | Nikita Popov | 1 file, -1/+4

We need to check that the load/store type is also the same, as this is no longer implicitly checked through the pointer type.
2022-01-13 | [WebAssembly] Fix reftype load/store match with idx from call | Paulo Matos | 1 file, -2/+1

Implement support for matching an index from a WebAssembly CALL instruction. Add test.

Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D115327
2022-01-13 | [FileCheck] Allow literal '['s before "[[var...]]" | Jay Foad | 1 file, -4/+7

Change FileCheck to accept patterns like "[[[var...]]" and treat the excess open brackets at the start as literals. This makes the patterns for matching assembler output with literal brackets much cleaner. For example an AMDGPU pattern that used to be written like:

  buffer_store_dwordx2 v{{\[}}[[LO]]:[[HI]]{{\]}}

can now be:

  buffer_store_dwordx2 v[[[LO]]:[[HI]]]

(Even before this patch the final close bracket did not need to be wrapped in {{}}, but people tended to do it anyway for symmetry.)

This does not introduce any ambiguity since "[[" was always followed by an identifier or '@' or '#', so "[[[" was always an error.

I've included a few test updates in this patch just for illustration and testing. There are a couple of hundred tests that could be updated as a follow-up, mostly in test/CodeGen/.

Differential Revision: https://reviews.llvm.org/D117117
Change-Id: Ia6bc6f65cb69734821c911f54a43fe1c673bcca7
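A toy model of why the extra '[' is unambiguous — a hypothetical Python sketch, not FileCheck's actual parser: a variable reference always starts at exactly two consecutive brackets followed by an identifier, so any brackets before that point stay literal:

```python
import re

def expand(pattern, env):
    # "[[NAME]]" is a variable use; a third leading '[' cannot start one,
    # so the regex scan leaves it in place as a literal character.
    return re.sub(r"\[\[([A-Za-z_][A-Za-z0-9_]*)\]\]",
                  lambda m: env[m.group(1)], pattern)

# v[[[LO]]:[[HI]]] -> literal '[', two variable uses, literal ']'
assert expand("v[[[LO]]:[[HI]]]", {"LO": "0", "HI": "1"}) == "v[0:1]"
```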
2022-01-13 | [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants | David Sherwood | 8 files, -7/+16

When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction.

New tests added here: CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing tests too:
- CodeGen/AArch64/dag-numsignbits.ll
- CodeGen/AArch64/reduce-and.ll
- CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-13 | [LV] Inline CreateSplatIV call for scalar VFs (NFC) | Florian Hahn | 1 file, -1/+20

This is a NFC change split off from D116123, as suggested there. D116123 will remove the last user of CreateSplatIV.
2022-01-13 | [AArch64] Fix incorrect use of MVT::getVectorNumElements in AArch64TTIImpl::getVectorInstrCost | David Sherwood | 1 file, -3/+6

If we are inserting into or extracting from a scalable vector we do not know the number of elements at runtime, so we can only let the index wrap for fixed-length vectors.

Tests added here: Analysis/CostModel/AArch64/sve-insert-extract.ll

Differential Revision: https://reviews.llvm.org/D117099
2022-01-13 | RuntimeDyldELF: Don't abort on R_AARCH64_NONE relocation | Vladislav Khmelevsky | 1 file, -0/+2

Do nothing on a R_AARCH64_NONE relocation. The relocation is used by BOLT when re-linking the final binary. It is used as a dummy relocation hack in order to stop RuntimeDyld from skipping the allocation of the section.

Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D117066
2022-01-13 | [JITLink] Add fixup value range check | luxufan | 1 file, -6/+28

This patch makes JITLink report an out-of-range error when the fixup value is out of range.

Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D107328
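The check amounts to verifying that the computed fixup value fits in the relocation's bit-field. A hypothetical sketch (function name and shape are illustrative, not JITLink's API):

```python
def check_fixup_range(value, bits, is_signed):
    # Raise if `value` cannot be encoded in a `bits`-wide
    # (signed or unsigned) relocation field.
    if is_signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    if not (lo <= value <= hi):
        raise ValueError(f"fixup value {value} is out of range "
                         f"for a {bits}-bit field")
    return value

assert check_fixup_range(127, 8, True) == 127   # fits in signed i8
raised = False
try:
    check_fixup_range(128, 8, True)             # does not fit
except ValueError:
    raised = True
assert raised
```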
2022-01-13 | [M68k][NFC] Use Register instead of unsigned int | Jim Lin | 3 files, -22/+22
2022-01-13 | [NVPTX] Lower fp16 fminnum, fmaxnum to native on sm_80 | Christian Sigg | 2 files, -4/+34

Reviewed By: bkramer, tra
Differential Revision: https://reviews.llvm.org/D117122
2022-01-12 | [CSKY] Ensure a newline at the end of a file (NFC) | Kazu Hirata | 1 file, -1/+1
2022-01-13 | Revert "[Inline] Attempt to delete any discardable if unused functions" | James Y Knight | 1 file, -35/+17

Somehow this ends up causing an infinite loop in the inliner.

This reverts commit d5be48c66d3e5e8be21805c3a33dc67a20e258be.
2022-01-13 | [RISCV] Add bfp and bfpw intrinsic in zbf extension | Lian Wang | 3 files, -0/+31

Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116994
2022-01-12 | [Attributor] Simplify how we handle required alignment during heap-to-stack [NFC] | Philip Reames | 1 file, -16/+14

The existing code duplicated the same concern in two places, and (weirdly) changed the inference of the allocation size based on whether we could meet the alignment requirement. Instead, just directly check the allocation requirement.
2022-01-12 | [Attributor] Generalize calloc handling in heap-to-stack for any init value [NFC] | Philip Reames | 1 file, -12/+12

Rewrite the calloc-specific handling in heap-to-stack to allow arbitrary init values. The basic problem being solved is that if an allocation is initialized to anything other than zero, this must be explicitly done for the formed alloca as well.

This covers the calloc case today, but once a couple of earlier guards are removed in this code, downstream allocators with other init values could also be handled.

Inspired by discussion on D116971
2022-01-12 | [Attributor] Reuse object size evaluation code [NFC] | Philip Reames | 1 file, -8/+8
2022-01-12 | [Attributor] Use getAllocAlignment where possible [NFC] | Philip Reames | 1 file, -5/+7

Inspired by D116971.
2022-01-12 | AMDGPU: Fix assert on function argument as loop condition | Matt Arsenault | 1 file, -0/+6
2022-01-12 | [AMDGPU] Fixed physreg asm constraint parsing | Stanislav Mekhanoshin | 2 files, -29/+41

We were always failing to parse the physreg constraint because we do not drop the trailing brace, thus getAsInteger() returns a non-empty string and we delegate reparsing to the TargetLowering. In addition, it did not parse register tuples. Fixing this allowed removing workarounds in the two places where we call it.

Differential Revision: https://reviews.llvm.org/D117055
2022-01-12 | GlobalISel: Always enable GISelKnownBits for InstructionSelect | Matt Arsenault | 1 file, -4/+4

This wasn't running at -O0, and was causing crashes for AMDGPU. AMDGPU needs this to match the addressing modes of stack access instructions, which is even more important at -O0 than with optimizations. It currently costs nothing to run ahead of time, so just always enable it.
2022-01-12 | RegScavenger: Remove used regs from scavenge candidates | Matt Arsenault | 1 file, -0/+22

In a future change, AMDGPU will have 2 emergency scavenging indexes in some situations. The secondary scavenging index ends up being used recursively when the scavenger calls eliminateFrameIndex for the emergency spill slot. Without this, it would end up seeing the same register which was just scavenged in the parent call as free, insert a second emergency spill to the same location, and return the same register when 2 unique free registers are required.

We need to only do this if the register is used. SystemZ uses 2 scavenging slots, but calls the scavenger twice in sequence and not recursively. In this case the previously scavenged register can be re-clobbered, but is still tracked in the scavenger until it sees the deferred restore instruction.
2022-01-12 | AMDGPU/GlobalISel: Fix assertions on legalize queries with huge align | Matt Arsenault | 1 file, -5/+5

For some reason we pass around the alignment in bits as uint64_t. Two places were truncating it to unsigned, and losing bits in extreme cases.
2022-01-12 | GlobalISel: Add G_ASSERT_ALIGN hint instruction | Matt Arsenault | 5 files, -15/+46

Insert it for call return values only for now, which is the only case the DAG handles also.
2022-01-12 | clang support for Armv8.8/9.3 HBC | Tomas Matheson | 1 file, -0/+2

This introduces clang command line support for the new Armv8.8-A and Armv9.3-A Hinted Conditional Branches feature, previously introduced into LLVM in https://reviews.llvm.org/D116156.

Patch by Tomas Matheson and Son Tuan Vu.

Differential Revision: https://reviews.llvm.org/D116939
2022-01-12 | [Demangle] Pass Ret parameter from decodeNumber by reference | Luís Ferreira | 1 file, -5/+5

Since the Ret parameter is never meant to be nullptr, let's pass it by reference instead of as a raw pointer.

Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D117046
2022-01-12 | [Demangle] Add support for D types back referencing | Luís Ferreira | 1 file, -2/+52

This patch adds support for type back referencing, allowing demangling of compressed mangled symbols with repetitive types.

Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111419
2022-01-12 | [Demangle] Add support for D symbols back referencing | Luís Ferreira | 1 file, -3/+137

This patch adds support for identifier back referencing, allowing mangled names to be compressed by avoiding repetition.

Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111417
2022-01-12 | [Demangle] Add minimal support for D simple basic types | Luís Ferreira | 1 file, -2/+36

This patch implements simple demangling of two basic types to add minimal type functionality. This will later be used in function type parsing. After that is implemented, we can add the rest of the types and test the resulting type names.

Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111416
2022-01-12 | [InstSimplify] use knownbits to fold more udiv/urem | Sanjay Patel | 1 file, -0/+10

We could use knownbits on both operands for even more folds (and there are already tests in place for that), but this is enough to recover the example from:
https://github.com/llvm/llvm-project/issues/51934
(the tests are derived from the code in that example)

I am assuming no noticeable compile-time impact from this because udiv/urem are rare opcodes.

Differential Revision: https://reviews.llvm.org/D116616
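The fold follows from simple bounds reasoning, sketched here in Python (a model of the idea, not InstSimplify's code): if the known-zero bits prove the dividend is strictly smaller than the divisor, the quotient is 0 and the remainder is the dividend itself:

```python
def max_from_known_zeros(known_zero_mask, bits):
    # Bits known to be zero cap the largest value x can take.
    return ~known_zero_mask & ((1 << bits) - 1)

def simplify_udiv_urem(known_zero_of_x, y, bits=8):
    # If max(x) < y, then x // y == 0 and x % y == x for every possible x.
    if max_from_known_zeros(known_zero_of_x, bits) < y:
        return "udiv -> 0, urem -> x"
    return None  # cannot fold

# x has its top 4 bits known zero, so x <= 15 < 16:
assert simplify_udiv_urem(0b11110000, 16) == "udiv -> 0, urem -> x"
assert simplify_udiv_urem(0b11110000, 15) is None  # x could equal 15
```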
2022-01-12 | Revert "[JITLink][AArch64] Add support for splitting eh-frames on AArch64." | Nico Weber | 1 file, -7/+0

This reverts commit 253ce92844f72e3a6d0e423473f2765c2c5afd6a.

Breaks tests on Windows, see https://github.com/llvm/llvm-project/issues/52921#issuecomment-1011118896
2022-01-12 | [RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental | Alex Bradbury | 3 files, -10/+13

Agreed policy is that RISC-V extensions that have not yet been ratified should be marked as experimental, and enabling them requires the use of the -menable-experimental-extensions flag when using clang, alongside the version number. These extensions have now been ratified, so this is no longer necessary, and the target feature names can be renamed to no longer be prefixed with "experimental-".

Differential Revision: https://reviews.llvm.org/D117131
2022-01-12 | AMDGPU/GlobalISel: Do not use terminator copy before waterfall loops | Matt Arsenault | 1 file, -4/+7

Stop using the _term variants of the mov to save the initial exec value before the waterfall loop. This cannot be glued to the bottom of the block because we may need to spill the result register. Just use a regular mov, like the loops produced on the DAG path.

Fixes some verification errors with regalloc fast.
2022-01-12 | GlobalISel: Fix insert point in localizer | Matt Arsenault | 1 file, -3/+4

This was inserting the new G_CONSTANT after the use, and the later block scan would run off the end. Fix calling SkipPHIsAndLabels for no apparent reason.
2022-01-12 | [ModuleInliner] Properly delete dead functions | Arthur Eubanks | 1 file, -8/+1

Followup to D116964 where we only did this in the CGSCC inliner.

Fixes leaks reported in D116964.
2022-01-12 | [RISCV] Add RISCVProcFamilyEnum and add SiFive7 | Craig Topper | 3 files, -9/+31

Use it to remove explicit string compares from unrolling preferences.

I'm of two minds on this. Ideally, we would define things in terms of architectural or microarchitectural features, but it's hard to do that with things like unrolling preferences without just ending up with FeatureSiFive7UnrollingPreferences. Having a proc enum is consistent with ARM and AArch64. X86 only has a few and is trying to move away from it.

Reviewed By: asb, mcberg2021
Differential Revision: https://reviews.llvm.org/D117060
2022-01-12 | [NFC][MLGO] The regalloc reward is float, not int64_t | Mircea Trofin | 1 file, -1/+1
2022-01-12 | [NFC][MLGO] Prep a few files before the main ML Regalloc adviser | Mircea Trofin | 2 files, -8/+5

To avoid trivial changes.
2022-01-12 | GlobalISel: Fix fma combine when one of the operands comes from unmerge | Petar Avramovic | 1 file, -83/+89

The fma combine assumes that MRI.getVRegDef(Reg)->getOperand(0).getReg() == Reg, which is not true when Reg is defined by an instruction with multiple defs, e.g. G_UNMERGE_VALUES. The fix is to keep both the register and the instruction that defines it in DefinitionAndSourceRegister and use them where needed.

Differential Revision: https://reviews.llvm.org/D117032
2022-01-12 | [Inline] Attempt to delete any discardable if unused functions | Arthur Eubanks | 1 file, -17/+35

Previously we limited ourselves to only internal/private functions. We can also delete linkonce_odr functions.

Minor compile time wins: https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=instructions

Major memory wins on tramp3d: https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=max-rss

Reviewed By: nikic, mtrofin
Differential Revision: https://reviews.llvm.org/D115545
2022-01-12 | [X86][AVX2] Add SimplifyDemandedVectorElts handling for avx2 per element shifts | Simon Pilgrim | 1 file, -1/+4

Noticed while investigating how to improve funnel shift codegen.
2022-01-12 | Revert "[llvm-readobj][XCOFF] dump auxiliary symbols." | Nico Weber | 1 file, -5/+1

This reverts commit aad49c8eb9849be57c562f8e2b7fbbe816183343.

Breaks tests on Windows, see comments on https://reviews.llvm.org/D113825
2022-01-12 | [MachO] Port call graph profile section and directive | Leonard Grey | 4 files, -1/+67

This ports the `.cg_profile` assembly directive and call graph profile section generation to MachO from COFF/ELF. Due to MachO section naming rules, the section is called `__LLVM,__cg_profile` rather than `.llvm.call-graph-profile` as in COFF/ELF. Support for llvm-readobj is included to facilitate testing.

Corresponding LLD change is D112164.

Differential Revision: https://reviews.llvm.org/D112160
2022-01-12 | [VPlan] Introduce and use BranchOnCount VPInstruction | Florian Hahn | 4 files, -24/+78

This patch adds a new BranchOnCount VPInstruction opcode with 2 operands. It first compares its 2 operands (increment of canonical induction and vector trip count), followed by a branch to either the exit block or back to the vector header. It must be the last recipe in the exit block of the topmost vector loop region.

This extracts parts from D113224 and was discussed in D113223.

Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D116479
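The described semantics can be modelled as a small Python sketch (illustrative only; the increment is shown explicitly for clarity even though the opcode receives the already-incremented induction value):

```python
def branch_on_count(canonical_iv, step, vector_trip_count):
    # Increment the canonical induction variable, then branch:
    # True means "exit the vector loop", False means "back to the header".
    iv_next = canonical_iv + step
    return iv_next, iv_next == vector_trip_count

# Trip count 8 with a vectorization factor of 4: two vector iterations.
iv, done, iterations = 0, False, 0
while not done:
    iv, done = branch_on_count(iv, 4, 8)
    iterations += 1
assert iterations == 2
```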
2022-01-12 | [LoopVectorize] Pass a vector type to isLegalMaskedGather/Scatter | Rosie Sumpter | 7 files, -74/+96

This is required to query the legality more precisely in the LoopVectorizer.

This adds another TTI function named 'forceScalarizeMaskedGather/Scatter' to work around the hack introduced for MVE, where isLegalMaskedGather/Scatter would return an answer by second-guessing where the function was called from, based on the Type passed in (vector vs scalar). The new interface makes this explicit. It is also used by X86 to check for vector widths where gather/scatters aren't profitable (or don't exist) for certain subtargets.

Differential Revision: https://reviews.llvm.org/D115329
2022-01-12 | [DebugInfo] Move flag for instr-ref to LLVM option, from TargetOptions | Jeremy Morse | 4 files, -19/+21

This feature was previously controlled by a TargetOptions flag, and I figured that codegen::InitTargetOptionsFromCodeGenFlags would default it to "on" for all frontends. Enabling by default was discussed here: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153653.html and originally supposed to happen in 3c045070882f3, but it didn't actually take effect, as it turns out frontends initialize TargetOptions themselves.

This patch moves the flag from a TargetOptions flag to a global flag to CodeGen, where it isn't immediately affected by the frontend being used. Hopefully this will actually cause instr-ref to be on by default on x86_64 now!

This patch is easily reverted, and chances of turbulence are moderately high. If you need to revert, please consider instead commenting out the 'return true' part of llvm::debuginfoShouldUseDebugInstrRef to turn the feature off, and dropping me an email.

Differential Revision: https://reviews.llvm.org/D116821
2022-01-12 | [VP] llvm.vp.merge intrinsic and LangRef | Simon Moll | 1 file, -0/+1

llvm.vp.merge interprets the %evl operand differently than the other vp intrinsics: all lanes at positions greater than or equal to the %evl operand are passed through from the second vector input. Otherwise it behaves like llvm.vp.select.

Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116725
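The described lane semantics, modelled in Python (a sketch of the behaviour in the commit message, not an LLVM API):

```python
def vp_merge(mask, on_true, on_false, evl):
    # Lanes at positions >= evl always pass through the second vector
    # input; lanes below evl behave like llvm.vp.select.
    return [on_false[i] if i >= evl
            else (on_true[i] if mask[i] else on_false[i])
            for i in range(len(on_false))]

m = [True, False, True, True]
t = [10, 11, 12, 13]
f = [20, 21, 22, 23]
# Lanes 2 and 3 are past %evl, so they come from f regardless of the mask.
assert vp_merge(m, t, f, 2) == [10, 21, 22, 23]
```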
2022-01-12 | [X86][XOP] Add SimplifyDemandedVectorElts handling for xop shifts | Simon Pilgrim | 1 file, -0/+15

Noticed while investigating how to improve funnel shift codegen.