|
Since 26c6a3e736d3, LLVM's inliner will "upgrade" the caller's stack protector
attribute based on the callee. This led to surprising results with Clang's
no_stack_protector attribute added in 4fbf84c1732f (D46300). Consider the
following code compiled with clang -fstack-protector-strong -Os
(https://godbolt.org/z/7s3rW7a1q).
extern void h(int* p);
inline __attribute__((always_inline)) int g() {
  return 0;
}
int __attribute__((__no_stack_protector__)) f() {
  int a[1];
  h(a);
  return g();
}
LLVM will inline g() into f(), and f() will get a stack protector, against the
user's explicit wishes, potentially breaking the program, e.g. if h() changes
the value of the stack cookie. That's a miscompile.
More recently, bc044a88ee3c (D91816) addressed this problem by preventing
inlining when the stack protector is disabled in the caller and enabled in the
callee or vice versa. However, the problem remained if the callee is marked
always_inline as in the example above. This affected users, see e.g.
http://crbug.com/1274129 and http://llvm.org/pr52886.
One way to fix this would be to prevent inlining also in the always_inline
case. Despite the name, always_inline does not guarantee inlining, so this
would be legal but potentially surprising to users.
However, I think the better fix is to not enable the stack protector in a
caller based on the callee. The motivation for the old behaviour is unclear;
it seems counter-intuitive and causes real problems, as we've seen.
This commit implements that fix, which means in the example above, g() gets
inlined into f() (also without always_inline), and f() is emitted without stack
protector. I think that matches most developers' expectations, and that's also
what GCC does.
Another effect of this change is that a no_stack_protector function can now be
inlined into a stack protected function, e.g. (https://godbolt.org/z/hafP6W856):
extern void h(int* p);
inline int __attribute__((__no_stack_protector__)) __attribute__((always_inline)) g() {
  return 0;
}
int f() {
  int a[1];
  h(a);
  return g();
}
I think that's fine. Such code would be unusual since no_stack_protector is
normally applied to a program entry point which sets up the stack canary. And
even if such code exists, inlining doesn't change the semantics: there is still
no stack cookie setup/check around entry/exit of the g() code region, but there
may be in the surrounding context, as there was before inlining. This also
matches GCC.
See also the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722
Differential revision: https://reviews.llvm.org/D116589
|
|
IR:
- globals (and functions, ifuncs, aliases) can have a partition
- catchret has a `to` before the label
- the sint/int types do not exist
- signext comes after the type
- a variable was missing its type
TableGen:
- The second value after a `#` concatenation is optional
See e.g. llvm/lib/Target/X86/X86InstrAVX512.td:L3351
- IncludeDirective and PreprocessorDirective were never referenced in
the grammar
- Add some missing ;
- Parent classes of multiclasses can have generic arguments.
Reuse the `ParentClassList` that is already used in other places.
MIR:
- liveins only allows physical registers, which start with a $
Differential Revision: https://reviews.llvm.org/D116674
|
|
If you want to check for all uses of PAC, the SpillsLR argument to
shouldSignReturnAddress should be true instead of false, as that value will be
returned from the function if the other checks fall through.
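A minimal standalone model of that fall-through logic (the parameter set
here is hypothetical; the real backend hook takes different inputs):

bool shouldSignReturnAddress(bool SpillsLR, bool SignAll, bool SignNonLeaf,
                             bool IsLeaf) {
  if (SignAll)
    return true;            // sign-return-address=all
  if (SignNonLeaf && !IsLeaf)
    return true;            // sign-return-address=non-leaf
  return SpillsLR;          // value returned when no other check fires
}

A caller that wants to cover every use of PAC should therefore pass true
for SpillsLR, so the fall-through also answers "yes".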
Reviewed By: miyuki
Differential Revision: https://reviews.llvm.org/D116213
|
|
We need to check that the load/store type is also the same, as this
is no longer implicitly checked through the pointer type.
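A hedged sketch of the kind of check this implies (generic helper, compiles
against LLVM headers rather than standing alone): with opaque pointers, two
loads of the same pointer can have different value types, so equivalence
must compare the loaded type explicitly.

#include "llvm/IR/Instructions.h"
using namespace llvm;

static bool isSameLoad(const LoadInst &A, const LoadInst &B) {
  return A.getPointerOperand() == B.getPointerOperand() &&
         A.getType() == B.getType(); // no longer implied by the pointer type
}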
|
|
Implement support for matching an index from a WebAssembly CALL
instruction. Add test.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D115327
|
|
Change FileCheck to accept patterns like "[[[var...]]" and treat the
excess open brackets at the start as literals.
This makes the patterns for matching assembler output with literal
brackets much cleaner. For example an AMDGPU pattern that used to be
written like:
buffer_store_dwordx2 v{{\[}}[[LO]]:[[HI]]{{\]}}
can now be:
buffer_store_dwordx2 v[[[LO]]:[[HI]]]
(Even before this patch the final close bracket did not need to be
wrapped in {{}}, but people tended to do it anyway for symmetry.)
This does not introduce any ambiguity since "[[" was always followed by
an identifier or '@' or '#', so "[[[" was always an error.
I've included a few test updates in this patch just for illustration and
testing. There are a couple of hundred tests that could be updated as a
follow up, mostly in test/CodeGen/.
Differential Revision: https://reviews.llvm.org/D117117
Change-Id: Ia6bc6f65cb69734821c911f54a43fe1c673bcca7
|
|
constants
When we know the value we're extending is a negative constant then it
makes sense to use SIGN_EXTEND because this may improve code quality in
some cases, particularly when doing a constant splat of an unpacked vector
type. For example, for SVE when splatting the value -1 into all elements
of a vector of type <vscale x 2 x i32> the element type will get promoted
from i32 -> i64. In this case we want the splat value to sign-extend from
(i32 -1) -> (i64 -1), whereas currently it zero-extends from
(i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use
a single mov immediate instruction.
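A worked scalar example of the difference (plain C++, just illustrating the
arithmetic described above):

#include <cassert>
#include <cstdint>

int main() {
  int32_t C = -1;
  int64_t Sext = (int64_t)C;           // sign-extend: -1
  int64_t Zext = (int64_t)(uint32_t)C; // zero-extend: 0xFFFFFFFF
  assert(Sext == -1 && Zext == 0xFFFFFFFFLL);
}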
New tests added here:
CodeGen/AArch64/sve-vector-splat.ll
I believe we see some code quality improvements in these existing
tests too:
CodeGen/AArch64/dag-numsignbits.ll
CodeGen/AArch64/reduce-and.ll
CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll
The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only
occur because the test disables codegen prepare and branch folding.
Differential Revision: https://reviews.llvm.org/D114357
|
|
This is a NFC change split off from D116123, as suggested there.
D116123 will remove the last user of CreateSplatIV.
|
|
AArch64TTIImpl::getVectorInstrCost
If we are inserting into or extracting from a scalable vector we do
not know the number of elements at runtime, so we can only let the
index wrap for fixed-length vectors.
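A hedged sketch of that guard (not the exact patch code; requires LLVM
headers):

#include "llvm/IR/DerivedTypes.h"
using namespace llvm;

static unsigned normalizeIndex(Type *Ty, unsigned Index) {
  if (auto *FVTy = dyn_cast<FixedVectorType>(Ty))
    return Index % FVTy->getNumElements(); // element count known statically
  return Index; // scalable vector: element count only known at runtime
}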
Tests added here:
Analysis/CostModel/AArch64/sve-insert-extract.ll
Differential Revision: https://reviews.llvm.org/D117099
|
|
Do nothing for the R_AARCH64_NONE relocation. The relocation is used by BOLT when re-linking the final binary. It is used as a dummy relocation hack in order to stop RuntimeDyld from skipping the allocation of the section.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D117066
|
|
This patch makes JITLink report an out-of-range error when the fixup value is out of range.
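A generic sketch of such a range check (hedged; not JITLink's exact code):
if the computed fixup value does not fit the field's signed bit-width, the
linker should report an error rather than silently truncate.

#include <cstdint>

static bool fitsSignedBits(int64_t Value, unsigned Bits) {
  int64_t Min = -(int64_t(1) << (Bits - 1));
  int64_t Max = (int64_t(1) << (Bits - 1)) - 1;
  return Value >= Min && Value <= Max;
}
// e.g. a 21-bit PC-relative branch field: fitsSignedBits(Delta, 21)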
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D107328
|
|
|
|
Reviewed By: bkramer, tra
Differential Revision: https://reviews.llvm.org/D117122
|
|
|
|
Somehow this ends up causing an infinite loop in the inliner.
This reverts commit d5be48c66d3e5e8be21805c3a33dc67a20e258be.
|
|
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116994
|
|
[NFC]
The existing code duplicated the same concern in two places, and (weirdly) changed the inference of the allocation size based on whether we could meet the alignment requirement. Instead, just directly check the allocation requirement.
|
|
[NFC]
Rewrite the calloc-specific handling in heap-to-stack to allow arbitrary init values. The basic problem being solved is that if an allocation is initialized to anything other than zero, this must be explicitly done for the formed alloca as well.
This covers the calloc case today, but once a couple of earlier guards are removed in this code, downstream allocators with other init values could also be handled.
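A hedged sketch of the idea using IRBuilder (generic helper, not the
patch's code): the init value must be written to the formed alloca
explicitly, e.g. calloc implies a zero memset.

#include "llvm/IR/IRBuilder.h"
using namespace llvm;

static void initFormedAlloca(IRBuilder<> &B, AllocaInst *AI, Value *Size,
                             uint8_t InitByte) {
  // For calloc, InitByte is 0; other allocators could supply other values.
  B.CreateMemSet(AI, B.getInt8(InitByte), Size, AI->getAlign());
}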
Inspired by discussion on D116971
|
|
|
|
Inspired by D116971.
|
|
|
|
We were always failing to parse the physreg constraint because we did not
drop the trailing brace, so getAsInteger() was called on a non-numeric
string and failed, and we delegated reparsing to the TargetLowering. In
addition, the code did not parse register tuples. Fixing this has allowed
us to remove workarounds in two places where we call it.
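A hedged standalone sketch of the brace-handling bug (illustrative only;
the "v" register prefix and the helper below are assumptions, and the real
fix also parses tuples like "{v[8:9]}"):

#include "llvm/ADT/StringRef.h"
using namespace llvm;

static bool parseSinglePhysReg(StringRef Constraint, unsigned &Idx) {
  if (!Constraint.consume_front("{v") || !Constraint.consume_back("}"))
    return false;
  // getAsInteger returns true on failure, e.g. if the trailing brace
  // had been left in place.
  return !Constraint.getAsInteger(10, Idx);
}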
Differential Revision: https://reviews.llvm.org/D117055
|
|
This wasn't running at -O0, which caused crashes for AMDGPU. AMDGPU
needs this to match the addressing modes of stack access instructions,
which is even more important at -O0 than with optimizations.
It currently costs nothing to run ahead of time, so just always enable
it.
|
|
In a future change, AMDGPU will have 2 emergency scavenging indexes in
some situations. The secondary scavenging index ends up being used
recursively when the scavenger calls eliminateFrameIndex for the
emergency spill slot. Without this, it would end up seeing the same
register which was just scavenged in the parent call as free, insert
a second emergency spill to the same location, and return the same
register when 2 unique free registers are required.
We need to only do this if the register is used. SystemZ uses 2
scavenging slots, but calls the scavenger twice in sequence and not
recursively. In this case the previously scavenged register can be
re-clobbered, but is still tracked in the scavenger until it sees the
deferred restore instruction.
|
|
For some reason we pass around the alignment in bits as uint64_t. Two
places were truncating it to unsigned, and losing bits in extreme
cases.
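A minimal illustration of the bug class (plain C++): a bit alignment held
in uint64_t silently loses its high bits when truncated to unsigned.

#include <cassert>
#include <cstdint>

int main() {
  uint64_t AlignInBits = uint64_t(1) << 35;
  unsigned Truncated = static_cast<unsigned>(AlignInBits); // becomes 0
  assert(Truncated != AlignInBits);
}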
|
|
Insert it for call return values only for now, which is also the only case
the DAG handles.
|
|
This introduces clang command line support for new Armv8.8-A and
Armv9.3-A Hinted Conditional Branches feature, previously introduced
into LLVM in https://reviews.llvm.org/D116156.
Patch by Tomas Matheson and Son Tuan Vu.
Differential Revision: https://reviews.llvm.org/D116939
|
|
Since the Ret parameter is never meant to be nullptr, let's pass it by reference instead of as a raw pointer.
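A minimal illustration of the signature change (types hypothetical):

struct Result { int Value = 0; };

void fillBefore(Result *Ret) { Ret->Value = 42; } // caller could pass nullptr
void fillAfter(Result &Ret) { Ret.Value = 42; }   // non-null enforced by type

int main() {
  Result R;
  fillAfter(R);
}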
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D117046
|
|
This patch adds support for type back referencing, allowing demangling of
compressed mangled symbols with repetitive types.
Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111419
|
|
This patch adds support for identifier back referencing, allowing compressed
mangled names by avoiding repetition.
Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111417
|
|
This patch implements simple demangling of two basic types to add minimal type functionality. This will later be used in function type parsing. Once that is implemented, we can add the rest of the types and test the resulting type names.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111416
|
|
We could use knownbits on both operands for even more folds (and there are
already tests in place for that), but this is enough to recover the example
from:
https://github.com/llvm/llvm-project/issues/51934
(the tests are derived from the code in that example)
I am assuming no noticeable compile-time impact from this because udiv/urem
are rare opcodes.
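A worked example of the kind of fold this enables (plain C++ arithmetic,
derived from the shape of the linked issue rather than its exact code):

#include <cassert>

unsigned divFold(unsigned X) {
  unsigned Low = X & 7; // known bits prove Low u< 8
  return Low / 8;       // so the udiv folds to 0 (and Low % 8 folds to Low)
}

int main() { assert(divFold(1234) == 0); }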
Differential Revision: https://reviews.llvm.org/D116616
|
|
This reverts commit 253ce92844f72e3a6d0e423473f2765c2c5afd6a.
Breaks tests on Windows, see
https://github.com/llvm/llvm-project/issues/52921#issuecomment-1011118896
|
|
experimental
Agreed policy is that RISC-V extensions that have not yet been ratified
should be marked as experimental, and enabling them with clang requires
both the -menable-experimental-extensions flag and an explicit extension
version number. These extensions have now been ratified, so this is no
longer necessary, and the target feature names can be renamed to no
longer be prefixed with "experimental-".
Differential Revision: https://reviews.llvm.org/D117131
|
|
Stop using the _term variants of the mov to save the initial exec
value before the waterfall loop. This cannot be glued to the bottom of
the block because we may need to spill the result register. Just use a
regular mov, like the loops produced on the DAG path. Fixes some
verification errors with regalloc fast.
|
|
This was inserting the new G_CONSTANT after the use, and the later
block scan would run off the end. Fix calling SkipPHIsAndLabels for no
apparent reason.
|
|
Followup to D116964 where we only did this in the CGSCC inliner.
Fixes leaks reported in D116964.
|
|
Use it to remove explicit string compares from unrolling preferences.
I'm of two minds on this. Ideally, we would define things in terms
of architectural or microarchitectural features, but it's hard to
do that with things like unrolling preferences without just ending up
with FeatureSiFive7UnrollingPreferences.
Having a proc enum is consistent with ARM and AArch64. X86 only has
a few and is trying to move away from it.
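A hedged standalone sketch of the shape of the change (names hypothetical):

#include <string>

enum class ProcFamily { Others, SiFive7 };

struct Subtarget {
  std::string CPU;
  ProcFamily Family = ProcFamily::Others;
  ProcFamily getProcFamily() const { return Family; }
};

// Before: ST.CPU == "sifive-7-series" (explicit string compare)
// After:  an enum check that new CPUs can opt into
static bool useSiFive7UnrollPrefs(const Subtarget &ST) {
  return ST.getProcFamily() == ProcFamily::SiFive7;
}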
Reviewed By: asb, mcberg2021
Differential Revision: https://reviews.llvm.org/D117060
|
|
|
|
To avoid trivial changes.
|
|
The fma combine assumes that MRI.getVRegDef(Reg)->getOperand(0).getReg() == Reg,
which is not true when Reg is defined by an instruction with multiple defs,
e.g. G_UNMERGE_VALUES.
The fix is to keep the register and the instruction that defines it in
DefinitionAndSourceRegister and use them where needed.
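A hedged sketch using the GlobalISel utility involved (requires LLVM
headers; not the patch itself):

#include "llvm/CodeGen/GlobalISel/Utils.h"
using namespace llvm;

static Register lookThroughCopies(Register Reg,
                                  const MachineRegisterInfo &MRI) {
  // For a multi-def instruction such as G_UNMERGE_VALUES,
  // MRI.getVRegDef(Reg)->getOperand(0).getReg() may differ from Reg, so
  // keep the def register alongside the defining instruction.
  auto DefSrc = getDefSrcRegIgnoringCopies(Reg, MRI);
  return DefSrc ? DefSrc->Reg : Register();
}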
Differential Revision: https://reviews.llvm.org/D117032
|
|
Previously we limited ourselves to only internal/private functions. We
can also delete linkonce_odr functions.
Minor compile time wins:
https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=instructions
Major memory wins on tramp3d:
https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=max-rss
Reviewed By: nikic, mtrofin
Differential Revision: https://reviews.llvm.org/D115545
|
|
Noticed while investigating how to improve funnel shift codegen
|
|
This reverts commit aad49c8eb9849be57c562f8e2b7fbbe816183343.
Breaks tests on Windows, see comments on https://reviews.llvm.org/D113825
|
|
This ports the `.cg_profile` assembly directive and call graph profile section
generation to MachO from COFF/ELF. Due to MachO section naming rules, the
section is called `__LLVM,__cg_profile` rather than `.llvm.call-graph-profile`
as in COFF/ELF. Support for llvm-readobj is included to facilitate testing.
Corresponding LLD change is D112164
Differential Revision: https://reviews.llvm.org/D112160
|
|
This patch adds a new BranchOnCount VPInstruction opcode with 2
operands. It first compares its 2 operands (the increment of the canonical
induction and the vector trip count), then branches either to the exit
block or back to the vector header.
It must be the last recipe in the exit block of the topmost vector loop
region.
This extracts parts from D113224 and was discussed in D113223.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D116479
|
|
This is required to query the legality more precisely in the LoopVectorizer.
This adds another TTI hook named 'forceScalarizeMaskedGather/Scatter'
to work around the hack introduced for MVE, where
isLegalMaskedGather/Scatter would return an answer by second-guessing
where the function was called from, based on the Type passed in (vector
vs scalar). The new interface makes this explicit. It is also used by
X86 to check for vector widths where gather/scatters aren't profitable
(or don't exist) for certain subtargets.
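A hedged sketch of how a caller might combine the two hooks (exact call
sites vary; the cast assumes VTy is known to be a vector type):

#include "llvm/Analysis/TargetTransformInfo.h"
using namespace llvm;

static bool useMaskedGather(const TargetTransformInfo &TTI, Type *VTy,
                            Align Alignment) {
  return TTI.isLegalMaskedGather(VTy, Alignment) &&
         !TTI.forceScalarizeMaskedGather(cast<VectorType>(VTy), Alignment);
}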
Differential Revision: https://reviews.llvm.org/D115329
|
|
This feature was previously controlled by a TargetOptions flag, and I
figured that codegen::InitTargetOptionsFromCodeGenFlags would default it
to "on" for all frontends. Enabling by default was discussed here:
https://lists.llvm.org/pipermail/llvm-dev/2021-November/153653.html
and was originally supposed to happen in 3c045070882f3, but it didn't actually
take effect, as it turns out frontends initialize TargetOptions themselves.
This patch moves the flag from TargetOptions to a global flag in CodeGen,
where it isn't immediately affected by the frontend being used.
Hopefully this will actually cause instr-ref to be on by default on x86_64
now!
This patch is easily reverted, and chances of turbulence are moderately
high. If you need to revert, please consider instead commenting out the
'return true' part of llvm::debuginfoShouldUseDebugInstrRef to turn the
feature off, and dropping me an email.
Differential Revision: https://reviews.llvm.org/D116821
|
|
llvm.vp.merge interprets the %evl operand differently than the other vp
intrinsics: all lanes at positions greater than or equal to the %evl
operand are passed through from the second vector input. Otherwise it
behaves like llvm.vp.select.
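A scalar model of those semantics (a sketch of the description above, not
LLVM's implementation):

#include <cstddef>

void vpMerge(const bool *Mask, const int *OnTrue, const int *OnFalse,
             int *Out, size_t N, size_t EVL) {
  for (size_t I = 0; I < N; ++I)
    // Lanes at positions >= EVL pass through the second input; below EVL
    // the mask selects between the two inputs, like llvm.vp.select.
    Out[I] = (I < EVL && Mask[I]) ? OnTrue[I] : OnFalse[I];
}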
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116725
|
|
Noticed while investigating how to improve funnel shift codegen
|