Age | Commit message (Collapse) | Author | Files | Lines |
|
Remove the Native Client support now that it has finally reached end of life.
|
|
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the `llvm-c` interface with a
new `LLVM_C_ABI` annotation, which behaves like the `LLVM_ABI`. This
annotation currently has no meaningful impact on the LLVM build;
however, it is a prerequisite to support an LLVM Windows DLL (shared
library) build.
## Overview
1. Add a new `llvm-c/Visibility.h` header file that defines a new
`LLVM_C_ABI` macro. The macro resolves to the proper DLL export/import
annotation on Windows and a "default" visibility annotation elsewhere.
2. Add a new `LLVM_ENABLE_LLVM_C_EXPORT_ANNOTATIONS` `#cmakedefine` that
is used to gate the definition of `LLVM_C_ABI`.
3. Remove the existing `LLVM_C_ABI` definition from
`llvm/Support/Compiler.h`. Update the small number of `LLVM_C_ABI`
references to get it from the new `llvm-c/Visibility.h` header.
4. Code-mod annotate the public `llvm-c` interface using the [Interface
Definition Scanner (IDS)](https://github.com/compnerd/ids) tool.
5. Format the changes with `clang-format`.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
|
|
|
|
This patch optimizes the Windows security cookie check mechanism by
moving the comparison inline and only calling __security_check_cookie
when the check fails. This reduces the overhead of making a DLL call
for every function return.
Previously, we implemented this optimization through a machine pass
(X86WinFixupBufferSecurityCheckPass) in PR #95904 submitted by
@mahesh-attarde. We have reverted that pass in favor of this new
approach. Also we have abandoned the AArch64 specific implementation
of same pass in PR #121938 in favor of this more general solution.
The old machine instruction pass approach:
- Scanned the generated code to find __security_check_cookie calls
- Modified these calls by splitting basic blocks
- Added comparison logic and conditional branching
- Required complex block management and live register computation
The new approach:
- Implements the same optimization during instruction selection
- Directly emits the comparison and conditional branching
- No need for post-processing or basic block manipulation
- Disables optimization at -Oz.
Thanks @tamaspetz, @efriedma-quic and @arsenm for their help.
|
|
We rename `createGenericSchedLive` and `createGenericSchedPostRA`
to `createSchedLive` and `createSchedPostRA`, and add a template
parameter `Strategy` which is the generic implementation by default.
This can simplify some code for targets that have custom scheduler
strategy.
|
|
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
|
|
(equivalent to MSVC /d2epilogunwind) (#129142)
Adds support for emitting Windows x64 Unwind V2 information, includes
support `/d2epilogunwind` in clang-cl.
Unwind v2 adds information about the epilogs in functions such that the
unwinder can unwind even in the middle of an epilog, without having to
disassembly the function to see what has or has not been cleaned up.
Unwind v2 requires that all epilogs are in "canonical" form:
* If there was a stack allocation (fixed or dynamic) in the prolog, then
the first instruction in the epilog must be a stack deallocation.
* Next, for each `PUSH` in the prolog there must be a corresponding
`POP` instruction in exact reverse order.
* Finally, the epilog must end with the terminator.
This change adds a pass to validate epilogs in modules that have Unwind
v2 enabled and, if they pass, emits new pseudo instructions to MC that
1) note that the function is using unwind v2 and 2) mark the start of
the epilog (this is either the first `POP` if there is one, otherwise
the terminator instruction). If a function does not meet these
requirements, it is downgraded to Unwind v1 (i.e., these new pseudo
instructions are not emitted).
Note that the unwind v2 table only marks the size of the epilog in the
"header" unwind code, but it's possible for epilogs to use different
terminator instructions thus they are not all the same size. As a work
around for this, MC will assume that all terminator instructions are
1-byte long - this still works correctly with the Windows unwinder as it
is only using the size to do a range check to see if a thread is in an
epilog or not, and since the instruction pointer will never be in the
middle of an instruction and the terminator is always at the end of an
epilog the range check will function correctly. This does mean, however,
that the "at end" optimization (where an epilog unwind code can be
elided if the last epilog is at the end of the function) can only be
used if the terminator is 1-byte long.
One other complication with the implementation is that the unwind table
for a function is emitted during streaming, however we can't calculate
the distance between an epilog and the end of the function at that time
as layout hasn't been completed yet (thus some instructions may be
relaxed). To work around this, epilog unwind codes are emitted via a
fixup. This also means that we can't pre-emptively downgrade a function
to Unwind v1 if one of these offsets is too large, so instead we raise
an error (but I've passed through the location information, so the user
will know which of their functions is problematic).
|
|
Register assembly printer passes in the pass registry.
This makes it possible to use `llc -start-before=<target>-asm-printer ...` in tests.
Adds a `char &ID` parameter to the AssemblyPrinter constructor to allow
targets to use the `INITIALIZE_PASS` macros and register the pass in the
pass registry. This currently has a default parameter so it won't break
any targets that have not been updated.
|
|
Suppress EGPR/NDD instructions for relocations to avoid APX relocation
types emitted. This is to keep backward compatibility with old version
of linkers without APX support. The use case is to try APX features with
LLVM + old built-in linker on RHEL9 OS which is expected to be EOL in
2032.
If there are APX relocation types, the old version of linkers would
raise "unsupported relocation type" error. Example:
```
$ llvm-mc -filetype=obj -o got.o -triple=x86_64-unknown-linux got.s
$ ld got.o -o got.exe
ld: got.o: unsupported relocation type 0x2b
...
$ cat got.s
...
movq foo@GOTPCREL(%rip), %r16
$ llvm-objdump -dr got.o
...
1: d5 48 8b 05 00 00 00 00 movq (%rip), %r16
0000000000000005: R_X86_64_CODE_4_GOTPCRELX foo-0x4
```
|
|
Replace "concept based polymorphism" with simpler PImpl idiom.
This pursues two goals:
* Enforce static type checking. Previously, target implementations hid
base class methods and type checking was impossible. Now that they
override the methods, the compiler will complain on mismatched
signatures.
* Make the code easier to navigate. Previously, if you asked your
favorite LSP server to show a method (e.g. `getInstructionCost()`), it
would show you methods from `TTI`, `TTI::Concept`, `TTI::Model`,
`TTIImplBase`, and target overrides. Now it is two less :)
There are three commits to hopefully simplify the review.
The first commit removes `TTI::Model`. This is done by deriving
`TargetTransformInfoImplBase` from `TTI::Concept`. This is possible
because they implement the same set of interfaces with identical
signatures.
The first commit makes `TargetTransformImplBase` polymorphic, which
means all derived classes should `override` its methods. This is done in
second commit to make the first one smaller. It appeared infeasible to
extract this into a separate PR because the first commit landed
separately would result in tons of `-Woverloaded-virtual` warnings (and
break `-Werror` builds).
The third commit eliminates `TTI::Concept` by merging it with the only
derived class `TargetTransformImplBase`. This commit could be extracted
into a separate PR, but it touches the same lines in
`TargetTransformInfoImpl.h` (removes `override` added by the second
commit and adds `virtual`), so I thought it may make sense to land these
two commits together.
Pull Request: https://github.com/llvm/llvm-project/pull/136674
|
|
targets that aren't `catchret` instructions (#129953)
This change splits out the renaming and comment updates from #129612 as a non-functional change.
|
|
The createSIMachineScheduler & createPostMachineScheduler
target hooks are currently placed in the PassConfig interface.
Moving it out to TargetMachine so that both legacy and
the new pass manager can effectively use them.
|
|
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasis its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.
cc @arsenm @aeubanks
|
|
Identified with misc-include-cleaner.
|
|
Switch LLVMInitialize* functions to new the symbol visibility macros
that will work for windows.
This is part of the work to enable LLVM_BUILD_LLVM_DYLIB and plugins on
windows.
|
|
- [x] Add `clearSubtargetInfo` API to TargetMachine and each backend to
make it possible to release memory used in each backend's
`SubtargetInfo` map if needed. Keep this API as `protected` so that it
will be used with precautions.
|
|
|
|
Generalize and improve some target-specific code that emits traps after
stack protector failure in SelectionDAG & GlobalIsel.
|
|
(#106820)
This allows it to show up in -print-before/after-all
|
|
[CodeGen] Change the prototype of regalloc filter function
Change the prototype of the filter function so that we can
filter not just by RegClass. We need to implement more
complicated filter based upon some other info associated
with each register.
Patch provided by: Gang Chen (gangc@amd.com)
|
|
For windows __security_check_cookie call gets call everytime function is return without fixup. Since this function is defined in runtime library, it incures cost of call in dll which simply does comparison and returns most time. With Fixup, We selective move to call in DLL only if comparison fails.
|
|
This allows tested passes to depend on the AMX model in the function
info. Preparatory work for to adopt #94358 for other AMX passes.
|
|
- Fix build with `EXPENSIVE_CHECKS`
- Remove unused `PassName::ID` to resolve warning
- Mark `~SelectionDAGISel` virtual so AArch64 backend can work properly
|
|
This reverts commit de37c06f01772e02465ccc9f538894c76d89a7a1 to
de37c06f01772e02465ccc9f538894c76d89a7a1
It still breaks EXPENSIVE_CHECKS build. Sorry.
|
|
Port selection dag isel to new pass manager.
Only `AMDGPU` and `X86` support new pass version. `-verify-machineinstrs` in new pass manager belongs to verify instrumentation, it is enabled by default.
|
|
* Add support for G_LOAD from G_CONSTANT_POOL on X86 and X64
* Add X86GlobalBaseRegPass to handle base register initialization for
X86.
* Fix vector type legalization for G_STORE and G_LOAD as well as enable
scalarization for them.
* Custom lower G_BUILD_VECTOR into G_LOAD from G_CONSTANT_POOL.
|
|
Is64Bit flag. NFC. (#87479)
Matches what most other targets do and makes it easier to specify code model based off other triple settings in the future.
|
|
Port the `atomicexpand` pass to the new Pass Manager.
Fixes #64559
|
|
If the passes aren't registered, they don't show up in print-after-all.
|
|
32-bit x86 doesn't have an appropriate relocation type we can use to
elide the RTTI proxies, so we need to emit them. This would previously
cause crashes when using the relative vtable ABI for 32-bit x86.
|
|
RFC: https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031
APX introduces EGPR, NDD and NF instructions. In addition to compressing
EVEX encoded AVX512 instructions into VEX encoding, we also have several
more possible optimizations.
a. Promoted instruction (EVEX space) -> pre-promotion instruction (legacy space)
b. NDD (EVEX space) -> non-NDD (legacy space)
c. NF_ND (EVEX space) -> NF (EVEX space)
The first two types of compression can usually reduce code size, while
the third type of compression can help hardware decode although the
instruction length remains unchanged.
So we do the renaming for the upcoming APX optimizations.
BTW, I clang-format the code in X86CompressEVEX.cpp,
X86CompressEVEXTablesEmitter.cpp.
This patch also extracts the NFC in #77065 into a separate commit.
|
|
In https://reviews.llvm.org/D125075, we switched to use
FastPreTileConfig in O0 and abandoned X86PreAMXConfigPass.
we can remove related code of X86PreAMXConfigPass safely.
|
|
This is an attempt at rebooting https://reviews.llvm.org/D28990
I've included AutoUpgrade changes to modify the data layout to satisfy the compatible layout check. But this does mean alloca, loads, stores, etc in old IR will automatically get this new alignment.
This should fix PR46320.
Reviewed By: echristo, rnk, tmgross
Differential Revision: https://reviews.llvm.org/D86310
|
|
(#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
|
|
|
|
broadcast folds
This patch analyzes AVX512 instructions for full vector width folded loads from the constant pool and attempts to determine if it can be replaced with a smaller broadcast folded variant. Typically the broadcast opportunities were missed by type-width mismatches or mulituse limitations which have been removed in later passes.
As well as introducing broadcast fold tables (which can hopefully be extended/automated in the future), this also handles mismatches in the AND/ANDN/OR/XOR/TERNLOG type-widths, catching additional missed opportunities.
This is patch is pulled from the ongoing work based on D150143, but without removing the existing DAG constant broadcast lowering code - this patch is currently a late stage cleanup only.
The intention is to add additional broadcast/extension handling of constants in future patches, but it turned out that AVX512 broadcast handling was the easiest to start with.
Differential Revision: https://reviews.llvm.org/D150526
|
|
KCFI machine function passes transform indirect calls with a
cfi-type attribute into architecture-specific type checks bundled
together with the calls. Instead of having a separate pass for each
architecture, add a generic machine function pass for KCFI and
move the architecture-specific code that emits the actual check to
TargetLowering. This avoids unnecessary duplication and makes it
easier to add KCFI support to other architectures.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D149915
|
|
Fix some bugs and reland e4c1dfed38370b4 and 614c63bec6d67c.
1. Run argument stack rebase pass before the reserved physical register
is finalized.
2. Add LEA pseudo instruction to prevent the instruction being
eliminated.
3. Don't support X32.
|
|
This reverts commit e4c1dfed38370b4933f05c8e24b1d77df56b526c.
|
|
The base pointer register is reserved by compiler when there is
dynamic size alloca and stack realign in a function. However the
base pointer register is not defined in X86 ABI, so user can use
this register in inline assembly. The inline assembly would
clobber base pointer register without being awared by user. This
patch is to create extra prolog to save the stack pointer to a
scratch register and use this register to reference argument from
stack. For some calling convention (e.g. regcall), there may be
few scratch register.
Below is the example code for such case.
```
extern int bar(void *p);
long long foo(size_t size, char c, int id) {
__attribute__((__aligned__(64))) int a;
char *p = (char *)alloca(size);
asm volatile ("nop"::"S"(405):);
asm volatile ("movl %0, %1"::"r"(id), "m"(a):);
p[2] = 8;
memset(p, c, size);
return bar(p);
}
```
And below prolog/epilog will be emit for this case.
```
leal 4(%esp), %ebx
.cfi_def_cfa %ebx, 0
andl $-128, %esp
pushl -4(%ebx)
...
leal 4(%ebx), %esp
.cfi_def_cfa %esp, 4
```
Differential Revision: https://reviews.llvm.org/D145650
|
|
There are a variety of cases where we want more control over the exact
instruction emitted. This commit creates a new pass to fixup
instructions after the DAG has been lowered. The pass is only meant to
replace instructions that are guranteed to be interchangable, not to
do analysis for special cases.
Handling these instruction changes in in X86ISelLowering of
X86ISelDAGToDAG isn't ideal, as its liable to either break existing
patterns that expected a certain instruction or generate infinite
loops.
As well, operating as the MachineInstruction level allows us to access
scheduling/code size information for making the decisions.
Currently only implements `{v}permilps` -> `{v}shufps/{v}shufd` but
more transforms can be added.
Differential Revision: https://reviews.llvm.org/D143787
|
|
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
|
|
This reverts commit ed01de67433174d3157e9d239d59dd465d52c6a5.
|
|
Currently default simd alignment is specified by Clang specific TargetInfo
class. This class cannot be reused for LLVM Flang. If we move the default
alignment field into TargetMachine class then we can create TargetMachine
objects and query them to find SIMD alignment.
Scope of changes:
1) Added information about maximal allowed SIMD alignment to TargetMachine
classes.
2) Removed getSimdDefaultAlign function from Clang TargetInfo class.
3) Refactored createTargetMachine function.
Reviewed By: jsjodin
Differential Revision: https://reviews.llvm.org/D138496
|
|
This fixes what I consider to be an API flaw I've tripped over
multiple times. The point this is constructed isn't well defined, so
depending on where this is first called, you can conclude different
information based on the MachineFunction. For example, the AMDGPU
implementation inspected the MachineFrameInfo on construction for the
stack objects and if the frame has calls. This kind of worked in
SelectionDAG which visited all allocas up front, but broke in
GlobalISel which hasn't visited any of the IR when arguments are
lowered.
I've run into similar problems before with the MIR parser and trying
to make use of other MachineFunction fields, so I think it's best to
just categorically disallow dependency on the MachineFunction state in
the constructor and to always construct this at the same time as the
MachineFunction itself.
A missing feature I still could use is a way to access an custom
analysis pass on the IR here.
|
|
Follow a similar pattern as AMDGPUDAGToDAGISel's constructor so that we
can use INITIALIZE_PASS to register a pass. This allows for more fine
grain testability of SelectionDAGISel.
Link: https://github.com/llvm/llvm-project/issues/59538
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D140323
|
|
|
|
|
|
|
|
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a
forward-edge control flow integrity scheme for indirect calls. It
uses a !kcfi_type metadata node to attach a type identifier for each
function and injects verification code before indirect calls.
Unlike the current CFI schemes implemented in LLVM, KCFI does not
require LTO, does not alter function references to point to a jump
table, and never breaks function address equality. KCFI is intended
to be used in low-level code, such as operating system kernels,
where the existing schemes can cause undue complications because
of the aforementioned properties. However, unlike the existing
schemes, KCFI is limited to validating only function pointers and is
not compatible with executable-only memory.
KCFI does not provide runtime support, but always traps when a
type mismatch is encountered. Users of the scheme are expected
to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi`
operand bundle to indirect calls, and LLVM lowers this to a
known architecture-specific sequence of instructions for each
callsite to make runtime patching easier for users who require this
functionality.
A KCFI type identifier is a 32-bit constant produced by taking the
lower half of xxHash64 from a C++ mangled typename. If a program
contains indirect calls to assembly functions, they must be
manually annotated with the expected type identifiers to prevent
errors. To make this easier, Clang generates a weak SHN_ABS
`__kcfi_typeid_<function>` symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one C translation unit linked into the program
takes the function address. For example on AArch64, we might have
the following code:
```
.c:
int f(void);
int (*p)(void) = f;
p();
.s:
.4byte __kcfi_typeid_f
.global f
f:
...
```
Note that X86 uses a different preamble format for compatibility
with Linux kernel tooling. See the comments in
`X86AsmPrinter::emitKCFITypeId` for details.
As users of KCFI may need to locate trap locations for binary
validation and error handling, LLVM can additionally emit the
locations of traps to a `.kcfi_traps` section.
Similarly to other sanitizers, KCFI checking can be disabled for a
function with a `no_sanitize("kcfi")` function attribute.
Relands 67504c95494ff05be2a613129110c9bcf17f6c13 with a fix for
32-bit builds.
Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay
Differential Revision: https://reviews.llvm.org/D119296
|