| Age | Commit message | Author | Files | Lines |
|
|
In many cases we can infer that the class object has already been realized.
|
|
- Generation
- Dispatch
|
|
1. Add a flag
2. Clean up and set up helper functions to implement later
Signed-off-by: Peter Rong <PeterRong@meta.com>
|
|
1. GenerateDirectMethodsPreconditionCheck: Move some functionality into separate functions.
Those functions will be reused if we move precondition checks into a thunk.
2. Create `DirectMethodInfo`, which will be used to manage the true implementation and its thunk.
|
|
prefix (#170616)
## TL;DR
This is a stack of PRs implementing features to expose the direct
methods ABI.
You can see the RFC, design, and discussion
[here](https://discourse.llvm.org/t/rfc-optimizing-code-size-of-objc-direct-by-exposing-function-symbols-and-moving-nil-checks-to-thunks/88866).
The following stack of four PRs completes the whole feature.
- https://github.com/llvm/llvm-project/pull/170616 **Flag
  `-fobjc-direct-precondition-thunk` set up**
- https://github.com/llvm/llvm-project/pull/170617 Code refactoring to
  ease later reviews
- https://github.com/llvm/llvm-project/pull/170618 Thunk generation
- https://github.com/llvm/llvm-project/pull/170619 Optimizations, some
  class objects can be known to be realized
## Implementation details
1. Add a flag. I used `-fobjc-direct-precondition-thunk` instead of
   `-fobjc-direct-caller-thunks`, as discussed in this PR.
2. Clean up and set up helper functions to implement later:
   a. `canMessageReceiverBeNull` / `canClassObjectBeUnrealized`: these two
      functions will help later to determine which function (the true
      implementation or the nil-check thunk) a call should be dispatched
      to. Also some formatting.
   b. `getSymbolNameForMethod` has a new argument, `includePrefixByte`,
      which allows us to drop the `\01` prefix when the flag is enabled.
   c. `usePreconditionThunk` is the single source of truth for what we
      should do. It checks not only the flag, but also whether the method
      is qualified and whether we are in the right runtime. A method for
      which `usePreconditionThunk` holds is either
      `shouldHavePreconditionThunk` or `shouldHavePreconditionInline`.
## Tests
Driver tests
---------
Signed-off-by: Peter Rong <PeterRong@meta.com>
Co-authored-by: Kyungwoo Lee <kyulee@meta.com>
|
|
with config macros (#174034)
When a PCH is compiled with macro definitions on the command line, such
as `-DCONFIG1`, an unexpected warning can occur if the macro definitions
happen to belong to an imported module's config macros. The warning may
look like the following:
```
definition of configuration macro 'CONFIG1' has no effect on the import of 'Mod1'; pass '-DCONFIG1=...' on the command line to configure the module
```
while `-DCONFIG1` is clearly on the command line when `clang` compiles
the source that uses the PCH and the module.
The reason this can happen is a combination of two things:
1. The logic that checks for config macros is not aware of any command
line macros passed through the PCH
([here](https://github.com/llvm/llvm-project/blob/7976ac990000a58a7474269a3ca95e16aed8c35b/clang/lib/Frontend/CompilerInstance.cpp#L1562)).
2. `clang` _replaces_ the predefined macros on the command line with the
predefined macros from the PCH, which does not include any builtins
([here](https://github.com/llvm/llvm-project/blob/7976ac990000a58a7474269a3ca95e16aed8c35b/clang/lib/Frontend/CompilerInstance.cpp#L679)).
This PR teaches the preprocessor to recognize the command line macro
definitions passed transitively through the PCH, so that the error check
does not miss these definitions by mistake. The config macro itself
works fine, and it is only the error check that needs fixing.
rdar://95261458
|
|
#174156 (#174489)
#174156 made all getters return `Py*` but skipped downcasting where
possible, so restore it by calling `.maybeDowncast`.
|
|
All extra state has been removed from VPWidenSelectRecipe at this point.
There's no benefit to having a separate recipe, and Select can easily be
handled by the existing VPWidenRecipe.
PR: https://github.com/llvm/llvm-project/pull/174234
|
|
scalars (#174442)
Before, we were selecting the wrong operand in cases when Scalars
contained duplicate values. Stems from #135797.
Using:
`opt -passes=slp-vectorizer -mtriple=riscv64 -mattr=+v t.ll`
```
target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64"
define void @foo(ptr noalias %A, ptr noalias %B) {
entry:
%0 = load i32, ptr %B
%add = add nsw i32 %0, 1
store i32 %add, ptr %A
%arrayidx.1 = getelementptr inbounds nuw i8, ptr %B, i64 4
%1 = load i32, ptr %arrayidx.1
%add.1 = add nsw i32 %1, 1
%arrayidx2.1 = getelementptr inbounds nuw i8, ptr %A, i64 4
store i32 %add.1, ptr %arrayidx2.1
%arrayidx.2 = getelementptr inbounds nuw i8, ptr %B, i64 8
%2 = load i32, ptr %arrayidx.2
%add.2 = add nsw i32 %2, 1
%arrayidx2.2 = getelementptr inbounds nuw i8, ptr %A, i64 8
store i32 %add.2, ptr %arrayidx2.2
%arrayidx.3 = getelementptr inbounds nuw i8, ptr %B, i64 12
%arrayidx2.3 = getelementptr inbounds nuw i8, ptr %A, i64 12
store i32 %add, ptr %arrayidx2.3
%arrayidx.4 = getelementptr inbounds nuw i8, ptr %B, i64 16
%4 = load i32, ptr %arrayidx.4
%add.4 = add nsw i32 %4, 1
%arrayidx2.4 = getelementptr inbounds nuw i8, ptr %A, i64 16
store i32 %add.4, ptr %arrayidx2.4
%arrayidx.5 = getelementptr inbounds nuw i8, ptr %B, i64 20
%5 = load i32, ptr %arrayidx.5
%add.5 = add nsw i32 %5, 1
%arrayidx2.5 = getelementptr inbounds nuw i8, ptr %A, i64 20
store i32 %add.5, ptr %arrayidx2.5
%arrayidx.6 = getelementptr inbounds nuw i8, ptr %B, i64 24
%6 = load i32, ptr %arrayidx.6
%add.6 = add nsw i32 %6, 1
%arrayidx2.6 = getelementptr inbounds nuw i8, ptr %A, i64 24
store i32 %add.6, ptr %arrayidx2.6
%arrayidx.7 = getelementptr inbounds nuw i8, ptr %B, i64 28
%7 = load i32, ptr %arrayidx.7
%add.7 = add nsw i32 %7, 1
%arrayidx2.7 = getelementptr inbounds nuw i8, ptr %A, i64 28
store i32 %add.7, ptr %arrayidx2.7
ret void
}
```
The following trace is produced; note that the wrong operand is used for
`Idx > 2`.
Before:
```
GetScalarCost(), Idx=0
UniqueValues[Idx]: %add = add nsw i32 %0, 1
Op1: %0 = load i32, ptr %B, align 4
GetScalarCost(), Idx=1
UniqueValues[Idx]: %add.1 = add nsw i32 %1, 1
Op1: %1 = load i32, ptr %arrayidx.1, align 4
GetScalarCost(), Idx=2
UniqueValues[Idx]: %add.2 = add nsw i32 %2, 1
Op1: %2 = load i32, ptr %arrayidx.2, align 4
GetScalarCost(), Idx=3
UniqueValues[Idx]: %add.4 = add nsw i32 %3, 1
Op1: %0 = load i32, ptr %B, align 4
GetScalarCost(), Idx=4
UniqueValues[Idx]: %add.5 = add nsw i32 %4, 1
Op1: %3 = load i32, ptr %arrayidx.4, align 4
GetScalarCost(), Idx=5
UniqueValues[Idx]: %add.6 = add nsw i32 %5, 1
Op1: %4 = load i32, ptr %arrayidx.5, align 4
GetScalarCost(), Idx=6
UniqueValues[Idx]: %add.7 = add nsw i32 %6, 1
Op1: %5 = load i32, ptr %arrayidx.6, align 4
```
After:
```
GetScalarCost(), Idx=0
UniqueValues[Idx]: %add = add nsw i32 %0, 1
Op1: %0 = load i32, ptr %B, align 4
GetScalarCost(), Idx=1
UniqueValues[Idx]: %add.1 = add nsw i32 %1, 1
Op1: %1 = load i32, ptr %arrayidx.1, align 4
GetScalarCost(), Idx=2
UniqueValues[Idx]: %add.2 = add nsw i32 %2, 1
Op1: %2 = load i32, ptr %arrayidx.2, align 4
GetScalarCost(), Idx=3
UniqueValues[Idx]: %add.4 = add nsw i32 %3, 1
Op1: %3 = load i32, ptr %arrayidx.4, align 4
GetScalarCost(), Idx=4
UniqueValues[Idx]: %add.5 = add nsw i32 %4, 1
Op1: %4 = load i32, ptr %arrayidx.5, align 4
GetScalarCost(), Idx=5
UniqueValues[Idx]: %add.6 = add nsw i32 %5, 1
Op1: %5 = load i32, ptr %arrayidx.6, align 4
GetScalarCost(), Idx=6
UniqueValues[Idx]: %add.7 = add nsw i32 %6, 1
Op1: %6 = load i32, ptr %arrayidx.7, align 4
```
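The mismatch in the traces above can be reproduced with a small Python sketch. This is not the SLP vectorizer code: `scalars`, `operands`, and the helper names are hypothetical stand-ins, and the sketch only assumes what the traces suggest, namely that `UniqueValues` is a de-duplicated view of `Scalars` while the operand list is still laid out like `Scalars`.

```python
def unique_in_order(values):
    """Drop repeated values while keeping first-occurrence order."""
    seen = set()
    out = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out


# Hypothetical stand-ins for the example: the 4th store reuses %add,
# so "add" appears twice in the scalar list.
scalars = ["add", "add.1", "add.2", "add", "add.4", "add.5", "add.6", "add.7"]
# One operand per entry of `scalars` (same layout, duplicates included).
operands = ["%0", "%1", "%2", "%0", "%3", "%4", "%5", "%6"]
unique_values = unique_in_order(scalars)


def operand_buggy(idx):
    # Indexes the Scalars-shaped operand list with a UniqueValues index.
    return operands[idx]


def operand_fixed(idx):
    # Maps the unique value back to its position in `scalars` first.
    return operands[scalars.index(unique_values[idx])]


buggy = [operand_buggy(i) for i in range(len(unique_values))]
fixed = [operand_fixed(i) for i in range(len(unique_values))]
print(buggy)
print(fixed)
```

In this sketch the two lists agree up to index 2 and diverge from index 3 onward, which is the same pattern as in the "Before" and "After" traces. Mapping back through `scalars.index` is only one way to restore the pairing and is not necessarily how the patch does it.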
|
|
Set LIBCXX_HAS_RT_LIB and LIBCXX_HAS_PTHREAD_LIB to OFF
to prevent POSIX dependencies from creeping in.
|
|
After initially matching stack nodes to summary we may have multiple
calls per node, e.g. in the case of indirect calls with multiple
profiled callee targets. It is useful to see all of these calls, which
will show up in the poststackupdate dot graph.
|
|
This intrinsic will be useful for implementing the
OpGroupNonUniformShuffle operation in the SPIR-V reference
---------
Signed-off-by: Domenic Nutile <domenic.nutile@gmail.com>
Co-authored-by: Jay Foad <jay.foad@gmail.com>
|
|
(#134805)" (#174483)
This reverts commit ccfb97b42174eab118a4e4222c25e986db876563.
The original change was reverted due to its unfortunate reliance on
external device library installations, which ship the bitcode from the
last ROCm release. The last attempt was 8 months ago, so hopefully the
buildbots are now caught up to a more recent build that no longer needs
the old control library.
|
|
This is the behavior of the standalone `not` binary that was not
preserved in the internal shell. Make the builtin `not` command actually
fail if the command ends with a signal rather than just a non-zero exit
code.
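The intended semantics can be sketched in Python as follows. This is an illustration, not lit's implementation: `not_exit_code` and `run_not` are hypothetical names, and the sketch relies on the POSIX behavior of `subprocess`, which reports a child killed by a signal as a negative return code.

```python
import subprocess


def not_exit_code(returncode):
    """Exit code of a `not`-style wrapper, given the child's return code."""
    if returncode < 0:
        # Child was terminated by a signal: this is a crash, not an
        # ordinary failure, so the wrapper itself must fail.
        return 1
    # Invert ordinary results: success becomes failure and vice versa.
    return 1 if returncode == 0 else 0


def run_not(argv):
    """Run `argv` and return the exit code `not argv...` would produce."""
    return not_exit_code(subprocess.run(argv).returncode)
```

Under these assumptions, a child that exits with a non-zero status makes the wrapper succeed, while a child that exits with 0 or dies from a signal makes it fail.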
Reviewers: petrhosek, ilovepi, jdenny-ornl, arichardson
Pull Request: https://github.com/llvm/llvm-project/pull/174298
|
|
This should not include targets that aren't enabled in cmake.
6ccf97674b2deaa03e271725306b18a712a56113
|
|
`Type.isinstance` (#172892)
We've been able to do `isinstance(x, Type)` for quite a while now
(since
https://github.com/llvm/llvm-project/commit/bfb1ba752655bf09b35c486f6cc9817dbedfb1bb),
so remove `Type.isinstance` and the special-casing
(`_is_integer_type`, `_is_floating_point_type`, `_is_index_type`) in
some places (and therefore support various `fp8`, `fp6`, `fp4` types).
|
|
CUF kernels are generated via gpu.launch and then outlined. The resulting
launch operation needs to have a CUDA attribute attached so that it is
distinguishable from other launch operations.
|
|
Update TextDiagnostic and SARIFDiagnostic emitFilename to use the
FileManager's makeAbsolutePath instead of directly calling
make_absolute. This fixes IO sandbox violation errors.
|
|
This PR propagates the already-configured VFS when handling chained
includes, preventing unexpected use of the real FS and sandbox
violations.
|
|
In line with a C++ standard proposal, introduce the llvm.clmul family of
intrinsics, corresponding to carry-less multiply operations. This work
builds upon 727ee7e ([APInt] Introduce carry-less multiply primitives),
and follow-up patches will introduce custom lowering on supported
targets, replacing target-specific clmul intrinsics.
Testing is done on the RISC-V target, which should be sufficient to
prove that the intrinsics work, since no RISC-V specific lowering has
been added.
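For readers unfamiliar with the operation, here is a minimal Python sketch of a carry-less multiply. The function name `clmul` and the `bits` parameter are illustrative, and the sketch assumes the result is the low `bits` bits of the carry-less product; the exact semantics of each intrinsic in the family should be checked against the LangRef.

```python
def clmul(a, b, bits=64):
    """Carry-less product of a and b, truncated to `bits` bits.

    Works like schoolbook multiplication, but partial products are
    combined with XOR instead of addition, so no carries propagate.
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    result = 0
    for i in range(bits):
        if (b >> i) & 1:
            result ^= a << i
    return result & mask


# 0b101 * 0b11 without carries: 0b101 ^ 0b1010 == 0b1111
print(bin(clmul(0b101, 0b11, bits=8)))
```

Unlike ordinary multiplication, `clmul(3, 3)` is 5 rather than 9, because the two middle partial-product bits cancel under XOR instead of carrying.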
Ref: https://isocpp.org/files/papers/P3642R3.html
Co-authored-by: Craig Topper <craig.topper@sifive.com>
|
|
`switch(X^C)` expressions can be folded to `switch(X)`. Minor
opportunity to generalize simplifications in `visitSwitchInst`
via an inverse function helper as well.
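The fold works because XOR with a constant is its own inverse: `X ^ C == K` holds exactly when `X == K ^ C`. A small Python sketch of the idea, using dictionaries as a stand-in for switch cases (the helper names are hypothetical and this is not the InstCombine code):

```python
def switch_on_xor(x, c, cases, default):
    """Original form: switch on (x ^ c)."""
    return cases.get(x ^ c, default)


def fold_cases(c, cases):
    """Apply the inverse of `^ c` (which is `^ c` again) to every case value."""
    return {k ^ c: target for k, target in cases.items()}


def switch_folded(x, folded_cases, default):
    """Folded form: switch on x directly, with rewritten case values."""
    return folded_cases.get(x, default)


c = 0x5
cases = {0: "a", 3: "b", 7: "c"}
folded = fold_cases(c, cases)

# Both forms pick the same target for every 8-bit input.
assert all(
    switch_on_xor(x, c, cases, "default") == switch_folded(x, folded, "default")
    for x in range(256)
)
```

Since `k ^ c` is a bijection on case values, distinct cases stay distinct after the rewrite, so no cases collide.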
Proof: https://alive2.llvm.org/ce/z/TMRy_3.
Fixes: https://github.com/llvm/llvm-project/issues/174255.
Fixes: https://github.com/llvm/llvm-project/issues/143368.
|
|
Since 8c4950951269ec58296afbeba14e99aef467f84d,
getCanonicalTypeUnqualified() calls getUnqualifiedType(), so there's no
point in calling that again on its return value.
|
|
During the outlining process when offloading acc regions, the body of
the compute kernel is separated from its original location and live-in
values are handled in various ways including becoming function
arguments.
However, some operations are purely synthetic and only make sense when
included with another operation (usually such operations exist to
simplify IR design). Bounds and shapes are examples where during
outlining they should be recreated inside to capture the full
information.
Therefore, introduce a new operation interface named
OutlineRematerializationOpInterface meant to be attached to such
operations. It is currently expected that all such operations are memory
effect free to ensure there are no considerations needed when moving or
cloning them into outlined regions.
The interface is attached to the following operations:
- acc.bounds (directly in TableGen)
- fir.shape (via external model)
- fir.shape_shift (via external model)
- fir.shift (via external model)
- fir.field_index (via external model)
The pass that will use this interface and associated testing will follow
in another pull request.
|
|
files (#174260)
PCHs (but also modules generated from several implicit invocations like
swiftc) previously reported a confusing diagnostic about module caches
being mismatched by subdir. This is an implementation detail of the
module machinery, and not very useful to the end user. Instead, report
this case as a configuration mismatch when the compiler can confirm that
the same module cache was passed for both the current TU and the
previously compiled products.
Ideally, each argument that could result in this error would be uniquely
reported (e.g., O3), but as a starting point, providing something more
general is strictly better than pointing the user to the module cache.
This patch also includes NFC changes that rename variables from Module
to AST and clean up formatting in related areas.
resolves: rdar://167453135
|