| Age | Commit message | Author | Files | Lines |
|
|
|
This test currently fails due to insufficient
registers during allocation. Once the subreg
reload is implemented, it will begin to pass
as the partial reload helps mitigate register
pressure.
|
|
Putting back the functions that were recently deleted
as they were found unused. They are needed for
implementing subreg reload during RA.
|
|
Instead of just cloning the virtual register, this
function now creates a new virtual register derived
from a subregister class of the original value.
|
|
|
|
The AMDGPURewriteAGPRCopyMFMA pass is currently not subreg-aware.
In particular, the logic that optimizes spills into COPY
instructions assumes full register reloads. This becomes
problematic when the reload instruction partially restores
a tuple register. This patch introduces the necessary changes
to make this pass subreg-aware, for a future patch that
implements subreg reload during RA.
|
|
|
|
|
|
|
|
|
|
Currently, SGPR spill pseudo-instructions lack
an offset field to represent non-zero stack offsets.
This patch introduces an additional offset field to
SGPR spill pseudo-instructions and updates all
relevant passes that handle spill lowering to support
this new field. This field is essential for a future
patch that implements subreg reload of tuple registers
from their stack location during RA.
|
|
subreg-reload (#175581)
This preparatory patch introduces an additional argument to the target hook
loadRegFromStackSlot. This is essential for targets to handle subregister-specific
reloads in the future. See how this is used for the AMDGPU target in PR #175002.
|
|
|
|
(#174938)
When locating the join blocks of a divergent block, the algorithm relies
on pseudo-edges from the header of a reducible cycle to the cycle exits.
This was missed in the actual traversal, producing unnecessary joins
inside the reducible cycle. This caused an assert in the included test,
which expected that if a join existed in a reducible cycle for a
divergent branch outside the cycle, then it must be the header.
This fixes the reverted commit from #174117.
|
|
used. (#175648)
When `-gpu=mem:managed` is used, allocatable arrays without explicit
CUDA data attributes are implicitly treated as managed. The
`-gpu=mem:managed` flag to enable this feature is currently only
supported in `bbc`.
|
|
Renamed a function as suggested in #175664.
|
|
|
|
(#175646)
We have split the opcodes of vector floating point FMA (pseudo) instructions
by SEW since c6b7944be4dfbb1fb35301c670812726845acaa7, but forgot to populate
their `SEW` field, which is used by various search tables. This results
in incorrect pseudo instruction opcode lookups -- and, to a larger
extent, incorrect scheduling class lookups -- in llvm-mca. This patch
fixes the issue.
|
|
Currently, we report a fatal error if the user leaves CMAKE_BUILD_TYPE
blank. This was implemented in https://reviews.llvm.org/D124153 /
350bdf9227ceb, based on this RFC:
https://discourse.llvm.org/t/rfc-select-a-better-linker-by-default-or-warn-about-using-bfd/61899/1
Tom Stellard mentioned that he'd like to revisit this on Discord, and
Aiden, myself, and apparently most people on the original RFC agree, so
I'm proposing we do it. However, on the review, several folks objected
and insisted that Debug was a better default. I want to reopen the
question.
I think we've made the wrong tradeoff. I wish Debug builds worked out of
the box on most systems, but they don't, and LLVM has only gotten bigger
over the last four years, making the build scalability problems of Debug
builds worse. I think we should optimize our build configuration for new
developers, not experienced longtime contributors who are invested
enough to tweak the build to their liking.
With this PR, we emit a warning, and set the build type to Release,
which has a higher likelihood of success for first-time users. Making
the build work out of the box is very important for making LLVM
development more accessible to new contributors, so it seems worth
smoothing over this rough edge.
A separate possible improvement would be to set
LLVM_ENABLE_ASSERTIONS=ON, but that is out of scope for this PR.
|
|
This follows #171255, removing the cleanup line.
|
|
We currently encode an estimated trip count of 0 as the latch having branch probabilities 0-0. That's an invalid pair of weights. The probability of a branch is computed as the ratio of its corresponding weight to the sum of the weights. In fact, `BranchProbabilityInfo::calcMetadataWeights` will convert this to a 1-1, meaning 50% - 50%, which isn't quite what we want. To indicate the loop is never taken, we just need to initialize the exit probability to non-zero (hence, 1).
Related: https://reviews.llvm.org/D67905
Issue #147390
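The weight arithmetic described above can be illustrated with a small sketch. This is not LLVM code: `branch_probabilities` is a hypothetical helper, and the all-zero fallback to equal weights is an assumption that mirrors the 1-1 conversion the message attributes to `BranchProbabilityInfo::calcMetadataWeights`.

```python
def branch_probabilities(weights):
    """Turn branch weights into probabilities: weight / sum of weights.

    Hypothetical helper for illustration only. The all-zero fallback is an
    assumption modeled on the 1-1 conversion described in the commit message.
    """
    total = sum(weights)
    if total == 0:
        # A 0-0 pair has no defined ratio, so fall back to equal weights.
        weights = [1] * len(weights)
        total = len(weights)
    return [w / total for w in weights]


# Weights are listed as (backedge, exit) for the loop latch.
invalid_pair = branch_probabilities([0, 0])  # degrades to an even split
never_taken = branch_probabilities([0, 1])   # the exit is always taken
```

With a 0-0 pair the latch ends up looking like an even split, whereas a 0-1 pair states directly that the backedge is never taken.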
|
|
|
|
|
|
(#175572)
We only need two uses in Xqcilo load/store instructions for the base
adjustment to be profitable as compared to three uses in the base
load/store instructions.
|
|
adds support for the
`__builtin_ia32_vp2intersect_d`/`__builtin_ia32_vp2intersect_q` x86
builtins.
Part of #167765
---------
Signed-off-by: vishruth-thimmaiah <vishruththimmaiah@gmail.com>
|
|
`clang/test/CodeGen/AMDGPU/nullptr-in-different-address-spaces.cpp`
|
|
Different build environments are picking up warnings that my testing
didn't expose; turn -Werror back off.
(And also delete an unused data member that was triggering some MSVC
warnings.)
|
|
(#153595)
We are currently only using `PseudoRV32ZdinxSD/LD` for spills and
reloads when the register class is `GPRPairRegClass`. However, we can
use `LD_RV32/SD_RV32` when the `Zilsd` extension is enabled and certain
alignment requirements are met.
|
|
|
|
(#175685)
There is an issue in certain versions of LD which causes the wrong
libLTO to be used if the DYLD_LIBRARY_PATH is not normalized.
This fixes the following failures:
```
AddressSanitizer-x86_64-darwin.TestCases/Darwin.odr-lto.cpp
AddressSanitizer-x86_64h-darwin.TestCases/Darwin.odr-lto.cpp
```
https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-cmake-RA-incremental/13428/
rdar://168024431
|
|
(#175684)
PR https://github.com/llvm/llvm-project/pull/175383 broke the test
Semantics/OpenMP/linear-clause01.f90.
I disabled the problematic part of the test for now to let the builds pass.
I will file an issue for the PR author to fix the test.
|
|
Running clang-tidy on CUDA files without specifying `--cuda-host-only`
or `--cuda-device-only` would trigger an assertion failure in
`Actions.size() > 1`. See the related discussion:
https://github.com/llvm/llvm-project/pull/173699#discussion_r2649279975.
This occurred because the Clang Driver generates a single top-level
`OffloadAction` for `-fsyntax-only`, `-E`, and `-M`. This commit removes the
overly strict assertions.
Closes #173777
|
|
On some systems, backtraces contain addresses with their high bits set*.
These high bits prevent symbolication using the JIT symbol table. Since
this test is for a best-effort debugging / diagnosis tool it seems best
to remove the test until/unless we can get it passing on all systems, or
find some way to identify systems that will fail.
See discussion in https://github.com/llvm/llvm-project/pull/175537.
* Note that the test does not use PAC or pointer tagging -- the high
bits are coming from somewhere else. Possibly libunwind, but that is
just speculation.
|
|
We had been doing this manually, and this will be a lot more convenient.
|
|
The compiler emits "!need$" lines to module files only for modules
needed by the module's outermost scope, but misses dependences on other
modules that might be USE'd in inner scopes.
Fixes https://github.com/llvm/llvm-project/issues/175611.
|
|
There are six instances of Kahan's extended precision summation
algorithm in flang/flang-rt, and they share a bug: the calculation of
the correction value produces a NaN due to the subtraction Inf-Inf after
the accumulation saturates to Inf. This leads to the surprising NaN
result from SUM([Inf, 0.]).
This bug doesn't affect run-time calculation of SUM when optimization is
enabled -- lowering emits an open-coded SUM that lacks Kahan summation
-- but it does affect compilation-time folding and -O0 runtime results.
Fix the one instance of Kahan summation in the runtime, and consolidate
the other five instances in Evaluate into one new member function, also
corrected.
Fixes https://github.com/llvm/llvm-project/issues/89528.
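The failure mode can be reproduced outside flang with a minimal sketch. Both functions below are hypothetical illustrations rather than the flang implementation; in particular, resetting the correction once the running total is no longer finite is just one plausible guard, assumed here for demonstration.

```python
import math


def kahan_sum_naive(values):
    """Textbook Kahan summation; shows the Inf - Inf problem."""
    total = 0.0
    correction = 0.0
    for x in values:
        y = x - correction
        t = total + y
        # Once t is Inf, (t - total) - y evaluates Inf - Inf, which is NaN.
        correction = (t - total) - y
        total = t
    return total


def kahan_sum_guarded(values):
    """Same algorithm, but drops the correction when the total saturates.

    The isfinite guard is an assumption made for this sketch, not
    necessarily the fix applied in flang.
    """
    total = 0.0
    correction = 0.0
    for x in values:
        y = x - correction
        t = total + y
        correction = (t - total) - y if math.isfinite(t) else 0.0
        total = t
    return total


naive_result = kahan_sum_naive([math.inf, 0.0])
guarded_result = kahan_sum_guarded([math.inf, 0.0])
```

In the naive version the NaN correction from the first step poisons the second addition, so the sum of `[Inf, 0.]` becomes NaN; the guarded version keeps the expected Inf while leaving finite sums compensated as before.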
|
|
NAMELIST has no useful purpose in an interface block, but it's allowed.
Fix a crash due to our deferred handling of NAMELIST groups in the
execution part (which doesn't exist in an interface block).
Fixes https://github.com/llvm/llvm-project/issues/175207.
|
|
(#175071)
Add language to flang/docs/Extensions.md to explain why "A+(B*C)" must
round the result of the multiplication, when REAL and the -ffast-math
option is not used.
|
|
Add (void) uses of two parameters to dodge a C++ compiler warning that
has broken -Werror builds of flang since 9-28-25, and restore that
option as the default for flang builds.
|
|
TypeSystemClang::IsIntegerType (#175669)
Instead of re-implementing `Type::isIntegralType`, call it explicitly.
This means we get support for `BitIntType` out-of-the-box.
We don't use `IsIntegerType` here because we want to abide by the
language-specific notions of an integer type (which differ between C++
and C).
The slight behaviour change here is that `IsIntegerType` will now treat
complete enumerations as integers in C. This is correct according to the
C standard.
|
|
This extends the UnaryOp folder to handle plus, minus, and not
operations on constant operands.
This is in preparation for a change that will attempt to fold these
unary operations as they are generated, but this change only performs
the folding via the cir-canonicalize pass.
|
|
Introduced in commit d28daddd. `IsMulH` is only used in assert(), and
triggers unused variable warnings in non-debug builds.
|
|
Standard pass porting. Used callbacks to get MLI so we do not compute it
in the common case where we have no AMX registers. Also moved the call
to releaseMemory into runOnMachineFunction through a scope exit rather
than calling it through the pass manager so we can get consistent
behavior across both PMs. No test coverage added in this one as we also
need x86-tile-config to be able to run any tests.
|
|
The build has been broken when OMPTARGET_DEBUG is undefined.
|
|
Warned on PRs that happened to touch nearby lines.
|
|
implementation (#175560)
We were calling into `IsIntegerType` to determine the signedness of the
enum. Calling the relevant `clang::Type` API is simpler.
This shouldn't have any observable behaviour change.
We were lacking unit-test coverage for this. Added some tests that pass
before and after this change.
|
|
Merge cases calling the same helper, as suggested post-commit in
https://github.com/llvm/llvm-project/pull/174234
|
|
Standard porting. Use callbacks to get the needed analyses to make the
pass portable between Legacy/New PMs and to prevent computing anything
if we do not have any AMX registers in the function. No test coverage
for now as amx-greedy-ra.ll is the only test that references this pass
and needs pass pipeline setup in order to work which I plan on getting
to this week.
|
|
Currently, `SymbolFileNativePDB` calls several `PdbAstBuilder` methods
for side-effects to ensure the AST is populated.
This change adds new void-returning methods for `SymbolFileNativePDB` to
use as a hook instead, so that it doesn't depend on Clang-specific parts
of `PdbAstBuilder`'s interface.
This is part of the work to allow language-agnostic `PdbAstBuilder` (see
RFC:
https://discourse.llvm.org/t/rfc-lldb-make-pdbastbuilder-language-agnostic/89117)
|
|
LSan, by design, can have false negatives, making it unreliable to check
that the leak was found in the stack-allocated case:
```
==123685==Scanning STACK range 0x7ffe6e554ca0-0x7ffe6e557000.
==123685==0x7ffe6e554de0: found 0x51e0000009f0 pointing into chunk 0x51e000000000-0x51e000000c00 of size 3072.
==123685==0x7ffe6e554e30: found 0x51e000000c00 pointing into chunk 0x51e000000c00-0x51e000001668 of size 2664. <- this prevented the leak from being found
```
This has led to flakiness on the buildbots, e.g.,
https://lab.llvm.org/buildbot/#/builders/66/builds/24669
```
# | /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/lsan/TestCases/swapcontext.cpp:44:11: error: CHECK: expected string not found in input
# | // CHECK: SUMMARY: {{.*}}Sanitizer: 2664 byte(s) leaked in 1 allocation(s)
...
Failed Tests (2):
LeakSanitizer-HWAddressSanitizer-x86_64 :: TestCases/swapcontext.cpp
LeakSanitizer-Standalone-x86_64 :: TestCases/swapcontext.cpp
```
This patch fixes the issue by clearing the buffer, as suggested by
Vitaly.
|