Age | Commit message (Collapse) | Author | Files | Lines |
|
SMLoc itself encapsulates just a pointer, so there is no need to pass or
return it by reference.
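As a rough illustration of why passing by value is appropriate, the sketch below uses a hypothetical `Loc` class (not the real SMLoc) that wraps a single pointer. The names `Loc` and `distance` are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a source location that wraps one pointer.
class Loc {
  const char *Ptr = nullptr;

public:
  constexpr Loc() = default;
  constexpr explicit Loc(const char *P) : Ptr(P) {}
  constexpr const char *getPointer() const { return Ptr; }
  constexpr bool isValid() const { return Ptr != nullptr; }
};

// The wrapper is as cheap to copy as the pointer it holds.
static_assert(sizeof(Loc) == sizeof(const char *), "Loc is pointer-sized");
static_assert(std::is_trivially_copyable<Loc>::value, "Loc copies trivially");

// Taking Loc by value avoids an extra indirection through a reference.
inline std::ptrdiff_t distance(Loc Begin, Loc End) {
  assert(Begin.isValid() && End.isValid() && "both locations must be set");
  return End.getPointer() - Begin.getPointer();
}
```

A pointer-sized, trivially copyable type normally travels in a register, so a reference parameter would only add a level of indirection.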
|
|
|
|
Fix line ending to Unix style by running dos2unix on this file.
|
|
|
|
Link BOLTUtils against the AArch64 target to support the new option
that enables instrumentation without LSE (see #158738)
This fixes shared library builds, e.g.:
https://lab.llvm.org/staging/#/builders/220/builds/1537
Note: the link points to a collapsing builder.
|
|
|
|
This patch consolidates two implementations of runOnNewStack with
"if constexpr".
|
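The `if constexpr` consolidation mentioned in the message above can be sketched in general terms as follows. This is not the runOnNewStack code; `callThenCount` and its counter parameter are hypothetical names chosen for illustration. The point is that one template can cover both void and value-returning callables, where two overloads were needed before.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical helper: calls F(As...), then bumps a counter, and finally
// forwards the result if there is one. Requires C++17.
template <typename Fn, typename... Args>
auto callThenCount(int &Completed, Fn &&F, Args &&...As) {
  using R = std::invoke_result_t<Fn, Args...>;
  if constexpr (std::is_void_v<R>) {
    // Void case: nothing to store, so just call and count.
    std::forward<Fn>(F)(std::forward<Args>(As)...);
    ++Completed;
  } else {
    // Value case: the result must be kept alive across the bookkeeping.
    R Result = std::forward<Fn>(F)(std::forward<Args>(As)...);
    ++Completed;
    return Result;
  }
}
```

Only the branch selected at compile time is instantiated, so the `R Result` declaration never has to compile when `R` is `void`.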
|
StringTable::Iterator has a user-defined copy assignment operator, a
defaulted copy constructor, and a defaulted move constructor.
This patch makes the copy assignment operator defaulted and adds a
defaulted move assignment operator to adhere to the Rule of Five while
making the operators constexpr.
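A minimal sketch of the pattern, using a hypothetical `TableIterator` class rather than the real StringTable::Iterator (the member names here are assumptions):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical iterator over a string table, identified by a byte offset.
class TableIterator {
  const char *Table = nullptr;
  std::size_t Offset = 0;

public:
  constexpr TableIterator() = default;
  constexpr TableIterator(const char *T, std::size_t O) : Table(T), Offset(O) {}

  // All five special members are defaulted.
  constexpr TableIterator(const TableIterator &) = default;
  constexpr TableIterator(TableIterator &&) = default;
  constexpr TableIterator &operator=(const TableIterator &) = default;
  constexpr TableIterator &operator=(TableIterator &&) = default;
  ~TableIterator() = default;

  constexpr std::size_t offset() const { return Offset; }
};

static_assert(std::is_trivially_copy_assignable<TableIterator>::value, "");
static_assert(std::is_trivially_move_assignable<TableIterator>::value, "");

// Assignment is usable in constant expressions (C++14 and later).
constexpr std::size_t assignedOffset() {
  TableIterator A(nullptr, 4), B;
  B = A;
  return B.offset();
}
static_assert(assignedOffset() == 4, "copy assignment works at compile time");
```

Defaulting the assignment operators keeps the type trivially assignable, which a user-defined operator would not.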
|
|
SmallPtrSetIterator and its base class SmallPtrSetIteratorImpl
collectively have the following responsibilities:
- type-safe user-facing iterator interface
- type-erased iterator increment/dereference core
- DebugEpochBase via inheritance
This patch refactors the two classes so that SmallPtrSetIteratorImpl
implements everything except the type-safe user-facing interface.
Benefits:
- DebugEpochBase::HandleBase is now part of the type-erased class.
- AdvanceIfNotValid is now private in SmallPtrSetIteratorImpl.
- SmallPtrSetIterator is a very thin wrapper around
SmallPtrSetIteratorImpl and should generate very little code on its
own.
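The split between a type-erased core and a thin typed wrapper might look roughly like the sketch below. `PtrIterImpl` and `PtrIter` are hypothetical names, the storage is simplified to a plain array of slots, and the DebugEpochBase part is omitted entirely.

```cpp
#include <cassert>
#include <cstddef>

// Type-erased core: walks an array of void pointers and skips empty slots.
class PtrIterImpl {
  const void *const *Bucket;
  const void *const *End;

  // Private: only the core needs to know how to skip empty slots.
  void advanceIfNotValid() {
    while (Bucket != End && *Bucket == nullptr)
      ++Bucket;
  }

protected:
  PtrIterImpl(const void *const *B, const void *const *E) : Bucket(B), End(E) {
    advanceIfNotValid();
  }
  const void *deref() const {
    assert(Bucket != End && "dereferencing end iterator");
    return *Bucket;
  }
  void increment() {
    ++Bucket;
    advanceIfNotValid();
  }

public:
  bool operator==(const PtrIterImpl &RHS) const { return Bucket == RHS.Bucket; }
  bool operator!=(const PtrIterImpl &RHS) const { return Bucket != RHS.Bucket; }
};

// Thin typed wrapper: only casts and forwards to the core.
template <typename PtrTy> class PtrIter : public PtrIterImpl {
public:
  PtrIter(const void *const *B, const void *const *E) : PtrIterImpl(B, E) {}
  PtrTy operator*() const {
    return static_cast<PtrTy>(const_cast<void *>(deref()));
  }
  PtrIter &operator++() {
    increment();
    return *this;
  }
};
```

Because the template contains only a cast and two forwarding calls, each instantiation should add very little code of its own.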
|
|
This patch reduces code duplication by having allocateBuckets take a
larger role. Specifically, allocateBuckets now checks to see if we
need to allocate heap memory and initializes Small appropriately.
With this patch, allocateBuckets mirrors deallocateBuckets cleanly.
Both methods handle the Small mode without asserting and are
responsible for constructing and destructing LargeRep.
|
|
|
|
|
|
Fix a double assignment to a local variable and use the new
popToAPSInt() overload.
|
|
This PR changes `llvm::FileCollector` to use the `llvm::vfs::FileSystem`
API for making file paths absolute instead of using
`llvm::sys::fs::make_absolute()` directly. This matches the behavior of
the compiler on most other input files.
|
|
(#160294)
This is essentially the same patch as
116ca9522e89f1e4e02676b5bbe505e80c4d4933;
when trying to match a physreg hint, try to find a compatible physreg if
there is a subregister copy. This has the slight difference of using
getSubReg on the hint instead of getMatchingSuperReg (the other use should
also use getSubReg instead, it's faster).
At the moment this turns out to have very little effect. The adjacent code
needs better handling of subregisters, so continue adding this piecemeal.
The X86 test shows a net reduction in real instructions, plus a few new
kills.
|
|
In the important places. They are all fully covered switch statements so
we know where to add code when adding a new pointer type.
|
|
[nfc]" (#160897)
Reverts llvm/llvm-project#160765. Failures on buildbot indicate the second
assertion does not in fact hold.
|
|
Uses the existing format of the LiveRange printer, and just factors it
out so that you can do vni->dump() when debugging, or log a vni in a
debug print statement.
|
|
FreeBSD libc has a lot of symbols that are ifuncs, which makes TLI
checker believe they are not available. This change makes the tool
consider symbols with the STT_GNU_IFUNC type.
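For context, the symbol type is stored in the low four bits of the ELF `st_info` field, and STT_GNU_IFUNC has the value 10. The sketch below is not the tool's code; `symbolType` and `isCallableSymbol` are hypothetical helpers that show the kind of check involved.

```cpp
#include <cassert>
#include <cstdint>

// ELF symbol type values from the ELF specification and the GNU extension.
constexpr unsigned SymTypeFunc = 2;      // STT_FUNC
constexpr unsigned SymTypeGnuIFunc = 10; // STT_GNU_IFUNC

// The symbol type lives in the low four bits of st_info.
constexpr unsigned symbolType(std::uint8_t StInfo) { return StInfo & 0xf; }

// Hypothetical predicate: treat both plain functions and ifuncs as callable.
constexpr bool isCallableSymbol(std::uint8_t StInfo) {
  return symbolType(StInfo) == SymTypeFunc ||
         symbolType(StInfo) == SymTypeGnuIFunc;
}
```

A check that accepts only STT_FUNC would report every ifunc-backed libc symbol as missing, which is the behavior the message describes.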
|
|
|
|
Currently, RVV/SVE intrinsics are cached, but the corresponding type
construction is not. As a result, `ASTContext::getScalableVectorType`
can become a performance hotspot, since every query must run through a
long sequence of type checks and macro expansions.
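The message does not say how the patch caches the types, so the following is only a generic memoization sketch under assumed names: `ScalableTypeCache`, its key layout, and the `int` stand-in for a type handle are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <tuple>

// Hypothetical key: element width in bits, element count, field count.
using TypeKey = std::tuple<unsigned, unsigned, unsigned>;

class ScalableTypeCache {
  std::map<TypeKey, int> Cache; // int stands in for a real type handle
  unsigned SlowLookups = 0;

  // Stand-in for the long chain of checks the message describes.
  int computeSlow(const TypeKey &K) {
    ++SlowLookups;
    return static_cast<int>(std::get<0>(K) * 1000 + std::get<1>(K) * 10 +
                            std::get<2>(K));
  }

public:
  int get(unsigned ElemBits, unsigned NumElts, unsigned NumFields = 1) {
    TypeKey K(ElemBits, NumElts, NumFields);
    auto It = Cache.find(K);
    if (It != Cache.end())
      return It->second; // hit: skip the slow path
    int Result = computeSlow(K);
    Cache.emplace(K, Result);
    return Result;
  }
  unsigned slowLookups() const { return SlowLookups; }
};
```

Repeated queries for the same key then cost one map lookup instead of a full walk through the type checks.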
|
|
We should always be able to find the VNInfo in the original live
interval which corresponds to the subset we're trying to spill, and the
only cases where we have a VNInfo without a definition instruction are
if the vni is unused, or corresponds to a phi. Adjust the code structure
to explicitly check for PHIDef, and assert the stronger conditions.
|
|
We didn't have trace logging for two cases in this routine, which makes
it sometimes hard to tell what is going on. In addition to debug trace
statements, add comments to explain the logic behind the early exits
which don't mark the virtual register live. Suggestions on how to word
these more precisely are very welcome; I'm not sure I understand all the
intricacies of this code myself.
|
|
This heuristic was originally added in 40c4aa with the stated purpose of
avoiding global split on long live ranges created by MachineLICM
hoisting trivially rematerializable instructions. In the meantime,
various backends have introduced non-trivial rematerialization cases,
MachineLICM gained an explicit triviality check, and we've reworked
our APIs to match naming-wise. Let's move this heuristic back to truly
trivial remat only.
This is a functional change, though somewhat hard to hit. This change
will cause non-trivially rematerializable instructions to be globally
split more often. This is likely a good thing since non-trivial remat
may not be legal at all possible points in the live interval, but may
cost slightly more compile time.
I don't have a motivating example; I found it when reviewing the callers
of isReMaterializable(MI).
|
|
arguments (#160755)
In #153973 I added correct handling of block arguments; unfortunately,
this was gated on operations that also have results. This wasn't
intentional, and it excluded operations like functions from being
correctly processed.
|
|
It was brought up on a previous review that the CIRGenOpenACCRecipe.h
file was getting too large. I noticed that the 'dependent on template
argument' parts were actually quite small, so I extract a base class in
this patch that allows me to implement it in the .cpp file, plus
minimize the amount of code that needs instantiating.
|
|
## Problem
`RemoveDeadValues` can legally drop dead function arguments on private
`func.func` callees. But call-sites to such functions aren't fixed if
the call operation keeps its call arguments in a **segmented operand
group** (i.e., uses `AttrSizedOperandSegments`), unless the call op
implements `getArgOperandsMutable` and the RDV pass actually uses it.
## Fix
When RDV decides to drop callee function args, it should, for each
call-site that implements `CallOpInterface`, **shrink the call's
argument segment** via `getArgOperandsMutable()` using the same dead-arg
indices. This keeps both the flat operand list and the
`operand_segment_sizes` attribute in sync (that's what
`MutableOperandRange` does when bound to the segment).
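To make the invariant concrete, here is a small standalone model. It does not use MLIR; `SegmentedOp`, `eraseFromSegment`, and `dropDeadArgs` are hypothetical names. It shows why erasing through a segment-aware handle keeps the flat operand list and the segment sizes consistent.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of an op with segmented operands: one flat operand list
// plus a per-segment size table that must always sum to the list length.
struct SegmentedOp {
  std::vector<int> Operands;
  std::vector<std::size_t> SegmentSizes;

  std::size_t segmentStart(std::size_t Seg) const {
    std::size_t Start = 0;
    for (std::size_t I = 0; I < Seg; ++I)
      Start += SegmentSizes[I];
    return Start;
  }

  // Erase the operand at Index within segment Seg and shrink that segment's
  // recorded size in the same step, so the two views cannot drift apart.
  void eraseFromSegment(std::size_t Seg, std::size_t Index) {
    assert(Seg < SegmentSizes.size() && Index < SegmentSizes[Seg]);
    Operands.erase(Operands.begin() + segmentStart(Seg) + Index);
    --SegmentSizes[Seg];
  }
};

// Drop dead arguments from one segment; DeadArgs must be sorted ascending.
inline void dropDeadArgs(SegmentedOp &Op, std::size_t ArgSeg,
                         const std::vector<std::size_t> &DeadArgs) {
  // Walk backwards so earlier indices stay valid after each erase.
  for (auto It = DeadArgs.rbegin(); It != DeadArgs.rend(); ++It)
    Op.eraseFromSegment(ArgSeg, *It);
}
```

Erasing from the flat list alone would leave the size table claiming more operands than exist, which is the mismatch described in the Problem section.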
## Note
This change is a no-op for:
* call ops without segment operands (they still get their flat operands
erased via the generic path)
* call ops whose callee args weren't dropped (public, external,
non-`func.func`, unresolved symbol, etc.)
* `llvm.call`/`llvm.invoke` (RDV doesn't drop `llvm.func` args)
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
|
|
## Details:
- Added missing compound assignment operators `|=`, `&=`, `^=` to
`mlir-tblgen`
- Replaced the arithmetic operators with added assignment operators for
`BitEnum` in the transformations
- Updated related documentation
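For readers unfamiliar with the pattern, compound assignment operators on a bit enum are usually thin wrappers over the binary ones. The sketch below is hand-written, not `mlir-tblgen` output, and the `Flags` enum and `bits` helper are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit enum, similar in spirit to what a generator might emit.
enum class Flags : std::uint32_t { None = 0, A = 1, B = 2, C = 4 };

constexpr std::uint32_t bits(Flags F) { return static_cast<std::uint32_t>(F); }

constexpr Flags operator|(Flags L, Flags R) {
  return static_cast<Flags>(bits(L) | bits(R));
}
constexpr Flags operator&(Flags L, Flags R) {
  return static_cast<Flags>(bits(L) & bits(R));
}
constexpr Flags operator^(Flags L, Flags R) {
  return static_cast<Flags>(bits(L) ^ bits(R));
}

// Compound forms are defined in terms of the binary operators.
inline Flags &operator|=(Flags &L, Flags R) { return L = L | R; }
inline Flags &operator&=(Flags &L, Flags R) { return L = L & R; }
inline Flags &operator^=(Flags &L, Flags R) { return L = L ^ R; }
```

With these in place, `F = F | Flags::B` can be written as `F |= Flags::B`, which is the kind of replacement the second bullet refers to.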
## Tickets:
- Closes https://github.com/llvm/llvm-project/issues/158098
|
|
Summary:
This unifies the interface to just be a bunch of `load` and `store`
functions that optionally accept a mask / indices for gathers and
scatters with masks.
I had to rename this from `load` and `store` because it conflicts with
the other version in `op_generic`. I might just work around that with a
trait instead.
|
|
Clang on Darwin enables non-POSIX extensions by default.
This causes some macros to leak, such as HUGE from <math.h>, which
causes some conflicts with Flang symbols (but not with Flang-RT, for
now).
It also causes some Flang-RT extensions to be disabled, such as FDATE,
which checks for _POSIX_C_SOURCE. Setting _POSIX_C_SOURCE avoids these
issues. This is already being done in Flang, but it was not ported to
Flang-RT.
This also fixes check-flang-rt on Darwin, as NoArgv.FdateNotSupported
has been broken since the flang runtime was moved to flang-rt.
Fixes #82036
|
|
(#160740)
Currently `memcpy` and `memset` intrinsics map through to the library
implementations if ASan has been inited, whereas `memmove` always calls
`internal_memmove`.
This patch changes `memmove` to use the library implementation if ASan
has been inited.
|
|
|
|
Enable the generation of no-loop kernels for Fortran OpenMP code. target
teams distribute parallel do pragmas can be promoted to no-loop kernels
if the user adds the -fopenmp-assume-teams-oversubscription and
-fopenmp-assume-threads-oversubscription flags.
If the OpenMP kernel contains reduction or num_teams clauses, it is not
promoted to no-loop mode.
The global OpenMP device RTL oversubscription flags no longer force
no-loop code generation for Fortran.
|
|
That provides vastly better plots.
|
|
canCreateUndefOrPoisonForTargetNode/isGuaranteedNotToBeUndefOrPoisonForTargetNode - add X86ISD::VPERMILPV handling (#160849)
X86ISD::VPERMILPV shuffles can't create undef/poison themselves, allowing us to fold freeze(vpermilps(x,y)) -> vpermilps(freeze(x),freeze(y))
|
|
insertps(freeze(x),freeze(y),i) (#160852)
|
|
Mostly mechanical changes to add the missing field.
|
|
llvm.convert.to.fp16 and llvm.convert.from.fp16 are deprecated and no
longer used, so they do not need to be tested any more.
|
|
canCreateUndefOrPoisonForTargetNode/isGuaranteedNotToBeUndefOrPoisonForTargetNode - add X86ISD::VPERMV handling (#160845)
X86ISD::VPERMV shuffles can't create undef/poison themselves, allowing us to fold freeze(vpermps(x,y)) -> vpermps(freeze(x),freeze(y))
|
|
(#160605)
When cross-compiling the LLVM project as a whole (from llvm/), if it
cannot find presupplied tools it will create a native build environment
to build the tools it needs.
However, when doing a standalone build of clang (that is, from clang/
and linking against an existing libLLVM) this doesn't work. Instead a
_target_ binary is built which predictably then fails.
The conventional workaround for this is to build the native tools in a
separate native compile phase and pass the paths to the cross build, for
example see OpenEmbedded[1] or Nix[2]. But we can do better!
The first problem is that LLVM_USE_HOST_TOOLS is only set in the llvm/
CMakeLists.txt, so setup_host_tool() will never consider building a
native binary. This can be solved by setting LLVM_USE_HOST_TOOLS based
on CMAKE_CROSSCOMPILING in clang/CMakeLists.txt in the standalone case.
Now setup_host_tool() will try to build a native tool, but it needs
build_native_tool() from CrossCompile.cmake, so that also needs to be
included.
Finally, the native binary then fails because there's no provider for
the dependency "CONFIGURE_Clang_NATIVE", so use llvm_create_cross_target
to create the native environment.
These few lines mirror what the lldb CMakeLists.txt does in the
standalone case, so there is prior art for this.
[1]
https://git.openembedded.org/openembedded-core/tree/meta/recipes-devtools/clang/clang_git.bb?id=e18d697e92b55e57124e80234369d46575226386#n212
[2]
https://github.com/NixOS/nixpkgs/blob/3354d448f2a26117a74638957b0131ce3da9c8c4/pkgs/development/compilers/llvm/common/tblgen.nix#L54
|
|
Program itself is unused in that file, so just include the needed
headers.
|
|
On targets where f32 maximumnum is legal, but maximumnum on vectors of
smaller types is not legal (e.g. v2f16), try unrolling the vector first
as part of the expansion.
Only fall back to expanding the full maximumnum computation into
compares + selects if maximumnum on the scalar element type cannot be
supported.
|
|
canCreateUndefOrPoisonForTargetNode/isGuaranteedNotToBeUndefOrPoisonForTargetNode - add X86ISD::PSHUFB handling (#160842)
X86ISD::PSHUFB shuffles can't create undef/poison themselves, allowing us to fold freeze(pshufb(x,y)) -> pshufb(freeze(x),freeze(y))
|
|
`_LIBCPP_VERSION` (#160627)
And add some guaranteed cases (namely, for `expected`, `optional`, and
`variant`) to `is_implicit_lifetime.pass.cpp`.
It's somewhat unfortunate that `pair` and `tuple` are not guaranteed to
propagate triviality of copy/move constructors, and MSVC STL fails to do
so due to ABI compatibility. This affects the implicit-lifetime
property.
|
|
This patch makes the following updates to the `QualGroup` documentation:
✅ 1. Move to Reference section
Relocated the Qualification Working Group (QualGroup) docs from the main
index into the Reference section for better organization and
consistency.
✅ 2. Add link in GettingInvolved
Inserted a proper link to the QualGroup documentation in the
GettingInvolved sync-ups table, improving discoverability for newcomers.
✅ 3. Align structure with Security Group
Revised the documentation layout to follow the same structure pattern as
the Security Group docs, ensuring consistency across LLVM working group
references.
|
|
This flag enables the compiler to generate most of the debug
information in a separate file which can be useful for executable size
and link times. Clang already supports this flag.
I have tried to follow the logic of the clang implementation where
possible. Some functions were moved where they could be used by both
clang and flang. The `addOtherOptions` was renamed to `addDebugOptions`
to better reflect its purpose.
Clang also sets the `splitDebugFilename` field of the `DICompileUnit` in
the IR when this option is present. That part is currently missing from
this patch and will come in a follow-up PR.
|
|
permilvar(freeze(x),freeze(y)) (#160836)
|
|
vpermps(freeze(x),freeze(y)) (#160837)
|
|
pshufb(freeze(x),freeze(y)) (#160835)
|
|
Additional CSE opportunities are exposed after converting to concrete
recipes/dissolving regions and materializing various expressions. Run
CSE later, to capitalize on some of the late opportunities.
PR: https://github.com/llvm/llvm-project/pull/160572
|