Age | Commit message (Collapse) | Author | Files | Lines |
|
Created using spr 1.3.4
|
|
Vectors are always tightly packed, and elements of non-byte-sized
usually do not have a well-defined (byte) offset.
Fixes https://github.com/llvm/llvm-project/issues/90695.
|
|
When parsing a 3-component triple, after we determine Arch and Env, if
the middle component is "none", treat it as OS instead of Vendor.
See:
https://discourse.llvm.org/t/rfc-baremetal-target-triple-normalization/78524
Fixes: #89582.
|
|
#80923 newly enabled multivalue and reference-types in the generic CPU.
But enabling reference-types ended up breaking up Wasm's Chromium CI
(https://chromium-review.googlesource.com/c/emscripten-releases/+/5500231)
because the way the table index is encoded is different from MVP (u32)
vs. reference-types (LEB), which caused different encodings for
`call_indirect`.
And Chromium CI's and Emscripten's minimum required node version is v16,
which does not yet support reference-types, which does not recognize
that table index encoding. reference-types is first supported in node
v17.2.
We knew the current minimum required node for Emscripten (v16) did not
support reference-types, but thought it was fine because unless you
explicitly use `__funcref` or `__externref` things would be fine, and if
you want to use them explicitly, you would have a newer node. But it
turned out it also affected the encoding of `call_indirect`.
While we are discussing the potential solutions, I will disable
reference-types to unblock the rolls.
|
|
Some additional portability warnings now issue for a test; ensure that
they are expected.
|
|
|
|
This was added in def0823f1d2db78c4a18b15084407734a45e96c5 to fix a bot
failure, specifically for the standalone build. This does not seem to be
an issue anymore, and setting the policy explicitly to OLD causes
warnings with newer CMake versions.
This patch removes setting the policy to OLD to get rid of the warning.
|
|
This patch extends the `llvm::sys::RWMutex` class to fullfill the
`Lockable` requirement to include attempted locking, by implementing a
`bool try_lock` member function.
As the name suggests, this method will try to acquire to lock in a
non-blocking fashion and release it immediately. If it managed to
acquire the lock, returns `true`, otherwise returns `false`.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
|
|
Fix a warning from some build compilers about an unused lambda capture.
There are some `if constexpr` branches in the body of the code that do
not use the capture.
|
|
Use an index variable and array indexing instead of manipulating
a temporary StringRef.
|
|
As the list of profiles grow, this will be a more efficient lookup.
Because the profile name is a prefix of the Arch string, we use
upper_bound to find the first profile that definitely comes after the
Arch string. If that isn't the first supported profile, we move back 1
profile and see if that profile is a prefix of our Arch string.
|
|
Previously we used SpecificInt_match which created a 64 bit APInt
containing all ones. This was then checked against other constants by
using APInt::isSameValue.
If the constnats have different bitwidths, APInt::isSameValue will zero
extend the constant to make them match. This means for any constant less
than 64 bits, m_AllOnes was guaranteed to fail since the zero extended
value would not match all ones.
I think would also incorrectly consider an i128 with 64 leading zeros
and 64 trailing zeros as matching m_AllOnes.
To avoid this, this patch adds a new matcher class that just calls
isAllOnesOrAllOnesSplat.
|
|
This adds instrumenting callsites to PGOInstrumentation, *if* contextual profiling is requested. The latter also enables inserting counters in the entry basic block and disables value profiling (the latter is a point in time change)
This change adds the skeleton of the contextual profiling lowering pass, just so we can introduce the flag controlling that and the API to check that. The actual lowering pass will be introduced in a subsequent patch.
(Tracking Issue: #89287, RFC referenced there)
|
|
In rare cases `SplitBlockAndInsertSimpleForLoop` in `paintOrigin`
crashes outsize iterators.
Somehow existing `SplitBlockAndInsertIfThen` do not invalidate
iterators.
|
|
The hexagon-copy-hoisting.mir test fails when run with
-verify-machineinstrs. This patch fixes this by disabling
tracksRegLiveness.
|
|
(#90518)
…Warn()
Many warning messages were being emitted unconditionally. Ensure that
all warnings are conditional on a true result from a call to
common::LanguageFeatureControl::ShouldWarn() so that it is easy for a
driver to disable them all, or, in the future, to provide per-warning
control over them.
|
|
|
|
The transformational intrinsic functions MATMUL, DOT_PRODUCT, and NORM2
all involve summing up intermediate products into accumulators. In the
constant folding library, this is done with extended precision Kahan
summation for REAL and COMPLEX arguments, but in the runtime
implementations it is not, and this leads to discrepancies between
folded results and dynamic results.
Disable the use of Kahan summation in folding to resolve these
discrepancies, but don't discard the code, in case we want to add Kahan
summation in the runtime for some or all of these intrinsic functions.
|
|
|
|
When there are one or more fatal error messages produced by the parser,
semantic analysis is not performed. But when there are messages produced
by the parser and none of them are fatal, those messages are emitted to
the user before compilation continues with semantic analysis, and any
messages produced by semantics are emitted after the messages from
parsing.
This can be confusing for the user, as the messages may no longer all be
in source file location order. It also makes it difficult to write tests
that check for both non-fatal messages from parsing as well as messages
from semantics using inline CHECK: or other expected messages in test
source code.
This patch ensures that if semantic analysis is performed, and non-fatal
messages were produced by the parser, that all the messages will be
combined and emitted in source file order.
|
|
|
|
This patch renames Tosa Div Op to IntDiv Op to align with the TOSA Spec.
<!-- Reviewable:start -->
- - -
This change is [<img src="https://reviewable.io/review_button.svg"
height="34" align="absmiddle"
alt="Reviewable"/>](https://reviewable.io/reviews/llvm/llvm-project/80047)
<!-- Reviewable:end -->
Signed-off-by: Tai Ly <tai.ly@arm.com>
|
|
|
|
Use a Python script to generate SBLanguages.h instead of piggybacking on
LLDB TableGen. This addresses Nico Weber's post-commit feedback.
|
|
|
|
|
|
In some cases masked gather is less profitable than insert-subvector of
consecutive/strided stores. SLP has this kind of analysis, but need to
improve it by adding the cost of the GEP analysis.
Also, the GEP cost estimation for masked gather is fixed.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/90737
|
|
After minbitwidth analysis, and <v>, (power_of_2 - 1 const) can be
transformed into just an <v>, (all_ones const), which can be ignored at
the cost estimation and at the codegen. x264 benchmark has this pattern.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/90739
|
|
Replace relying on the underling CallInst for looking up the called
function and its types by instead adding the called function as operand,
in line with how called functions are handled in CallInst.
Operand bundles, metadata and fast-math flags are optionally used if
there's an underlying CallInst.
This enables creating VPWidenCallRecipes without requiring an underlying
IR instruction.
|
|
|
|
Declaration checking is looking for inappropriate usage of the INTENT,
VALUE, & OPTIONAL attributes in multiple places, and some oddball cases
like ENTRY points are not checked. Centralize the check for attributes
that apply only to dummy arguments into one spot.
|
|
A sanity CHECK() in mod-file.cpp needs to allow for USE association of a
derived type that has the same name as a locally defined generic
interface.
Fixes https://github.com/llvm/llvm-project/issues/90192.
|
|
The only reason for the removed condition was that there was a
dependency for `CTTestTidyModule` on the `clang-tidy-headers` target,
which was only created under the same `NOT LLVM_INSTALL_TOOLCHAIN_ONLY`
condition. It looks like this target is not needed for
`CTTestTidyModule` to build and run, so the dependency can be removed
along with the condition.
See also https://reviews.llvm.org/D111100 for earlier discussions.
|
|
zstd excels at scaling from low-ratio-very-fast to
high-ratio-pretty-slow. Some users prioritize speed and prefer disk read
speed, while others focus on achieving the highest compression ratio
possible, similar to traditional high-ratio codecs like LZMA.
Add an optional `level` to `--compress-sections` (#84855) to cater to
these diverse needs. While we initially aimed for a one-size-fits-all
approach, this no longer seems to work.
(https://richg42.blogspot.com/2015/11/the-lossless-decompression-pareto.html)
When --compress-debug-sections is used together, make
--compress-sections take precedence since --compress-sections is usually
more specific.
Remove the level distinction between -O/-O1 and -O2 for
--compress-debug-sections=zlib for a more consistent user experience.
Pull Request: https://github.com/llvm/llvm-project/pull/90567
|
|
The function may return Z_MEM_ERROR or Z_STREAM_ERR. The former does not
have a good way of testing. The latter will be possible with a pending
change that allows setting the compression level, which will come with a
test.
|
|
(#90414)
Treat a compound operator such as |=, array subscription, sizeof, and
non-type template parameter as trivial so long as subexpressions are
also trivial.
Also treat true/false boolean literal as trivial.
|
|
Fixes #84968.
Implements the `fcntl()` function defined in the `fcntl.h` header.
|
|
It was pointed out in post commit review of
https://github.com/llvm/llvm-project/pull/90597 that the pass should
never have been run in parallel over all functions (and now other top
level operations) in the first place. The mutex used in the pass was
ineffective at preventing races since each instance of the pass would
have a different mutex.
|
|
summaries" (#90610) (#90692)
This reverts commit 2aabfc811670beb843074c765c056fff4a7b443b.
Add fixes to LLD and Gold tests missed in original change.
Co-authored-by: Jan Voung <jvoung@google.com>
|
|
function. (#90665)
This simplifies the callers.
|
|
Instead of hardcoding the 4 current profile prefixes, treat profile
selection as a fallback if we don't find "rv32" or "rv64".
Update the error message accordingly.
|
|
ROTL and ROTR can take a shift amount larger than the element size, in
which case the effective shift amount should be the shift amount modulo
the element size.
This patch adds the modulo step when the shift amount isn't known at
compile time. Without it the existing implementation would end up
shifting beyond the type size and give incorrect results.
|
|
In case of functions without a stack frame no "stack" field is
serialized into MIR which leads to isCalleeSavedInfoValid being false
when reading a MIR file back in. To fix this we should serialize
MachineFrameInfo::isCalleeSavedInfoValid() into MIR.
|
|
|
|
6e31714d249f857f15262518327b0f0c9509db72
|
|
The dependency scanner only puts top-level affecting module map files on
the command line for explicitly building a module. This is done because
any affecting child module map files should be referenced by the
top-level one, meaning listing them explicitly does not have any meaning
and only makes the command lines longer.
However, a problem arises whenever the definition of an affecting module
lives in a module map that is not top-level. Considering the rules
explained above, such module map file would not make it to the command
line. That's why 83973cf157f7850eb133a4bbfa0f8b7958bad215 started
marking the parents of an affecting module map file as affecting too.
This way, the top-level file does make it into the command line.
This can be problematic, though. On macOS, for example, the Darwin
module lives in "/usr/include/Darwin.modulemap" one of many module map
files included by "/usr/include/module.modulemap". Reporting the parent
on the command line forces explicit builds to parse all the other module
map files included by it, which is not necessary and can get expensive
in terms of file system traffic.
This patch solves that performance issue by stopping marking parent
module map files as affecting, and marking module map files as top-level
whenever they are top-level among the set of affecting files, not among
the set of all known files. This means that the top-level
"/usr/include/module.modulemap" is now not marked as affecting and
"/usr/include/Darwin.modulemap" is.
|
|
The output on eel.is has similar oddities, so I expect this was copy
pasted.
|
|
Empty ISG1 and OSG1 parts are generated for compute shader since there's
no signature for compute shader.
Fixes #88778
|
|
emulation (#89131)
This PR builds on https://github.com/llvm/llvm-project/pull/79494 with an additional path for efficient unsigned `i4 ->i8` type extension for 1D/2D operations. This will impact any i4 -> i8/i16/i32/i64 unsigned extensions as well as sitofp i4 -> f8/f16/f32/f64.
|
|
I strongly suspect nobody ever used that macro since it wasn't very well
known. Furthermore, it only affects a handful of diagnostics and I think
it makes sense to either provide them unconditionally, or to not
provided them at all.
|