Age | Commit message | Author | Files | Lines |
|
Created using spr 1.3.4
[skip ci]
|
|
(#100177)
Added a check to remove '->' if it exists.
Added a test case and updated the Release Notes.
|
|
|
|
(#101782)" (#102551)
|
|
When a function argument is annotated with the `llvm.byval` attribute,
[LLVM expects](https://llvm.org/docs/LangRef.html#parameter-attributes)
the function argument type to be an `llvm.ptr`. For example:
```
func.func (%args0 : llvm.ptr {llvm.byval = !llvm.struct<(i32)>}) {
...
}
```
Unfortunately, this makes the type conversion context-dependent, which
is something that the type conversion infrastructure (i.e.,
`LLVMTypeConverter` in this particular case) doesn't support. For
example, we may want to convert `MyType` to `llvm.struct<(i32)>` in
general, but to an `llvm.ptr` type only when it's a function argument
passed by value.
To fix this problem, this PR changes the FuncToLLVM conversion logic to
generate an `llvm.ptr` when the function argument has a `llvm.byval`
attribute. An `llvm.load` is inserted into the function to retrieve the
value expected by the argument users.
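The approach described above can be sketched outside of MLIR as a toy model. This is only an illustration under stated assumptions: `Arg`, `convert_type`, and `convert_signature` are hypothetical names, types are plain strings, and the emitted `llvm.load` text is schematic rather than the output of the real FuncToLLVM pass.

```python
from dataclasses import dataclass, field


@dataclass
class Arg:
    """Hypothetical stand-in for a function argument (not an MLIR class)."""
    name: str
    type: str
    attrs: dict = field(default_factory=dict)


def convert_type(ty):
    # Context-free conversion: the same input type always maps to the same
    # output type, which is all a plain type converter can express.
    return {"MyType": "!llvm.struct<(i32)>"}.get(ty, ty)


def convert_signature(args):
    """Convert arguments; byval arguments become pointers plus a load."""
    new_args, prologue = [], []
    for arg in args:
        converted = convert_type(arg.type)
        if "llvm.byval" in arg.attrs:
            # The context-dependent case is handled here, in the function
            # conversion logic, not in the type converter.
            new_args.append(Arg(arg.name, "!llvm.ptr", {"llvm.byval": converted}))
            # Schematic load so that users of the argument still see the value.
            prologue.append(
                f"%{arg.name}.val = llvm.load %{arg.name} : !llvm.ptr -> {converted}"
            )
        else:
            new_args.append(Arg(arg.name, converted, dict(arg.attrs)))
    return new_args, prologue


new_args, prologue = convert_signature(
    [Arg("a", "MyType", {"llvm.byval": "MyType"}), Arg("b", "MyType")]
)
for a in new_args:
    print(a.name, a.type, a.attrs)
print("\n".join(prologue))
```

The point of the sketch is that `convert_type` stays context-free, while the byval special case lives in the signature conversion.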
|
|
Typo:
`chwon` --> `chown`
Signed-off-by: Peter Jung <admin@ptr1337.dev>
|
|
|
|
|
|
https://github.com/llvm/llvm-project/commit/da8778e499d8049ac68c2e152941a38ff2bc9fb2
breaks the lowering of vector.transpose ops in which all the dimensions
are unit dimensions. This revision fixes the issue and adds a test.
---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>
|
|
|
|
This reverts commit 967185eeb85abb77bd6b6cdd2b026d5c54b7d4f3.
The problem was link dependencies; this moves `UseCtxProfile` to `Analysis`.
|
|
This adds support for the Arm NEON vector shift instructions that follow
the same pattern as x86 (handleVectorShiftIntrinsic).
VSLI is not supported because it does not follow the 2-argument pattern
expected by handleVectorShiftIntrinsic.
This patch also updates the arm64-vshift.ll MSan test that was
introduced in
https://github.com/llvm/llvm-project/commit/5d0a12d3e9b1606c36430cf908da20d19d101e04
|
|
|
|
(#100282)
Enables parallelization for the processing of DWO CUs.
|
|
|
|
Split from #100596.
Introduce the RealtimeSanitizer transform, which inserts the
rtsan_enter/exit functions at the appropriate places in an instrumented
function.
|
|
This reverts commit 28ba8a56b6fb9ec61897fa84369f46e43be94c03.
Reverting since this broke the buildbot at
https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/9352/.
|
|
Broke the build.
This reverts commit d46c26b8102dee763d72bf98341bc95b21767196.
|
|
|
|
The str_to_float conversion code doesn't need the features provided by
fenv and the dependency is creating a blocker for hand-in-hand. This
patch uses a workaround to remove this dependency.
|
|
Co-authored-by: OverMighty <its.overmighty@gmail.com>
|
|
This PR adds benchmarking for `sinf()` using the same set up as `sin()`
but with a smaller range for floats.
|
|
|
|
This reverts commit 1a6d60e0162b3ef767c87c95512dd453bf4f4746.
Broke some buildbots.
|
|
Useful with other infrastructure that consumes LLVM statistics to get an
idea of the distribution of section sizes.
The breakdown of the various section types is subject to change; this is
just an initial attempt at gathering some stats.
Example stats compiling X86ISelLowering.cpp (-g1):
```
"elf-object-writer.AllocROBytes": 308268,
"elf-object-writer.AllocRWBytes": 6240,
"elf-object-writer.AllocTextBytes": 1659203,
"elf-object-writer.DebugBytes": 3180386,
"elf-object-writer.OtherBytes": 5862,
"elf-object-writer.RelocationBytes": 2623440,
"elf-object-writer.StrtabBytes": 228599,
"elf-object-writer.SymtabBytes": 120336,
"elf-object-writer.UnwindBytes": 85216,
```
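As a rough illustration of how a consumer might turn these counters into a size distribution, here is a minimal sketch. It assumes the statistics have been saved as a JSON object (the braces are added here; the numbers are copied from the example above), and `section_shares` is a hypothetical helper, not part of LLVM.

```python
import json

# The example counters above, wrapped in braces to form valid JSON.
STATS_JSON = """
{
  "elf-object-writer.AllocROBytes": 308268,
  "elf-object-writer.AllocRWBytes": 6240,
  "elf-object-writer.AllocTextBytes": 1659203,
  "elf-object-writer.DebugBytes": 3180386,
  "elf-object-writer.OtherBytes": 5862,
  "elf-object-writer.RelocationBytes": 2623440,
  "elf-object-writer.StrtabBytes": 228599,
  "elf-object-writer.SymtabBytes": 120336,
  "elf-object-writer.UnwindBytes": 85216
}
"""


def section_shares(stats, prefix="elf-object-writer."):
    """Return ({section kind: fraction of total}, total bytes)."""
    sizes = {k[len(prefix):]: v for k, v in stats.items() if k.startswith(prefix)}
    total = sum(sizes.values())
    if total == 0:
        return {name: 0.0 for name in sizes}, 0
    return {name: size / total for name, size in sizes.items()}, total


shares, total = section_shares(json.loads(STATS_JSON))
# Print the largest contributors first.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<16} {share:6.1%}")
print("total bytes:", total)
```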
|
|
(#102198)
Support v_swap_b16 in true16 format.
Update the TableGen pattern and folding for v_mov_b16.
---------
Co-authored-by: guochen2 <guochen2@amd.com>
|
|
Didn't notice in #101338 that the instrumentation in `llvm/test/Transforms/PGOProfile/ctx-prof-use-prelink.ll` was actually incorrect.
|
|
The test fixture simplifies some of the allocation logic, and mmap-based
allocations are separated from the cache to allow for more direct cache
tests. Additionally, a couple of end-to-end tests for the cache and the
LRU algorithm are added.
|
|
|
|
This was creating a new block to insert the is.shared check, but we
can just do that in the original block.
|
|
Previously, `AmdgpuSinTwoPow_128` and others were too large for their
table cells. This PR shortens the names to `AmdSin...`.
There were also some `-` missing in the separator. This PR instead
creates the separator string using the length of the headers.
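Deriving the separator from the header can be sketched as follows. The column names and the `format_header` helper are hypothetical, and the benchmark's real table layout may differ (for example, it may build the separator per column).

```python
def format_header(headers, sep=" | "):
    """Return the header row and a dash separator of exactly the same width."""
    header_line = sep.join(headers)
    # Build the separator from the rendered header instead of hard-coding
    # a run of dashes, so it cannot drift out of sync with the columns.
    separator = "-" * len(header_line)
    return header_line, separator


header_line, separator = format_header(["Benchmark", "Cycles", "Min", "Max"])
print(header_line)
print(separator)
```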
|
|
Reverts llvm/llvm-project#102400
Causes LLVM to crash on some tests.
|
|
- add_function now adds the function in alphabetical order
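A minimal sketch of alphabetical insertion, assuming the functions are kept as a sorted list of names. This standalone `add_function` is illustrative only and does not reproduce the signature of the real one.

```python
import bisect


def add_function(functions, name):
    """Insert name into the already-sorted list, keeping it sorted."""
    if name not in functions:
        # insort finds the insertion point by binary search.
        bisect.insort(functions, name)


functions = ["abs", "labs", "qsort"]
add_function(functions, "div")
print(functions)
```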
|
|
Add custom lowering for `BR_JT` DAG nodes to the `brx.idx` PTX
instruction ([PTX ISA 9.7.13.4. Control Flow Instructions: brx.idx](https://docs.nvidia.com/cuda/parallel-thread-execution/#control-flow-instructions-brx-idx)).
Depending on the heuristics in DAG selection, `switch` statements may
now be lowered using `brx.idx`.
|
|
Flang treats arrays in the main program that are larger than 32 bytes as
having the SAVE attribute and lowers them as globals. In CUDA Fortran,
device variables are not allowed to have the SAVE attribute and should be
allocated dynamically in the main program scope.
This patch updates lowering so that CUDA Fortran device variables are not
treated as having the SAVE attribute.
|
|
This PR implements
https://github.com/lntue/llvm-project/commit/2a158426d4b90ffaa3eaecc9bc10e5aed11f1bcf
to provide better throughput benchmarking for libc `sin()` and
`__nv_sin()`.
These changes have not been tested on AMDGPU yet, only compiled.
|
|
This patch is a follow-up to #97263 that fixes ambiguous abbreviated
command resolution.
When multiple commands are resolved, instead of failing to pick a
command to run, this patch changes the resolution logic to check whether
there is a single alias match and, if so, runs the alias instead of the
other matches.
As a side effect, we no longer need to make aliases for every
substring of an alias to support abbreviated alias resolution.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
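The resolution rule described in this message can be sketched as below. This is an assumption-laden toy: `resolve` is a hypothetical function, the command and alias names are only illustrative, and the real interpreter's lookup is more involved.

```python
def resolve(prefix, commands, aliases):
    """Resolve an abbreviated command name, or return None if ambiguous."""
    names = commands | aliases
    if prefix in names:
        return prefix  # an exact match always wins
    matches = sorted(n for n in names if n.startswith(prefix))
    if len(matches) == 1:
        return matches[0]
    # Several candidates: prefer the alias, but only if exactly one alias
    # matches; otherwise the abbreviation stays ambiguous.
    alias_matches = [m for m in matches if m in aliases]
    if len(alias_matches) == 1:
        return alias_matches[0]
    return None


commands = {"settings", "script", "source"}
print(resolve("s", commands, {"step"}))        # single alias among 4 matches
print(resolve("se", commands, {"step"}))       # unique prefix match
print(resolve("s", commands, {"step", "si"}))  # two aliases: still ambiguous
```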
|
|
|
|
|
|
in TypeBit (#102481)
`TemplateTypeParmType` currently stores the depth, index, and whether a
template type parameter is a pack in a union of `CanonicalTTPTInfo` and
`TemplateTypeParmDecl*`, and only the canonical type stores the position
information. These bits can be stored for all `TemplateTypeParmTypes` in
`TypeBits` to avoid unnecessary indirection when accessing the position
information.
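To illustrate why storing the position inline avoids an indirection, here is a small bit-packing sketch. The field widths (15 bits of depth, 1 pack bit, 16 bits of index) and the `pack_position`/`unpack_position` helpers are assumptions made for this example, not the layout Clang necessarily uses.

```python
# Assumed field widths, for illustration only.
DEPTH_BITS = 15
INDEX_BITS = 16
PACK_SHIFT = DEPTH_BITS
INDEX_SHIFT = DEPTH_BITS + 1


def pack_position(depth, index, is_pack):
    """Pack (depth, index, is_pack) into a single integer bitfield."""
    if not 0 <= depth < (1 << DEPTH_BITS):
        raise ValueError("depth out of range")
    if not 0 <= index < (1 << INDEX_BITS):
        raise ValueError("index out of range")
    return depth | (int(is_pack) << PACK_SHIFT) | (index << INDEX_SHIFT)


def unpack_position(bits):
    """Read the fields straight out of the bitfield, with no pointer chase."""
    depth = bits & ((1 << DEPTH_BITS) - 1)
    is_pack = bool((bits >> PACK_SHIFT) & 1)
    index = (bits >> INDEX_SHIFT) & ((1 << INDEX_BITS) - 1)
    return depth, index, is_pack


bits = pack_position(depth=2, index=5, is_pack=True)
print(unpack_position(bits))
```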
|
|
|
|
SLoadAddresses previously held data across different functions and used
it for dominance queries of blocks in different functions. This is
not intended; clear the state at the end of the pass.
|
|
Pre-enable this optimization before allowing folds of frame
indexes into add instructions. Disables this fold when using
scratch instructions for now. I see some code size improvements
with it, but the optimization needs to be smarter about the
uses depending on the register classes.
|
|
Follow-up to #98115. For EhInputSection, RelocationScanner::scan calls
sortRels, which doesn't support the CREL iterator. We should set
supportsCrel to false to ensure that the initial_location fields in
.eh_frame FDEs are relocated.
|
|
|
|
When a local character variable with non-constant length has an
initializer, it's an error in a couple of ways (SAVE variable with
unknown size, static initializer that isn't constant due to conversion
to an unknown length). The error that f18 reports is the latter, but the
message contains a formatted representation of the initialization
expression that exposes a non-Fortran %SET_LENGTH() operation. Print the
original expression in the message instead.
|
|
An I/O statement with IOMSG= but neither ERR= nor IOSTAT= deserves a
warning to the effect that it's not useful.
|
|
The check for a structure constructor to a forward-referenced derived
type wasn't tripping for constructors in the type definition itself. Set
the forward reference flag unconditionally at the beginning of name
resolution for the type definition.
|
|
FindPolymorphicAllocatableUltimateComponent needs to be
FindPolymorphicAllocatablePotentialComponent. The current search is
missing cases where a derived type has an allocatable component whose
type has a polymorphic allocatable component.
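The difference between the two searches can be sketched with a toy model. The definitions are paraphrased from the Fortran standard as understood here (an ultimate-component walk stops at allocatable and pointer components, while a potential-component walk continues through nonpointer components, including allocatable ones); the classes and functions are hypothetical and are not f18's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DerivedType:
    name: str
    components: list = field(default_factory=list)


@dataclass
class Component:
    name: str
    type: "DerivedType | None" = None  # None models an intrinsic type
    allocatable: bool = False
    pointer: bool = False
    polymorphic: bool = False


def ultimate_components(t):
    for c in t.components:
        if c.type is None or c.allocatable or c.pointer:
            yield c  # the walk stops here; it does not look inside c.type
        else:
            yield from ultimate_components(c.type)


def potential_components(t, _seen=None):
    _seen = set() if _seen is None else _seen
    if t.name in _seen:  # guard against recursive types
        return
    _seen.add(t.name)
    for c in t.components:
        if c.pointer:
            continue
        yield c
        if c.type is not None:  # keeps going through allocatable components
            yield from potential_components(c.type, _seen)


def has_polymorphic_allocatable(components):
    return any(c.allocatable and c.polymorphic for c in components)


base = DerivedType("base", [Component("x")])
inner = DerivedType("inner", [Component("p", base, allocatable=True, polymorphic=True)])
outer = DerivedType("outer", [Component("a", inner, allocatable=True)])

print(has_polymorphic_allocatable(ultimate_components(outer)))   # misses "p"
print(has_polymorphic_allocatable(potential_components(outer)))  # finds "p"
```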
|
|
There's a numbered constraint that prohibits calls to some IEEE
arithmetic and exception procedures within the body of a DO CONCURRENT
construct. Clean up the implementation to catch missing cases.
|