Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
Avoids more Triple->string->Triple round trip. This
is a continuation of f137c3d592e96330e450a8fd63ef7e8877fc1908
|
|
(#156735)
Reverts llvm/llvm-project#156300
Need to look at the X86 test failures.
|
|
In the serial snippet generator and function that computes the aliasing
instructions, we don't want to include load/store instructions
to create a chain as that could make the results more unreliable.
There is a hasMemoryOperands() check, but currently that looks like a X86
way for checking for loads/stores. For AArch64 and other architectures, we
should check mayLoad() and mayStore().
|
|
by snippet generator" (#156423)
### Reland #142529 (Resolving "not all operands are initialized by
snippet generator")
Introduced changes in implementation of `randomizeTargetMCOperand()` for
AArch64 that omitting `OPERAND_SHIFT_MSL`, `OPERAND_PCREL` to an
immediate value of 264 and 8 respectively.
PS: Omitting
`MCOI::OPERAND_FIRST_TARGET/llvm:AArch64:OPERAND_IMPLICIT_IMM_0`
similarly, to value 0. It was low hanging change thus added in this PR
only.
For any future operand type of AArch64 if not initialised will exit with
error "`Unimplemented operand type: MCOI::OperandType:<#Number>`".
#### [Reland Updates]
Updated `tools/llvm-exegesis/AArch64/error-resolution.s` which caused
problem.
Test case was failing when there is uninitialised operands error coming
from secondary/consumer instruction used by exegesis in latency mode
required to chain up the assembly to ensure serial execution.
i.e. We get error message like `UMOVvi16_idx0: Not all operands were
initialized by the snippet generator for <<<any opcode other than
UMOVvi16_idx0>>> opcode.` but test case want to check like
`# UMOVvi16_idx0_latency: ---`. Thus the testcase fails.
```+ /llvm-project/build/bin/FileCheck /llvm-project/llvm/test/tools/llvm-exegesis/AArch64/error-resolution.s --check-prefix=UMOVvi16_idx0_latency
/llvm-project/llvm/test/tools/llvm-exegesis/AArch64/error-resolution.s:65:26: error: UMOVvi16_idx0_latency: expected string not found in input
# UMOVvi16_idx0_latency: ---
^
<stdin>:1:1: note: scanning from here
UMOVvi16_idx0: Not all operands were initialized by the snippet generator for LD1W_D_IMM opcode.
^
Input file: <stdin>
Check file: /llvm-project/llvm/test/tools/llvm-exegesis/AArch64/error-resolution.s
-dump-input=help explains the following input dump.
Input was:
<<<<<<
1: UMOVvi16_idx0: Not all operands were initialized by the snippet generator for LD1W_D_IMM opcode.
check:65 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
>>>>>>
--
********************
********************
Failed Tests (1):
LLVM :: tools/llvm-exegesis/AArch64/error-resolution.s
```
#### [Why it fails (only sometimes)]
Exegesis in latency mode require the generated assembly to be chained to
ensure serial execution,
For this exegesis add an additional consumer instruction for some
instruction, which is chosen via a random seed.
Thus, it randomly fails whenever there is secondary consumer instruction
(which is unsupported/throws error) added in generated assembly.
|
|
initialized by snippet generator" (#142529)"
This reverts commit 2e8ecf7d5fbb4e0d029b0baf94f57f8161b396be.
It is causing clang-aarch64-quick buildbot to fail.
(see:https://lab.llvm.org/buildbot/#/builders/65/builds/22035)
|
|
snippet generator" (#142529)
Exegesis for AArch64 arch, before this patch only handles/initialise
Immediate and Register Operands.
For opcodes requiring rest operand types exegesis exits with Error:
`"not all operands are initialised by snippet generator"`.
To resolve a given error we have to initialise required operand types.
i.e., For `"not all operands are initialised by snippet generator"` init
`OPERAND_SHIFT_MSL`, `OPERAND_PCREL`,
And For `"targets with target-specific operands should implement this"`
init `OPERAND_FIRST_TARGET`.
This PR adds support to the following opcodes:-
- OPERAND_SHIFT_MSL: `[MOVI|MVNI]_[2s|4s]_msl`.
- OPERAND_PCREL: `LDR[R|X|W|SW|D|S|Q]l`
- OPERAND_IMPLICIT_IMM_0:
`[UMOV|SMOV]v[i8|i16|i32|i64|i8to32|i8to64|i16to32|i32to64|i16to64]_idx0`
---
### [Experiment/Learnings]
Moreover, We found out we can similarly omit `OPERAND_UNKNOWN` with
immediate value of 0. This brute force fix helps us get major part of
(`~1000`) opcodes which throw un-init operands error. But, The correct
way to resolve is to introduce `OperandType` in AArch64 tablegen files
for opcode which have `OPERND_UNKNOWN` in `AArch64GenInstrInfo.inc`. And
add switch case for those `OperandType` in the
`randomizeTargetMCOperand()`.
As, side-effect to this temporary fix that we explored is listed below
system-level instructions throws `illegal instruction` i.e. for`MRS,
MSR, MSRpstatesvcrImm1, SYSLxt, SYSxt, UDF`.
This patch in `--mode=inverse_throughput` for opcodes `MRS, MSR,
MSRpstatesvcrImm1, SYSLxt, SYSxt, UDF` exits with handled error of
`snippet crashed while running: Illegal instruction`, they previously
used to exits with error `not all operands initialized by snippet
generator`.
[For completeness] Additionally, exegesis beforehand and with this patch
too, throws illegal instruction in throughput mode, for these opcodes
too (`APAS, DCPS1, DCPS2, DCPS3, HLT, HVC, SMC, STGM, STZGM`). Will look
into them later.
---
### [Summary]
Thus, Only introduced changes in implementation of
`randomizeTargetMCOperand()` for AArch64 that omitting
`OPERAND_SHIFT_MSL`, `OPERAND_PCREL` to an immediate value of 264 and 8
respectively.
PS: Omitting
`MCOI::OPERAND_FIRST_TARGET`/`llvm:AArch64:OPERAND_IMPLICIT_IMM_0`
similarly, to value 0. It was low hanging change thus added in this PR
only.
For any future operand type of AArch64 if not initialised will exit with
error `"Unimplemented operand type: MCOI::OperandType:<#Number>"`.
|
|
(#155423) (#155589)
This includes two minor fixes:
- "Codegen" has been added to the LLVM_LINK_COMPONENTS for AArch64 to
prevent a link error,
- the test case has been made less strict or fragile by not checking the
addresses.
|
|
(#155423)
I see some build bot failures:
- https://lab.llvm.org/buildbot/#/builders/76/builds/12434/
- https://lab.llvm.org/buildbot/#/builders/55/builds/16251/
Revert llvm/llvm-project#154751 while I investigate this.
|
|
Subject says it all: implement the loop iterator decrement and jump
function functions, and reserve X19 for the loop counter.
|
|
This patch fixes:
llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp:602:6: error:
unused function 'printInstructions' [-Werror,-Wunused-function]
|
|
Debug generated disassembly by passing argument
`debug-only="print-gen-assembly"` or `debug-only=preview-gen-assembly`
of exegesis call.
`--debug-only="print-gen-assembly"` debugs the whole generated assembly
snippet .
`--debug-only=preview-gen-assembly` debugs the initial 10 instructions
and ending 3 lines.
Thus, We can in glance see the initial setup code like registers setup
and instruction followed by truncated middle and finally print out the
last 3 instructions.
This helps us look into assembly that exegesis is execution in hardware,
Thus, it is simply functionally alias to separate objdump command on the
dumped object file.
|
|
These changes allow LLVM and Clang to be built with Clang targeting
Arm64EC using the MSVC linker.
Built with these options:
```
-DLLVM_ENABLE_PROJECTS="clang"
-DLLVM_HOST_TRIPLE=arm64ec-pc-windows-msvc
-DCMAKE_C_COMPILER=clang-cl.exe
-DCMAKE_C_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_CXX_COMPILER=clang-cl.exe
-DCMAKE_CXX_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_LINKER_TYPE=MSVC
```
|
|
We encountered the index of operand out of bounds crash because FCVT_D_W
lacks frm operand.
|
|
* Rename the vague `Value` to `Fill`.
* FillLen is at most 8. Making the field smaller to facilitate encoding
MCAlignFragment as a MCFragment union member.
* Replace an unreachable report_fatal_error with assert.
|
|
We should not include both linux/prctl.h and sys/prctl.h. This works
with glibc because the latter includes the former, but breaks with musl
because the latter redeclares the contents of the former, resulting in:
```
/usr/local/aarch64-linux-musl/include/sys/prctl.h:88:8: error:
redefinition of 'struct prctl_mm_map'
88 | struct prctl_mm_map {
| ^~~~~~~~~~~~
In file included from
/checkout/src/llvm-project/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp:13:
/usr/local/aarch64-linux-musl/include/linux/prctl.h:134:8: note:
previous definition of 'struct prctl_mm_map'
134 | struct prctl_mm_map {
| ^~~~~~~~~~~~
```
Fixes https://github.com/llvm/llvm-project/issues/139443.
|
|
Note that llvm::interleaved constructs a string with the elements from
a given range with a given separator.
|
|
Dynamic relocations are expensive on ELF/Linux platforms because they
are applied in userspace on process startup. Therefore, it is worth
optimizing them to make PIE and PIC dylib builds faster. In +asserts
builds (non-NDEBUG), nikic identified these schedule class name string
pointers as the leading source of dynamic relocations. [1]
This change uses llvm::StringTable and the StringToOffsetTable TableGen
helper to turn the string pointers into 32-bit offsets into a separate
character array.
The number of dynamic relocations is reduced by ~60%:
❯ llvm-readelf --dyn-relocations lib/libLLVM.so | wc -l
381376 # before
155156 # after
The test suite time is modestly affected, but I'm running on a shared
noisy workstation VM with a ton of cores:
https://gist.github.com/rnk/f38882c2fe2e63d0eb58b8fffeab69de
Testing Time: 100.88s # before
Testing Time: 78.50s. # after
Testing Time: 96.25s. # before again
I haven't used any fancy hyperfine/denoising tools, but I think the
result is clearly visible and we should ship it.
[1] https://gist.github.com/nikic/554f0a544ca15d5219788f1030f78c5a
|
|
== 0 (#143840)
This allows llvm-exegesis to skip instructions that lack scheduling
information, avoiding invalid benchmarking. e.g. `InstB` in RISC-V.
|
|
instructions fixed (#136868)" (#142382)
This reverts commit 475531b884a1a203af6367df35f1722fe2383e06. But it
does not revert the changes from commits
36850a028d149467cafd2702bc0c2587f6b71cce and
5dc3cd0ee40c00d9fb542488fa5a54ff70273112.
|
|
MCAsmLexer.h has been made a forwarder header since #134207
|
|
|
|
|
|
This patch makes llvm-exegesis emit an error when the machine function
fails in MachineVerification rather than aborting. This allows
downstream users (particularly https://github.com/google/gematria) to
handle these errors rather than having the entire process crash. This
essentially be NFC from the user perspective minus the addition of the
new error message.
|
|
|
|
This is a follow up of 3beacfa022da2a9c94e012e25bed89e8e4867ac2, which
added the PR_PAC_APIAKEY macro to resolve the build failures on older Linux
distros. However, it missed a few other definitions. This patch fixes
this issue.
|
|
In older Linux distros, PR_PAC_APIAKEY is not defined and it causes
build failures on linux arm64 platform. This patch adds the definition
of this macro if it is not defined.
|
|
fixed (#136868)
[llvm-exegesis][AArch64] Recommit: Disable pauth and ldgm as unsupported instructions.
Skipping AUT and LDGM opcode variants which currently throws "illegal
instruction".
Last pull request
[#132346](https://github.com/llvm/llvm-project/pull/132346) got reviewed
and merged but builder bot got failed. This was due to undefined
`PR_PAC_SET_ENABLED_KEYS` utilized were not defined in x86 arch,
resulting in build failure.
This is followup to merge the changes with following changes to fixup
the build failure.
Changes:
- Fixed up the problem with arch specific check for `prctl` library
import
- Defining `PR_PAC_SET_ENABLED_KEYS` if undefined.
|
|
(#134971)
…d instructions (#132346)"
This reverts commit 559540dc2738af0ab3f0b48eb4993095b8a8c627 as it has
cause build failures in llvm-clang-x86_64-gcc-ubuntu
|
|
(#132346)
[llvm-exegesis][AArch64] Disable pauth and ldgm as unsupported
instructions.
Skipping AUT and LDGM opcode variants which currently throws "illegal
instruction".
- Checking opcodes specifically for LDGM and AUT opcode instruction
variants.
- Gracefully exiting with " : Unsupported opcode:
isPointerAuth/isUncheckedAccess"
- Added corresponding test cases to check exit message.
|
|
Perf counters can be multiplexed if there are too many that need to be
scheduled on a core at the same time (and they exceed the available
PMUs). Other processes (especially system ones in certain environments,
not commonly on Desktop Linux from what I've seen) can also interfere.
This will impact the measurement fidelity as the counter is not actually
counting cycles/uops the entire time. This patch makes it so that we
error out in these cases so the user gets a visible indication things
have gone wrong rather than things failing silently.
|
|
|
|
|
|
|
|
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
|
|
Until recently checkFeature was quite slow. #130936
I was curious where we use checkFeature and noticed these. I thought we
could use hasFeature instead of going through strings.
|
|
This avoids doing a Triple -> std::string -> Triple round trip in lots
of places, now that the Module stores a Triple.
|
|
Current implementation (for AArch64) only supports the GRP32, GPR64,
FPR64/128, PPR16 and ZPR128 register classes. This adds support for
the other floating point register classes to initialize registers and avoid
the "setReg is not implemented" warning for these cases.
|
|
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.
For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.
The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
|
|
and ZPR128 reg class. (#127564)
Current implementation (for Aarch64) in llvm-exegesis only supports
GRP32 and GPR64 bit register class, thus for opcodes variants which used
FPR64/128, PPR16 and ZPR128, llvm-exegesis throws warning "setReg is not
implemented". This code will handle the above register class and
initialize the registers using appropriate base instruction class.
|
|
This fix helps to map operand memory to destination registers. If
instruction is load, we can self-alias it in case when instruction
overrides whole address register. For that we use provided scratch
memory.
|
|
LLVMExegesisRISCV should link against MC and TargetParser as well.
|
|
This patch adds initial vector extension support to RISC-V's exegesis.
The strategy here is to enumerate all RVV _pseudo_ opcodes as their MC
opcode counterparts are kind of useless under this circumstance. We also
enumerate all possible VTYPE operands in each CodeTemplate
configuration. Various of MachineFunction Passes are used for post
processing the snippets, like inserting VSETVLI instructions.
See https://llvm.org/devmtg/2024-10/slides/techtalk/Hsu-RVV-Exegesis.pdf
for more technical details.
|
|
In accordance with https://github.com/llvm/llvm-project/issues/123569
In order to keep the patch at reasonable size, this PR only covers for
the llvm subproject, unittests excluded.
|
|
Some of this was needed to fix implicit conversions from MCRegister to
unsigned when calling getReg() on MCOperand for example.
The majority was done by reviewing parts of the code that dealt with
registers, converting them to MCRegister and then seeing what new
implicit conversions were created and fixing those.
There were a few places where I used MCPhysReg instead of MCRegiser for
static arrays since its uint16_t instead of unsigned.
|
|
(#122986)
Prevents crashes trying to encode pseudo instuctions. Tested on HiFive
Premier P550.
Fixes #122974
|
|
(#122988)
|
|
(#121991)" (#122775)"
This reverts commit a39aaf35d3858a5542f532e399482c2bb0259dac and
63d3bd6d0caf8185aba49540fe2f67512fdf3a98.
Due to test failures on MacOSX.
|
|
(#121991)" (#122775)
This relands f8f8598fd886cddfd374fa43eb6d7d37d301b576
Follow up on #122371:
The problem here is a little subtle: when we dry-run the measurement
phase, we create a LLJIT instance without actually executing the
snippets. The key is, LLJIT has its own TargetMachine which uses triple
designated by LLVM_TARGET_ARCH (which is default to host). On a machine
that does not support Exegesis, the LLJIT would fail to create its
TargetMachine because llvm-exegesis don't even register the host's
target!
Putting this test into any of the target-specific folder won't help,
because it's about the host. And personally I don't really want to use
`exegesis-can-execute-<arch>` for generic tests like this -- it's too
strict as we don't actually need to execute the snippet.
My solution here is creating another test feature which is added only
when LLVM_TARGET_ARCH is supported by llvm-exegesis. This feature is
something in between `<arch>-registered-target` and
`exegesis-can-execute-<arch>`.
|
|
(#122371)
…#121991)"
This reverts commit f8f8598fd886cddfd374fa43eb6d7d37d301b576.
This breaks ARMv7 and s390x buildbot with the following message:
```
llvm-exegesis error: No available targets are compatible with triple "armv8l-unknown-linux-gnueabihf"
FileCheck error: '<stdin>' is empty.
FileCheck command line: /home/tcwg-buildbot/worker/clang-armv7-2stage/stage2/bin/FileCheck /home/tcwg-buildbot/worker/clang-armv7-2stage/llvm/llvm/test/tools/llvm-exegesis/dry-run-measurement.test
```
|