Age | Commit message (Collapse) | Author | Files | Lines |
|
Some of this was needed to fix implicit conversions from MCRegister to
unsigned when calling getReg() on MCOperand for example.
The majority was done by reviewing parts of the code that dealt with
registers, converting them to MCRegister and then seeing what new
implicit conversions were created and fixing those.
There were a few places where I used MCPhysReg instead of MCRegiser for
static arrays since its uint16_t instead of unsigned.
|
|
(#107895)
This patch refactors the procedure of getting the register number from a
register name to LLVMState rather than having individual users get the
values themselves by getting a reference to the map from LLVMState. This
is primarily intended to make some downstream usage in Gematria simpler,
but also cleans up a little bit upstream by pulling the actual map
searching out and just leaving error handling to the clients.
The original getter is left to enable downstream migration in Gematria,
particularly before it gets imported into google internal.
|
|
|
|
(#82256)
…table.
All data is derived from a single table rather than being spread out
over an enum, a table and the main entry point.
This is intended as a replacement for #82092.
|
|
This patch adds a debug option to print per measurement latency and
validation counter values. This makes it easier to debug certain
transient issues that can be hard to spot just using the summary view at
the end.
I've hacked print statements into this part of the code base enough
times for debugging various things that I think it makes sense to be a
proper debug macro.
|
|
This patch adds a branch miss validation counter so that it is easy to
quantify the number of missed branches when using the loop repetition
mode.
|
|
This patch replaces --num-repetitions with --min-instructions to make it
more clear that the value refers to the minimum number of instructions
in the final assembled snippet rather than the number of repetitions of
the snippet. This patch also refactors some llvm-exegesis internal
variable names to reflect the name change.
Fixes #76890.
|
|
This patch adds support for additional types of validation counters and
also adds mappings between these new validation counter types and
physical counters on the hardware for microarchitectures that I have the
ability to test on.
|
|
|
|
This patch adds support for validation counters. Validation counters can
be used to measure events that occur during snippet execution like cache
misses to ensure that certain assumed invariants about the benchmark
actually hold. Validation counters are setup within a perf event group,
so are turned on and off at exactly the same time as the "group leader"
counter that measures the desired value.
|
|
These source files do not use StringMap.h.
|
|
When llvm-exegesis was first introduced, it only supported benchmarking
individual instructions, hence the name for the data structure storing
the data corresponding to a benchmark being called InstructionBenchmark
made sense. However, now that benchmarking arbitrary snippets is
supported, InstructionBenchmark doesn't correspond to a single
instruction. This patch refactors InstructionBenchmark to be called
Benchmark to clean up this little bit of technical debt.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D146884
|
|
https://lab.llvm.org/staging/#/builders/235/builds/1121
|
|
https://lab.llvm.org/staging/#/builders/235/builds/1090
|
|
I'm absolutely flabbergasted by this.
I was absolutely sure this worked.
But apparently not.
When outputting to the file, we'd write a single `InstructionBenchmark`
to it, and then close the file. And for the next `InstructionBenchmark`,
we'd reopen the file, truncating it in process,
and thus the first `InstructionBenchmark` would be gone...
|
|
We know that the original reference from which we've built these maps
will persist, so we do not need to copy the names.
|
|
|
|
runtime)
This reducer runtime of
```
$ ./bin/llvm-exegesis -mode=inverse_throughput --opcode-index=-1 --benchmarks-file=/dev/null --dump-object-to-disk=0 --measurements-print-progress --skip-measurements
```
from 3m44s to 2m17s, aka -38%.
But more importantly, we go from 400 *million* memory allocations
down to just 100 million, aka -75%.
But really, the big missing thing is doing everything in a single thread.
Sure, we can't do anything when measuring, but when we are not measuring,
we should just prepare (codegen) everything via all threads.
That should parallelize quite well.
|
|
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
With Mips fixes.
This reverts commit 7daf60e3440b22b79084bb325d823aa3ad8df0f3.
|
|
Breaks MIPS compile.
This reverts commit cc61c822e05c51e494c50d1e72f963750116ef08.
|
|
We were using the native triple to parse the benchmarks. Use the triple
from the benchmarks file.
Right now this still only allows analyzing files produced by the current
target until D133605 is in.
This also makes the `Analysis` class much less ad-hoc.
Differential Revision: https://reviews.llvm.org/D133697
|
|
These tests access private symbols in the backends, so they cannot link
against libLLVM.so and must be statically linked. Linking these tests
can be slow and with debug builds the resulting binaries use a lot of
disk space.
By merging them into a single test binary means we now only need to
statically link 1 test instead of 6, which helps reduce the build
times and saves disk space.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D106464
|
|
<string> is currently the highest impact header in a clang+llvm build:
https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html
One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps.
This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code.
Differential Revision: https://reviews.llvm.org/D103888
|
|
Add the `IsText` argument to `GetFile` and `GetFileOrSTDIN` which will help z/OS distinguish between text and binary correctly. This is an extension to [this patch](https://reviews.llvm.org/D97785)
Reviewed By: abhina.sreeskantharajan, amccarth
Differential Revision: https://reviews.llvm.org/D100488
|
|
instead of OF_Text
Problem:
On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable.
Solution:
This patch adds two new flags
- OF_CRLF which indicates that CRLF translation is used.
- OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation.
Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF.
So this is the behaviour per platform with my patch:
z/OS:
OF_None: open in binary mode
OF_Text : open in text mode
OF_TextWithCRLF: open in text mode
Windows:
OF_None: open file with no carriage return
OF_Text: open file with no carriage return
OF_TextWithCRLF: open file with carriage return
The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set.
```
if (Flags & OF_CRLF)
CrtOpenFlags |= _O_TEXT;
```
These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows.
./llvm/lib/Support/raw_ostream.cpp
./llvm/lib/TableGen/Main.cpp
./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
./llvm/unittests/Support/Path.cpp
./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
./clang/lib/Frontend/CompilerInstance.cpp
./clang/lib/Driver/Driver.cpp
./clang/lib/Driver/ToolChains/Clang.cpp
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99426
|
|
As mentioned in TODO comment, casting double to float causes NaNs to change bits.
To avoid the change, this patch adds support for single-floating-point immediate value on MachineCode.
Patch by Yuta Saito.
Differential Revision: https://reviews.llvm.org/D77384
|
|
Summary: Second patch: in the lib.
Reviewers: gchatelet
Subscribers: nemanjai, tschuett, MaskRay, mgrang, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68692
llvm-svn: 374158
|
|
Summary: And rename to exegesis::Failure, as it's used everytwhere.
Reviewers: gchatelet
Subscribers: tschuett, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68217
llvm-svn: 373209
|
|
F_{None,Text,Append} are kept for compatibility since r334221.
llvm-svn: 367800
|
|
This *APPEARS* to fix a *very* infuriating issue of Yaml's being corrupted,
partially written, truncated. Or at least i'm not seeing the issue
on a new benchmark sweep.
The issue is somewhat rare, happens maybe once in 1000 benchmarks.
Which means there are up to hundreds of broken benchmarks
for a full x86 sweep in a single mode.
llvm-svn: 360124
|
|
This reverts commit rL358161.
This patch have broken the test:
llvm/test/tools/llvm-exegesis/X86/uops-CMOV16rm-noreg.s
llvm-svn: 358199
|
|
llvm-svn: 358161
|
|
register (PR41448)
Summary:
A *lot* of instructions have this special register.
It seems this never really worked, but i finally noticed it only
because it happened to break for `CMOV16rm` instruction.
We serialized that register as "" (empty string), which is naturally
'ignored' during deserialization, so we re-create a `MCInst` with
too few operands.
And when we then happened to try to resolve variant sched class
for this mis-serialized instruction, and the variant predicate
tried to read an operand that was out of bounds since we got less operands,
we crashed.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=41448 | PR41448 ]].
Reviewers: craig.topper, courbet
Reviewed By: courbet
Subscribers: tschuett, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60517
llvm-svn: 358153
|
|
llvm-svn: 358079
|
|
error strings
llvm-svn: 358077
|
|
Investigating https://bugs.llvm.org/show_bug.cgi?id=41448
llvm-svn: 358076
|
|
(YamlContext::getRegNo())
Summary:
Together with the previous patch, it's an -90% improvement,
or roughly -96% improvement if you look starting with rL347204
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-bew.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-bew.html' (9 runs):
1483.18 msec task-clock # 0.999 CPUs utilized ( +- 0.10% )
68 context-switches # 46.085 M/sec ( +- 22.62% )
0 cpu-migrations # 0.000 K/sec
11641 page-faults # 7850.880 M/sec ( +- 0.62% )
5943246799 cycles # 4008184.428 GHz ( +- 0.10% ) (83.28%)
442869514 stalled-cycles-frontend # 7.45% frontend cycles idle ( +- 0.41% ) (83.29%)
1443375663 stalled-cycles-backend # 24.29% backend cycles idle ( +- 0.47% ) (33.43%)
7714006752 instructions # 1.30 insn per cycle
# 0.19 stalled cycles per insn ( +- 0.07% ) (50.17%)
1977242936 branches # 1333472193.855 M/sec ( +- 0.07% ) (66.79%)
32624220 branch-misses # 1.65% of all branches ( +- 0.18% ) (83.34%)
1.48438 +- 0.00143 seconds time elapsed ( +- 0.10% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-newer.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-newer.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-newer.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-newer.html' (9 runs):
963.28 msec task-clock # 0.999 CPUs utilized ( +- 0.37% )
12 context-switches # 12.695 M/sec ( +- 52.79% )
0 cpu-migrations # 0.000 K/sec
11599 page-faults # 12046.971 M/sec ( +- 0.59% )
3860122322 cycles # 4009359.596 GHz ( +- 0.37% ) (83.19%)
380300669 stalled-cycles-frontend # 9.85% frontend cycles idle ( +- 0.34% ) (83.30%)
1071910340 stalled-cycles-backend # 27.77% backend cycles idle ( +- 1.30% ) (33.51%)
4773418224 instructions # 1.24 insn per cycle
# 0.22 stalled cycles per insn ( +- 0.15% ) (50.17%)
1106990316 branches # 1149787979.919 M/sec ( +- 0.11% ) (66.80%)
23632231 branch-misses # 2.13% of all branches ( +- 0.18% ) (83.33%)
0.96389 +- 0.00356 seconds time elapsed ( +- 0.37% )
```
```
$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-bew.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-newer.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-old.html
```
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57658
llvm-svn: 353025
|
|
(YamlContext::getInstrOpcode())
Summary:
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (9 runs):
9465.46 msec task-clock # 1.000 CPUs utilized ( +- 0.05% )
60 context-switches # 6.363 M/sec ( +- 79.45% )
0 cpu-migrations # 0.000 K/sec
11364 page-faults # 1200.697 M/sec ( +- 0.60% )
37935623543 cycles # 4008083.912 GHz ( +- 0.05% ) (83.32%)
2371625356 stalled-cycles-frontend # 6.25% frontend cycles idle ( +- 0.37% ) (83.32%)
8476077875 stalled-cycles-backend # 22.34% backend cycles idle ( +- 0.18% ) (33.36%)
41822439158 instructions # 1.10 insn per cycle
# 0.20 stalled cycles per insn ( +- 0.02% ) (50.03%)
11607658944 branches # 1226405861.486 M/sec ( +- 0.01% ) (66.69%)
210864633 branch-misses # 1.82% of all branches ( +- 0.06% ) (83.34%)
9.46636 +- 0.00441 seconds time elapsed ( +- 0.05% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-bew.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-bew.html' (9 runs):
1480.66 msec task-clock # 1.000 CPUs utilized ( +- 0.19% )
13 context-switches # 8.483 M/sec ( +- 83.10% )
0 cpu-migrations # 0.075 M/sec ( +-100.00% )
11596 page-faults # 7834.247 M/sec ( +- 0.59% )
5933732194 cycles # 4008977.535 GHz ( +- 0.19% ) (83.22%)
438111928 stalled-cycles-frontend # 7.38% frontend cycles idle ( +- 0.37% ) (83.25%)
1454969705 stalled-cycles-backend # 24.52% backend cycles idle ( +- 0.94% ) (33.53%)
7724218604 instructions # 1.30 insn per cycle
# 0.19 stalled cycles per insn ( +- 0.07% ) (50.14%)
1979796413 branches # 1337599858.945 M/sec ( +- 0.06% ) (66.74%)
32641638 branch-misses # 1.65% of all branches ( +- 0.18% ) (83.31%)
1.48128 +- 0.00284 seconds time elapsed ( +- 0.19% )
$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-bew.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-old.html
```
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, llvm-commits, RKSimon
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57657
llvm-svn: 353024
|
|
Summary:
... from 8.
`VALIGNDZ128rmbik XMM0 XMM0 K1 XMM3 RDI i_0x1 i_0x0 i_0x1` instruction already has 9 components.
It does not matter much in terms of performance, but avoiding allocation seems to come with low cost here..
Reviewers: courbet, gchatelet
Reviewed By: courbet
Subscribers: tschuett, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57654
llvm-svn: 353022
|
|
Summary:
This just uses the latency benchmark runner on the parallel uops snippet
generator.
Fixes PR37698.
Reviewers: gchatelet
Subscribers: tschuett, RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D57000
llvm-svn: 352632
|
|
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
|
|
Summary: Fixes PR39097.
Reviewers: gchatelet
Subscribers: llvm-commits, tschuett
Differential Revision: https://reviews.llvm.org/D54151
llvm-svn: 346328
|
|
Summary:
This allows simplifying references of llvm::foo with foo when the needs
come in the future.
Reviewers: courbet, gchatelet
Reviewed By: gchatelet
Subscribers: javed.absar, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D53455
llvm-svn: 344922
|
|
Summary: sscanf turns out to be slow for reading floating points.
Reviewers: courbet
Subscribers: tschuett, llvm-commits, RKSimon
Differential Revision: https://reviews.llvm.org/D52866
llvm-svn: 343771
|
|
non-matches (PR39102)
deserializeMCOperand - ensure that we at least match the first character of the sscanf pattern before calling
This reduces llvm-exegesis uops analysis of the instructions supported from btver2 from 5m13s to 2m1s on debug builds.
llvm-svn: 343690
|
|
Summary:
THis is a backwards-compatible change (existing files will work as
expected).
See PR39082.
Reviewers: gchatelet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52546
llvm-svn: 343108
|
|
Summary: See PR38936 for context.
Reviewers: gchatelet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52500
llvm-svn: 343081
|
|
Summary: Adds the registers initial values to the YAML output of llvm-exegesis.
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52460
llvm-svn: 342982
|
|
Reviewers: courbet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D52496
llvm-svn: 342981
|