test-suite Guide
================

Quickstart
----------

1. The lit test runner is required to run the tests. You can either use one
   from an LLVM build:

   ```bash
   % <path to llvm build>/bin/llvm-lit --version
   lit 20.0.0dev
   ```

   An alternative is installing it as a Python package in a Python virtual
   environment:

   ```bash
   % python3 -m venv .venv
   % . .venv/bin/activate
   % pip install git+https://github.com/llvm/llvm-project.git#subdirectory=llvm/utils/lit
   % lit --version
   lit 20.0.0dev
   ```

   Installing the official Python release of lit in a Python virtual
   environment could also work. This will install the most recent 
   release of lit:

   ```bash
   % python3 -m venv .venv
   % . .venv/bin/activate
   % pip install lit
   % lit --version
   lit 18.1.8
   ```

   Please note that recent tests may rely on features not in the latest released lit. 
   If in doubt, try one of the previous methods.

2. Check out the `test-suite` module with:

   ```bash
   % git clone https://github.com/llvm/llvm-test-suite.git
   ```

3. Create a build directory and use CMake to configure the suite. Use the
   `CMAKE_C_COMPILER` option to specify the compiler to test (the C++ compiler
   will be inferred automatically from this). Use a cache file to choose a typical
   build configuration:

   ```bash
   % mkdir test-suite-build
   % cd test-suite-build
   % cmake -DCMAKE_C_COMPILER=<path to llvm build>/bin/clang \
           -C../llvm-test-suite/cmake/caches/O3.cmake \
           ../llvm-test-suite
   ```

**NOTE!** If you are using your own build of clang and you want to build and run
the MicroBenchmarks/XRay microbenchmarks, you need to add `compiler-rt` to your
`LLVM_ENABLE_RUNTIMES` CMake flag.

4. Build the benchmarks:

   ```text
   % make
   Scanning dependencies of target timeit-target
   [  0%] Building C object tools/CMakeFiles/timeit-target.dir/timeit.c.o
   [  0%] Linking C executable timeit-target
   ...
   ```

5. Run the tests with lit:

   ```text
   % llvm-lit -v -j 1 -o results.json .
   -- Testing: 474 tests, 1 threads --
   PASS: test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test (1 of 474)
   ********** TEST 'test-suite :: MultiSource/Applications/ALAC/decode/alacconvert-decode.test' RESULTS **********
   compile_time: 0.2192
   exec_time: 0.0462
   hash: "59620e187c6ac38b36382685ccd2b63b"
   size: 83348
   **********
   PASS: test-suite :: MultiSource/Applications/ALAC/encode/alacconvert-encode.test (2 of 474)
   ...
   ```

```{note}
  Even when you only want compile-time results you still need to run the test
  with the above `llvm-lit` command. In this case, the `results.json` file will
  contain compile time metrics only (code size, llvm stats and so on).

  This mode is enabled by setting `-DTEST_SUITE_RUN_BENCHMARKS=OFF`,
  more details [here](common_configuration_options).
```

6. Show and compare result files (optional):

   ```bash
   # Make sure pandas and scipy are installed. Prepend `sudo` if necessary.
   % pip install pandas scipy
   # Show a single result file:
   % llvm-test-suite/utils/compare.py results.json
   # Compare two result files:
   % llvm-test-suite/utils/compare.py results_a.json results_b.json
   ```


Structure
---------

The test-suite contains benchmark and test programs.  The programs come with
reference outputs so that their correctness can be checked.  The suite comes
with tools to collect metrics such as benchmark runtime, compilation time and
code size.

The test-suite is divided into several directories:

-  `SingleSource/`

   Contains test programs that are only a single source file in size.  A
   subdirectory may contain several programs.

-  `MultiSource/`

   Contains subdirectories which hold entire programs with multiple source
   files.  Large benchmarks and whole applications go here.

-  `MicroBenchmarks/`

   Programs using the [google-benchmark](https://github.com/google/benchmark)
   library. The programs define functions that are run multiple times until the
   measurement results are statistically significant.

-  `External/`

   Contains descriptions and test data for code that cannot be directly
   distributed with the test-suite. The most prominent members of this
   directory are the SPEC CPU benchmark suites.
   See [External Suites](#external-suites).

-  `Bitcode/`

   These tests are mostly written in LLVM bitcode.

-  `CTMark/`

   Contains symbolic links to other benchmarks forming a representative sample
   for compilation performance measurements.

### Benchmarks

Every program can work as a correctness test. Some programs are unsuitable for
performance measurements. Setting the `TEST_SUITE_BENCHMARKING_ONLY` CMake
option to `ON` will disable them.

The MultiSource benchmarks consist of the following apps and benchmarks:

| MultiSource          | Language  | Application Area              | Remark               |
|----------------------|-----------|-------------------------------|----------------------|
| 7zip                 |  C/C++    | Compression/Decompression     |                      |
| ASCI_Purple          |  C        | SMG2000 benchmark and solver  | Memory intensive app |
| ASC_Sequoia          |  C        | Simulation and solver         |                      |
| BitBench             |  C        | uudecode/uuencode utility     | Bit Stream benchmark for functional compilers |
| Bullet               |  C++      | Bullet 2.75 physics engine    |                      |
| DOE-ProxyApps-C++    |  C++      | HPC/scientific apps           | Small applications, representative of our larger DOE workloads |
| DOE-ProxyApps-C      |  C        | HPC/scientific apps           | "                    |
| Fhourstones          |  C        | Game/solver                   | Integer benchmark that efficiently solves positions in the game of Connect-4 |
| Fhourstones-3.1      |  C        | Game/solver                   | "                    |
| FreeBench            |  C        | Benchmark suite               | Raytracer, four in a row, neural network, file compressor, Fast Fourier/Cosine/Sine Transform |
| llubenchmark         |  C        | Linked-list micro-benchmark   |                      |
| mafft                |  C        | Bioinformatics                | A multiple sequence alignment program |
| MallocBench          |  C        | Benchmark suite               | cfrac, espresso, gawk, gs, make, p2c, perl |
| McCat                |  C        | Benchmark suite               | Quicksort, bubblesort, eigenvalues |
| mediabench           |  C        | Benchmark suite               | adpcm, g721, gsm, jpeg, mpeg2 |
| MiBench              |  C        | Embedded benchmark suite      | Automotive, consumer, office, security, telecom apps  |
| nbench               |  C        |                               | BYTE Magazine's BYTEmark benchmark program |
| NPB-serial           |  C        | Parallel computing            | Serial version of the NPB IS code |
| Olden                |  C        | Data Structures               | SGI version of the Olden benchmark |
| OptimizerEval        |  C        | Solver                        | Preston Briggs' optimizer evaluation framework |
| PAQ8p                |  C++      | Data compression              |                      |
| Prolangs-C++         |  C++      | Benchmark suite               | city, employ, life, NP, ocean, primes, simul, vcirc |
| Prolangs-C           |  C        | Benchmark suite               | agrep, archie-client, bison, gnugo, unix-smail |
| Ptrdist              |  C        | Pointer-Intensive Benchmark Suite |                  |
| Rodinia              |  C        | Scientific apps              | backprop, pathfinder, srad |
| SciMark2-C           |  C        | Scientific apps              | FFT, LU, Montecarlo, sparse matmul |
| sim                  |  C        | Dynamic programming          | A Time-Efficient, Linear-Space Local Similarity Algorithm |
| tramp3d-v4           |  C++      | Numerical analysis           | Template-intensive numerical program based on FreePOOMA |
| Trimaran             |  C        | Encryption                   | 3des, md5, crc |
| TSVC                 |  C        | Vectorization benchmark      | Test Suite for Vectorizing Compilers (TSVC) |
| VersaBench           |  C        | Benchmark suite              | 8b10b, beamformer, bmm, dbms, ecbdes |

All MultiSource applications are suitable for performance measurements
and will run when CMake option `TEST_SUITE_BENCHMARKING_ONLY` is set.

Configuration
-------------

The test-suite has configuration options to customize building and running the
benchmarks. CMake can print a list of them:

```bash
% cd test-suite-build
# Print basic options:
% cmake -LH
# Print all options:
% cmake -LAH
```
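
The option list can be long, so it may help to filter it down to the
test-suite's own settings. The following is a minimal sketch that assumes the
usual `cmake -LH` layout, where each entry is printed as one or more `// help`
lines followed by a `NAME:TYPE=VALUE` line. The function name
`parse_cache_listing` is hypothetical and not part of the test-suite.

```python
import re

# One cache entry per line, in the form NAME:TYPE=VALUE.
ENTRY = re.compile(r"^(?P<name>[A-Za-z0-9_.+-]+):(?P<type>[A-Z]+)=(?P<value>.*)$")


def parse_cache_listing(text, prefix="TEST_SUITE_"):
    """Extract {name: (type, value, help)} from `cmake -LH` style text.

    Only entries whose name starts with `prefix` are kept.
    """
    options = {}
    help_lines = []
    for line in text.splitlines():
        if line.startswith("//"):
            # Help text precedes the entry it documents.
            help_lines.append(line[2:].strip())
            continue
        match = ENTRY.match(line)
        if match and match["name"].startswith(prefix):
            options[match["name"]] = (
                match["type"],
                match["value"],
                " ".join(help_lines),
            )
        # Any non-help line (entry, blank, banner) ends the current help block.
        help_lines = []
    return options
```

Feeding it the captured output of `cmake -LH` from the build directory (for
example via `subprocess.run(..., capture_output=True, text=True).stdout`)
returns only the `TEST_SUITE_*` entries.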

(common_configuration_options)=
### Common Configuration Options

- `CMAKE_C_FLAGS`

  Specify extra flags to be passed to C compiler invocations.  The flags are
  also passed to the C++ compiler and linker invocations.  See
  [https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html)

- `CMAKE_C_COMPILER`

  Select the C compiler executable to be used. Note that the C++ compiler is
  inferred automatically, i.e. when specifying `path/to/clang`, CMake will
  automatically use `path/to/clang++` as the C++ compiler.  See
  [https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_COMPILER.html](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_COMPILER.html)

- `CMAKE_Fortran_COMPILER`

  Select the Fortran compiler executable to be used. Not set by default and not
  required unless running the Fortran Test Suite.

- `CMAKE_BUILD_TYPE`

  Select a build type like `Release` or `Debug`, which selects a set of
  predefined compiler flags. These flags are applied regardless of the
  `CMAKE_C_FLAGS` option and may be changed by modifying
  `CMAKE_C_FLAGS_RELEASE` etc.  See
  [https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html](https://cmake.org/cmake/help/latest/variable/CMAKE_BUILD_TYPE.html)

- `TEST_SUITE_FORTRAN`

  Activate the Fortran tests. This is a work in progress. More information can be
  found in the [Flang documentation](https://flang.llvm.org/docs/FortranLLVMTestSuite.html).

- `TEST_SUITE_RUN_UNDER`

  Prefix test invocations with the given tool. This is typically used to run
  cross-compiled tests within a simulator tool.

- `TEST_SUITE_BENCHMARKING_ONLY`

  Disable tests that are unsuitable for performance measurements. The disabled
  tests either run for a very short time or are dominated by I/O performance
  making them unsuitable as compiler performance tests.

- `TEST_SUITE_SUBDIRS`

  Semicolon-separated list of directories to include. This can be used to only
  build parts of the test-suite or to include external suites.  This option
  does not work reliably with deeper subdirectories as it skips intermediate
  `CMakeLists.txt` files which may be required.

- `TEST_SUITE_COLLECT_STATS`

  Collect internal LLVM statistics. Appends `-save-stats=obj` when invoking the
  compiler and makes the lit runner collect and merge the statistic files.

- `TEST_SUITE_RUN_BENCHMARKS`

  If this is set to `OFF` then lit will not actually run the tests but just
  collect build statistics like compile time and code size.

- `TEST_SUITE_USE_PERF`

  Use the `perf` tool for time measurement instead of the `timeit` tool that
  comes with the test-suite.  The `perf` tool is usually available on Linux
  systems.

- `TEST_SUITE_SPEC2000_ROOT`, `TEST_SUITE_SPEC2006_ROOT`, `TEST_SUITE_SPEC2017_ROOT`, ...

  Specify installation directories of external benchmark suites. You can find
  more information about expected versions or usage in the README files in the
  `External` directory (such as `External/SPEC/README`)

### Common CMake Flags

- `-GNinja`

  Generate build files for the ninja build tool.

- `-Cllvm-test-suite/cmake/caches/<cachefile.cmake>`

  Use a CMake cache.  The test-suite comes with several CMake caches which
  predefine common or tricky build configurations.


Displaying and Analyzing Results
--------------------------------

The `compare.py` script displays and compares result files.  A result file is
produced when invoking lit with the `-o filename.json` flag.
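
A result file can also be read directly when only a quick look is needed. The
sketch below assumes the JSON layout written by lit: a top-level `tests` list
whose entries carry a `name` and, for tests that reported any, a `metrics`
dictionary with keys such as `exec_time` or `compile_time` (the same names that
appear in the Quickstart output). The helper name `load_metric` is
hypothetical.

```python
import json


def load_metric(path, metric="exec_time"):
    """Return {test name: metric value} for one lit result file.

    Assumes the layout {"tests": [{"name": ..., "metrics": {...}}, ...]}.
    Tests that did not report the requested metric are skipped.
    """
    with open(path) as f:
        data = json.load(f)
    values = {}
    for test in data.get("tests", []):
        metrics = test.get("metrics", {})
        if metric in metrics:
            values[test["name"]] = metrics[metric]
    return values
```

Calling `load_metric("results.json", "compile_time")` would then give a plain
dictionary that can be sorted or summarized as needed. For everyday use,
`compare.py` covers this and more.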

Example usage:

- Basic Usage:

  ```text
  % llvm-test-suite/utils/compare.py baseline.json
  Warning: 'test-suite :: External/SPEC/CINT2006/403.gcc/403.gcc.test' has No metrics!
  Tests: 508
  Metric: exec_time

  Program                                         baseline

  INT2006/456.hmmer/456.hmmer                   1222.90
  INT2006/464.h264ref/464.h264ref               928.70
  ...
               baseline
  count  506.000000
  mean   20.563098
  std    111.423325
  min    0.003400
  25%    0.011200
  50%    0.339450
  75%    4.067200
  max    1222.896800
  ```

- Show compile_time or text segment size metrics:

  ```bash
  % llvm-test-suite/utils/compare.py -m compile_time baseline.json
  % llvm-test-suite/utils/compare.py -m size.__text baseline.json
  ```

- Compare two result files and filter short running tests:

  ```bash
  % llvm-test-suite/utils/compare.py --filter-short baseline.json experiment.json
  ...
  Program                                         baseline  experiment  diff

  SingleSour.../Benchmarks/Linpack/linpack-pc     5.16      4.30        -16.5%
  MultiSourc...erolling-dbl/LoopRerolling-dbl     7.01      7.86         12.2%
  SingleSour...UnitTests/Vectorizer/gcc-loops     3.89      3.54        -9.0%
  ...
  ```

- Merge multiple baseline and experiment result files by taking the minimum
  runtime each:

  ```bash
  % llvm-test-suite/utils/compare.py base0.json base1.json base2.json vs exp0.json exp1.json exp2.json
  ```
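
The arithmetic behind the last two examples can be sketched as follows. This
illustrates the idea only and is not the code in `compare.py`: the function
names are hypothetical, the inputs are plain dictionaries mapping a test name
to a metric value, and the `min_time` threshold is an arbitrary value chosen
for the example.

```python
def merge_min(runs):
    """Merge several {test: time} dicts, keeping the minimum time per test."""
    merged = {}
    for run in runs:
        for name, value in run.items():
            merged[name] = min(value, merged.get(name, value))
    return merged


def relative_diff(baseline, experiment, min_time=0.0):
    """Return {test: (experiment - baseline) / baseline} for shared tests.

    Tests whose baseline time is zero or below min_time are skipped, which
    mimics filtering out short-running tests.
    """
    diffs = {}
    for name, base in baseline.items():
        if name not in experiment or base <= 0 or base < min_time:
            continue
        diffs[name] = (experiment[name] - base) / base
    return diffs


# Hypothetical numbers: three baseline runs and two experiment runs.
base = merge_min([{"prog": 2.1, "tiny": 0.01}, {"prog": 2.0}, {"prog": 2.3}])
exp = merge_min([{"prog": 1.6, "tiny": 0.02}, {"prog": 1.5}])
for name, diff in relative_diff(base, exp, min_time=0.5).items():
    print(f"{name}: {diff:+.1%}")
```

Taking the minimum over repeated runs reduces the influence of noise from
other activity on the machine, since such noise can only make a run slower.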

### Continuous Tracking with LNT

LNT is a set of client and server tools for continuously monitoring
performance. You can find more information at
[https://llvm.org/docs/lnt](https://llvm.org/docs/lnt). The official LNT instance
of the LLVM project is hosted at [http://lnt.llvm.org](http://lnt.llvm.org).


External Suites
---------------

External suites such as SPEC can be enabled by either

- placing (or linking) them into the `llvm-test-suite/test-suite-externals/xxx` directory (example: `llvm-test-suite/test-suite-externals/speccpu2000`)
- using a configuration option such as `-D TEST_SUITE_SPEC2000_ROOT=path/to/speccpu2000`

You can find further information in the respective README files such as
`llvm-test-suite/External/SPEC/README`.

For the SPEC benchmarks you can switch between the `test`, `train` and
`ref` input datasets via the `TEST_SUITE_RUN_TYPE` configuration option.
The `train` dataset is used by default.

In addition to SPEC, the multimedia frameworks ffmpeg and dav1d can also
be hooked up as external projects in the same way. Including them in
llvm-test-suite compiles much more potentially vectorizable code, which can
catch compiler bugs merely by triggering code generation asserts.
Including them also adds small code correctness tests that compare the
output of the compiler-generated functions against handwritten assembly
functions. (On x86, building the assembly requires having the nasm tool
available.) The integration into llvm-test-suite does not run the projects'
full test suites, though. The projects also contain microbenchmarks for
measuring the performance of some functions. See the `README.md` files in
the respective `ffmpeg` and `dav1d` directories under
`llvm-test-suite/External` for further details.


Custom Suites
-------------

You can build custom suites using the test-suite infrastructure. A custom suite
has a `CMakeLists.txt` file at the top directory. The `CMakeLists.txt` will be
picked up automatically if placed into a subdirectory of the test-suite or when
setting the `TEST_SUITE_SUBDIRS` variable:

```bash
% cmake -DTEST_SUITE_SUBDIRS=path/to/my/benchmark-suite ../llvm-test-suite
```


Profile Guided Optimization
---------------------------

Profile guided optimization requires compiling and running the benchmarks
twice. First, the benchmark is compiled with profile generation
instrumentation enabled and set up to run on training data. The lit runner
will merge the profile files using `llvm-profdata` so they can be used by the
second compilation run.

Example:
```bash
# Profile generation run using LLVM IR PGO:
% cmake -DTEST_SUITE_PROFILE_GENERATE=ON \
        -DTEST_SUITE_USE_IR_PGO=ON \
        -DTEST_SUITE_RUN_TYPE=train \
        ../llvm-test-suite
% make
% llvm-lit .
# Use the profile data for compilation and actual benchmark run:
% cmake -DTEST_SUITE_PROFILE_GENERATE=OFF \
        -DTEST_SUITE_PROFILE_USE=ON \
        -DTEST_SUITE_RUN_TYPE=ref \
        .
% make
% llvm-lit -o result.json .
```

To use Clang frontend's PGO instead of LLVM IR PGO, set `-DTEST_SUITE_USE_IR_PGO=OFF`.

The `TEST_SUITE_RUN_TYPE` setting only affects the SPEC benchmark suites.


Cross Compilation and External Devices
--------------------------------------

### Compilation

CMake allows cross compiling to a different target via toolchain files. More
information can be found here:

- [https://llvm.org/docs/lnt/tests.html#cross-compiling](https://llvm.org/docs/lnt/tests.html#cross-compiling)

- [https://cmake.org/cmake/help/latest/manual/cmake-toolchains.7.html](https://cmake.org/cmake/help/latest/manual/cmake-toolchains.7.html)

Cross compilation from macOS to iOS is possible with the
`llvm-test-suite/cmake/caches/target-*-iphoneos-internal.cmake` CMake cache
files; this requires an internal iOS SDK.

### Running

There are two ways to run the tests in a cross compilation setting:

- Via SSH connection to an external device: The `TEST_SUITE_REMOTE_HOST` option
  should be set to the SSH hostname.  The executables and data files need to be
  transferred to the device after compilation.  This is typically done via the
  `rsync` make target.  After this, the lit runner can be used on the host
  machine. It will prefix the benchmark and verification command lines with an
  `ssh` command.

  Example:

  ```bash
  % cmake -G Ninja -D CMAKE_C_COMPILER=path/to/clang \
          -C ../llvm-test-suite/cmake/caches/target-arm64-iphoneos-internal.cmake \
          -D CMAKE_BUILD_TYPE=Release \
          -D TEST_SUITE_REMOTE_HOST=mydevice \
          ../llvm-test-suite
  % ninja
  % ninja rsync
  % llvm-lit -j1 -o result.json .
  ```

- You can specify a simulator for the target machine with the
  `TEST_SUITE_RUN_UNDER` setting. The lit runner will prefix all benchmark
  invocations with it.
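
Both modes amount to placing something in front of the command line that would
otherwise be run locally. The sketch below illustrates that idea only. It is
not the lit runner's implementation, `prefix_command` is a hypothetical helper,
and details such as working directories and data files are ignored.

```python
import shlex


def prefix_command(command, run_under=None, remote_host=None):
    """Return the argv list that would actually be executed.

    command:     the benchmark invocation as an argv list
    run_under:   optional tool string, e.g. a simulator command line
    remote_host: optional SSH hostname
    """
    argv = list(command)
    if run_under:
        # The tool string may carry its own arguments, so split it shell-style.
        argv = shlex.split(run_under) + argv
    if remote_host:
        # ssh takes the remote command as one string; quote each argument.
        argv = ["ssh", remote_host, shlex.join(argv)]
    return argv
```

For example, `prefix_command(["./bench", "input.txt"], run_under="qemu-aarch64 -L /sysroot")`
puts the simulator and its arguments in front of the benchmark, where the
simulator command line is only an example value for `TEST_SUITE_RUN_UNDER`.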


Running the test-suite via LNT
------------------------------

The LNT tool can run the test-suite. Use this when submitting test results to
an LNT instance.  See
[https://llvm.org/docs/lnt/tests.html#llvm-cmake-test-suite](https://llvm.org/docs/lnt/tests.html#llvm-cmake-test-suite)
for details.

Running the test-suite via Makefiles (deprecated)
-------------------------------------------------

**Note**: The test-suite comes with a set of Makefiles that are considered
deprecated.  They do not support newer testing modes like `Bitcode` or
`MicroBenchmarks` and are harder to use.

Old documentation is available in the
[test-suite Makefile Guide](TestSuiteMakefileGuide).