path: root/benchtests
Age | Commit message | Author | Files | Lines
2013-09-19 | Add benchmark inputs for sincos | Siddhesh Poyarekar | 2 | -1/+89
2013-09-11 | benchtests: Rename argument to TIMING_INIT macro. | Will Newton | 2 | -10/+10
The TIMING_INIT macro currently sets the number of loop iterations to 1000, which limits usefulness. Make the argument a clock resolution value and multiply by 1000 in bench-skeleton.c instead to allow easier reuse.

ChangeLog:

2013-09-11 Will Newton <will.newton@linaro.org>

* benchtests/bench-timing.h (TIMING_INIT): Rename ITERS parameter to RES. Remove hardcoded 1000 value.
* benchtests/bench-skeleton.c (main): Pass RES parameter to TIMING_INIT and multiply result by 1000.
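As a rough illustration of the change described in this commit, the C sketch below derives an iteration count from a clock resolution and applies the factor of 1000 in the caller rather than in the timing helper. The names `sketch_clock_res_ns` and `sketch_iters_from_res`, and the choice of `clock_getres` with `CLOCK_MONOTONIC`, are assumptions made for this example; they are not taken from bench-timing.h or bench-skeleton.c.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical helper: return the clock resolution in nanoseconds.
   The clock id is an assumption for this sketch.  */
static unsigned long
sketch_clock_res_ns (void)
{
  struct timespec res;
  int ret = clock_getres (CLOCK_MONOTONIC, &res);
  assert (ret == 0);
  (void) ret;
  return (unsigned long) res.tv_sec * 1000000000ul
	 + (unsigned long) res.tv_nsec;
}

/* The caller, not the timing helper, scales the resolution by 1000.  */
static unsigned long
sketch_iters_from_res (void)
{
  return 1000 * sketch_clock_res_ns ();
}
```

Keeping the multiplier out of the helper means other callers can reuse the resolution value with a different scale.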
2013-09-06 | benchtests: Add memrchr benchmark | Adhemerval Zanella | 4 | -4/+66
2013-09-06 | benchtests/Makefile: Run benchmark for memcpy. | Will Newton | 1 | -5/+5
The benchmark for memcpy got disabled accidentally. Re-enable it.

ChangeLog:

2013-09-06 Will Newton <will.newton@linaro.org>

* benchtests/Makefile (string-bench): Add memcpy.
2013-09-04 | benchtests: Switch string benchmarks to use bench-timing.h. | Will Newton | 27 | -502/+334
Switch the string benchmarks to using bench-timing.h instead of hp-timing.h directly. This allows the string benchmarks to be run usefully on architectures such as ARM that do not have support for hp-timing.h. In order to do this the tests have been changed from timing each individual call and picking the lowest execution time recorded to timing a number of calls and taking the mean execution time.

ChangeLog:

2013-09-04 Will Newton <will.newton@linaro.org>

* benchtests/bench-timing.h (TIMING_PRINT_MEAN): New macro.
* benchtests/bench-string.h: Include bench-timing.h instead of including hp-timing.h directly.
(INNER_LOOP_ITERS): New define.
(HP_TIMING_BEST): Delete macro.
(test_init): Remove call to HP_TIMING_DIFF_INIT.
* benchtests/bench-memccpy.c: Use bench-timing.h macros instead of hp-timing.h macros.
* benchtests/bench-memchr.c: Likewise.
* benchtests/bench-memcmp.c: Likewise.
* benchtests/bench-memcpy.c: Likewise.
* benchtests/bench-memmem.c: Likewise.
* benchtests/bench-memmove.c: Likewise.
* benchtests/bench-memset.c: Likewise.
* benchtests/bench-rawmemchr.c: Likewise.
* benchtests/bench-strcasecmp.c: Likewise.
* benchtests/bench-strcasestr.c: Likewise.
* benchtests/bench-strcat.c: Likewise.
* benchtests/bench-strchr.c: Likewise.
* benchtests/bench-strcmp.c: Likewise.
* benchtests/bench-strcpy.c: Likewise.
* benchtests/bench-strcpy_chk.c: Likewise.
* benchtests/bench-strlen.c: Likewise.
* benchtests/bench-strncasecmp.c: Likewise.
* benchtests/bench-strncat.c: Likewise.
* benchtests/bench-strncmp.c: Likewise.
* benchtests/bench-strncpy.c: Likewise.
* benchtests/bench-strnlen.c: Likewise.
* benchtests/bench-strpbrk.c: Likewise.
* benchtests/bench-strrchr.c: Likewise.
* benchtests/bench-strspn.c: Likewise.
* benchtests/bench-strstr.c: Likewise.
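The measurement change in this commit (time a batch of calls and report the mean, instead of timing each call and keeping the minimum) can be sketched in C as follows. This is only an illustration under assumptions: `sketch_now_ns`, `sketch_noop` and `sketch_mean_ns` are hypothetical names, and `clock_gettime` with `CLOCK_MONOTONIC` stands in for whatever the bench-timing.h macros use on a given architecture.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in nanoseconds.  */
static uint64_t
sketch_now_ns (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (uint64_t) ts.tv_sec * 1000000000ull + (uint64_t) ts.tv_nsec;
}

/* Stand-in for the string function call being measured.  */
static void
sketch_noop (void)
{
}

/* Time ITERS calls of FN with a single pair of clock reads and
   return the mean cost of one call in nanoseconds.  */
static double
sketch_mean_ns (void (*fn) (void), unsigned int iters)
{
  uint64_t start, end;
  assert (fn != NULL && iters > 0);
  start = sketch_now_ns ();
  for (unsigned int i = 0; i < iters; i++)
    fn ();
  end = sketch_now_ns ();
  return (double) (end - start) / iters;
}
```

Timing the whole batch needs only a coarse clock, which is why this approach also works where no high-precision per-call timer is available.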
2013-09-04 | benchtests/Makefile: Use LDLIBS instead of LDFLAGS. | Will Newton | 1 | -16/+16
LDFLAGS puts the library too early in the command line if --as-needed is being used. Use LDLIBS instead.

ChangeLog:

2013-09-04 Will Newton <will.newton@linaro.org>

* benchtests/Makefile: Use LDLIBS instead of LDFLAGS.
2013-06-20 | Fix loop construction to functions calls | Adhemerval Zanella | 2 | -0/+2
Check whether the compiler has the option -fno-tree-loop-distribute-patterns to inhibit loop transformation to library calls, and use it on the default memset and memmove implementations to avoid recursive calls.
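To make the recursion hazard concrete, here is a minimal C sketch of a byte-wise fill loop of the kind a default memset might contain. The function name `sketch_memset` is hypothetical and this is not the glibc implementation. An optimizing compiler with loop pattern distribution enabled may recognize such a loop and replace it with a call to memset; if the loop is itself the body of memset, the function would end up calling itself, which is what the flag named in the commit is meant to prevent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte-at-a-time fill.  When this is the body of memset
   itself, it should be compiled with
   -fno-tree-loop-distribute-patterns so that the compiler does not
   turn the loop back into a memset call.  */
void *
sketch_memset (void *dst, int c, size_t n)
{
  unsigned char *p = dst;
  assert (dst != NULL || n == 0);
  while (n-- > 0)
    *p++ = (unsigned char) c;
  return dst;
}
```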
2013-06-11 | Port remaining string benchmarks | Siddhesh Poyarekar | 5 | -1/+349
There were a few more string benchmarks (strcpy_chk and stpcpy_check) in the debug directory that needed to be ported over.
2013-06-11 | Copy over string performance tests into benchtests | Siddhesh Poyarekar | 61 | -1/+4991
Copy over already existing string performance tests into benchtests. Bits not related to performance measurements have been omitted.
2013-06-11 | Begin porting string performance tests to benchtests | Siddhesh Poyarekar | 5 | -3/+434
This is the initial support for string function performance tests, along with copying tests for memcpy and memcpy-ifunc as proof of concept. The string function benchmarks perform operations at different alignments and for different sizes and compare performance between plain operations and the optimized string operations. Due to this their output is incompatible with the function benchmarks where we're interested in fastest time, throughput, etc. In future, the correctness checks in the benchmark tests can be removed. Same goes for the performance measurements in the string/test-*.
2013-06-10 | Avoid overwriting earlier flags in CPPFLAGS-nonlib in benchtests | Siddhesh Poyarekar | 1 | -1/+1
When setting BENCH_DURATION in CPPFLAGS-nonlib, append to the variable instead of assigning to it, to avoid overwriting earlier set flags, notably the -DNOT_IN_libc=1 flag.
2013-05-22 | Sort benchmark functions | Siddhesh Poyarekar | 1 | -41/+42
2013-05-22 | Add benchmark inputs for math functions | Siddhesh Poyarekar | 10 | -1/+83
Add benchmark inputs for inverse and hyperbolic trigonometric functions and log.
2013-05-21 | Add a README for benchtests | Siddhesh Poyarekar | 2 | -20/+74
Move instructions from the Makefile here and expand on them.
2013-05-17 | Prevent optimizing out of benchmark function call | Siddhesh Poyarekar | 1 | -1/+1
Resolves: #15424

The compiler would optimize the benchmark function call out of the loop and call it only once, resulting in blazingly fast times for some benchmarks (notably atan, sin and cos). Mark the inputs as volatile so that the code is forced to read again from the input for each iteration.
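The effect of the volatile qualifier can be sketched in C as below. The input table, `sketch_func` and `sketch_run_loop` are hypothetical stand-ins for the generated benchmark code, and a simple arithmetic function is used in place of atan, sin or cos so the example does not depend on the math library. Because the inputs are volatile, each iteration must reload its argument, so the call cannot be hoisted out of the loop and evaluated once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input table.  The volatile qualifier forces a fresh
   read of each element on every iteration.  */
static volatile double sketch_inputs[] = { 0.5, 1.0, 2.0 };

/* Stand-in for the math function under test.  */
static double
sketch_func (double x)
{
  return x * x + 1.0;
}

/* Call sketch_func on every input, ITERS times over, and return the
   accumulated result so the calls have an observable effect.  */
static double
sketch_run_loop (unsigned long iters)
{
  double sum = 0.0;
  size_t n = sizeof sketch_inputs / sizeof sketch_inputs[0];
  assert (iters > 0);
  for (unsigned long i = 0; i < iters; i++)
    for (size_t j = 0; j < n; j++)
      sum += sketch_func (sketch_inputs[j]);
  return sum;
}
```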
2013-05-13 | Use HP_TIMING for benchmarks if available | Siddhesh Poyarekar | 3 | -22/+93
HP_TIMING uses native timestamping instructions if available, thus greatly reducing the overhead of recording start and end times for function calls. For architectures that don't have HP_TIMING available, we fall back to the clock_gettime bits. One may also override this by invoking the benchmark as follows:

  make USE_CLOCK_GETTIME=1 bench

and get the benchmark results using clock_gettime. One has to do `make bench-clean` to ensure that the benchmark programs are rebuilt.
2013-05-10 | Fix coding style | Siddhesh Poyarekar | 1 | -4/+4
2013-05-08 | Preheat CPU in benchtests. | Ondrej Bilka | 1 | -0/+17
A benchmark could be skewed by the CPU initially running at its minimum frequency and speeding up later. We first run code in a loop to partially fix this issue.
2013-04-30 | Allow multiple input domains to be run in the same benchmark program | Siddhesh Poyarekar | 21 | -217/+87
Some math functions have distinct performance characteristics in specific domains of inputs, where some inputs return via a fast path while other inputs require multiple precision calculations, and those at different precision levels. Previously, the way to implement different domains was to have a separate source file and benchmark definition for each, resulting in separate programs. This clutters up the benchmark, so this change allows these domains to be consolidated into the same input file.

To do this, the input file format is now enhanced to allow comments with a preceding # and directives with two # at the beginning of a line. A 'name' directive tells the benchmark generation script that what follows is a different domain of inputs. The value of the 'name' directive (foo in this example) is used in the output. The two input domains are then executed sequentially and their results collated separately. With such a directive, there would be two lines in the result that look like:

func(): ....
func(foo): ...
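The line classification described in this commit (a single leading # marks a comment, two leading # mark a directive, anything else is an input) can be illustrated with a short C sketch. The real work is done by the benchmark generation script, which is not shown in this log, so the enum and the function `sketch_classify_line` are hypothetical and cover only the prefix rule stated above, not the directive syntax after the prefix.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of one line of a benchmark input file.  */
enum sketch_line_kind
{
  SKETCH_INPUT,
  SKETCH_COMMENT,
  SKETCH_DIRECTIVE
};

/* Directives are tested first because they also start with '#'.  */
static enum sketch_line_kind
sketch_classify_line (const char *line)
{
  assert (line != NULL);
  if (strncmp (line, "##", 2) == 0)
    return SKETCH_DIRECTIVE;
  if (line[0] == '#')
    return SKETCH_COMMENT;
  return SKETCH_INPUT;
}
```

A generator built on this rule would start a new named group of inputs each time it meets a directive and append plain input lines to the current group.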
2013-04-30 | Maintain runtime of each benchmark at ~10 seconds | Siddhesh Poyarekar | 3 | -29/+38
The idea to run benchmarks for a constant number of iterations is problematic. While the benchmarks may run for 10 seconds on x86_64, they could run for about 30 seconds on powerpc and worse, over 3 minutes on arm. Besides that, adding a new benchmark is cumbersome since one needs to find out the number of iterations needed for a sufficient runtime. A better idea would be to run each benchmark for a specific amount of time. This patch does just that. The run time defaults to 10 seconds and it is configurable at command line:

  make BENCH_DURATION=5 bench
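A time-bounded benchmark loop of the kind this commit describes might look like the following C sketch. It is not the code from the patch: `sketch_now_ns`, `sketch_noop`, `sketch_run_for` and the batch size of 100 are assumptions chosen for illustration, and `clock_gettime` with `CLOCK_MONOTONIC` is used as a generic timer.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in nanoseconds.  */
static uint64_t
sketch_now_ns (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (uint64_t) ts.tv_sec * 1000000000ull + (uint64_t) ts.tv_nsec;
}

/* Stand-in for the function under test.  */
static void
sketch_noop (void)
{
}

/* Call FN repeatedly until at least SECONDS have elapsed and return
   the number of calls made.  The clock is read once per batch of 100
   calls to keep timer overhead small.  */
static uint64_t
sketch_run_for (void (*fn) (void), double seconds)
{
  uint64_t budget_ns = (uint64_t) (seconds * 1e9);
  uint64_t start = sketch_now_ns ();
  uint64_t calls = 0;
  assert (fn != NULL && seconds >= 0.0);
  do
    {
      for (int i = 0; i < 100; i++)
	fn ();
      calls += 100;
    }
  while (sketch_now_ns () - start < budget_ns);
  return calls;
}
```

With this shape the iteration count becomes an output of the run rather than a per-architecture tuning parameter.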
2013-04-24 | Mention files in which fast/slow paths of math functions are implemented | Siddhesh Poyarekar | 1 | -12/+12
2013-04-23 | PowerPC: modf optimization | Adhemerval Zanella | 2 | -1/+40
This patch implements a modf/modff optimization for POWER by focusing on FP operations instead of relying on integer ones.
2013-04-17 | Add benchmark inputs for cos and tan | Siddhesh Poyarekar | 7 | -1/+78
2013-04-16 | Define NOT_IN_libc when compiling benchmark programs | Siddhesh Poyarekar | 1 | -0/+6
2013-04-16 | Add target bench-clean | Siddhesh Poyarekar | 1 | -0/+3
2013-04-15 | Write to bench.out-tmp only once | Siddhesh Poyarekar | 1 | -4/+4
Appending benchmark program output on every run could result in a case where the benchmark run was cancelled, resulting in a partially written file. This file gets used again on the next run, resulting in results being appended to old results. It could have been possible to remove the file before every benchmark run, but it is easier to just write the output to bench.out-tmp only once.
2013-04-15 | Rebuild benchmark sources when Makefile is updated | Siddhesh Poyarekar | 1 | -1/+3
Benchmark programs are generated using parameters from the Makefile, so it is necessary to rebuild them whenever the parameters in the Makefile are updated. Hence, added a dependency for the generated C source on the Makefile so that it gets regenerated when the Makefile is updated.
2013-04-12 | Move bench target to benchtests | Siddhesh Poyarekar | 1 | -0/+34
The bench target will only be used within the benchtests directory.
2013-04-03 | Add benchmark inputs for atan | Siddhesh Poyarekar | 4 | -1/+39
Add separate inputs for slow and fast paths of atan
2013-04-02 | Add benchmark inputs for sin | Siddhesh Poyarekar | 4 | -1/+47
2013-04-02 | Add benchmark tests for slowpow and slowexp | Siddhesh Poyarekar | 7 | -6/+64
Separate benchmarks for the fast and slow implementations of pow and exp since measuring both together doesn't make sense. Adjust the iterations for pow and exp accordingly so that they run long enough for the measurements to be meaningful.
2013-04-01 | PowerPC: remove branch prediction from rint implementation | Adhemerval Zanella | 2 | -1/+10
The branch prediction hints actually hurt performance in this case. The assembly implementation makes two assumptions: 1. 'fabs (x) < 2^52' is unlikely and 2. 'x > 0.0' is unlikely (if 1. is true). Since it is a general floating point function, the expected input is not bounded, so it is better to let the hardware handle the branches.
2013-03-15 | Framework for performance benchmarking of functions | Siddhesh Poyarekar | 4 | -0/+136
See benchtests/Makefile to know how to use it.