path: root/libc/benchmarks/gpu/LibcGpuBenchmark.cpp
Age | Commit message | Author | Files | Lines
2025-07-18 | [libc] Fix GPU benchmarking | Joseph Huber | 1 | -1/+1
2024-08-18 | [libc][gpu] Add Atan2 Benchmarks (#104708) | jameshu15869 | 1 | -2/+2
This PR adds benchmarking for `atan2()`, `__nv_atan2()`, and `__ocml_atan2_f64()` using the same setup as `sin()`. This PR also adds support for throughput benchmarking of functions with 2 inputs.
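As a rough illustration of throughput measurement for a two-input function, the sketch below times a whole batch of input pairs and reports the average cost per call. It assumes that "throughput" here means batch time divided by the number of calls. `measure_throughput` and `ThroughputResult` are hypothetical names, and the code runs on the host rather than through the GPU harness.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type: average cost per call plus a checksum that keeps
// the compiler from discarding the calls being timed.
struct ThroughputResult {
  double ns_per_call;
  double checksum;
};

// Times fn(xs[i], ys[i]) over the whole batch and divides by the batch size.
template <typename BinaryFn>
ThroughputResult measure_throughput(BinaryFn fn, const std::vector<double> &xs,
                                    const std::vector<double> &ys) {
  assert(!xs.empty() && xs.size() == ys.size());
  using Clock = std::chrono::steady_clock;

  double checksum = 0.0;
  const auto start = Clock::now();
  for (std::size_t i = 0; i < xs.size(); ++i)
    checksum += fn(xs[i], ys[i]);
  const auto stop = Clock::now();

  const double total_ns =
      std::chrono::duration<double, std::nano>(stop - start).count();
  return {total_ns / static_cast<double>(xs.size()), checksum};
}
```

A call such as `measure_throughput([](double y, double x) { return std::atan2(y, x); }, ys, xs)` would then report the average nanoseconds per `atan2` call for that batch.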
2024-08-08 | [libc] [gpu] Fix Minor Benchmark UI Issues (#102529) | jameshu15869 | 1 | -5/+7
Previously, `AmdgpuSinTwoPow_128` and other names were too large for their table cells. This PR shortens the names to `AmdSin...`. Some `-` characters were also missing from the separator, so this PR instead builds the separator string from the length of the headers.
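Deriving the separator from the header guarantees that the two always have the same width. A minimal sketch of the idea, using a hypothetical `make_separator` helper and `std::string` instead of whatever the GPU code actually uses:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns a separator line exactly as wide as the header,
// so the two cannot drift apart when columns are added or renamed.
std::string make_separator(const std::string &header, char fill = '-') {
  assert(fill != '\n' && "separator must stay on a single line");
  return std::string(header.size(), fill);
}
```

With this approach, changing a column title only requires editing the header string; the separator follows automatically.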
2024-08-05 | [libc] [gpu] Change Time To Be Per Iteration (#101919) | jameshu15869 | 1 | -5/+5
Previously, the time field was the total time taken to run all iterations of the benchmark. This PR changes the displayed value to the average time taken by each iteration.
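The change amounts to dividing the accumulated time by the iteration count before printing. A small sketch of that arithmetic, where `time_per_iteration_ns` is a hypothetical name rather than a function from the benchmark source:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a total elapsed time into the average time
// spent in one iteration.
double time_per_iteration_ns(uint64_t total_time_ns, uint32_t iterations) {
  assert(iterations > 0 && "cannot average over zero iterations");
  return static_cast<double>(total_time_ns) / static_cast<double>(iterations);
}
```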
2024-07-29 | [libc] Add Generic and NVPTX Sin Benchmark (#99795) | jameshu15869 | 1 | -1/+4
This PR adds sin benchmarking for a range of values and on a pregenerated random distribution.
2024-07-26 | [libc] Add Minimum Time and Iterations, Reduce Epsilon (#100838) | jameshu15869 | 1 | -0/+1
This PR adds minimums (50 iterations, 500 us, and epsilon of 0.0001) to ensure that all benchmarks run at least a set number of times before outputting a final measurement.
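One way such minimums can gate a measurement loop is sketched below. The defaults (50 iterations, 500 us, epsilon 0.0001) come from the description above; everything else is an assumption, including the interpretation of epsilon as a bound on the relative change of the running mean, the `max_iterations` safety cap, and the names `RunLimits`, `RunResult`, and `run_until_stable`.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstdint>

// Hypothetical limits; only the first three defaults come from the PR text.
struct RunLimits {
  uint32_t min_iterations = 50;
  std::chrono::microseconds min_duration{500};
  double epsilon = 0.0001;
  uint32_t max_iterations = 1000000; // assumed safety cap, not from the PR
};

struct RunResult {
  uint32_t iterations;
  double mean;
};

// Calls sample() until the iteration and time minimums are both met and the
// running mean has stopped moving by more than epsilon (relative).
template <typename Sample>
RunResult run_until_stable(Sample sample, const RunLimits &limits = {}) {
  assert(limits.min_iterations > 0 &&
         limits.max_iterations >= limits.min_iterations);
  using Clock = std::chrono::steady_clock;
  const auto start = Clock::now();

  double mean = 0.0;
  uint32_t n = 0;
  while (n < limits.max_iterations) {
    const double previous = mean;
    const double value = sample();
    ++n;
    mean += (value - previous) / n; // incremental running mean

    const bool enough_iterations = n >= limits.min_iterations;
    const bool enough_time = Clock::now() - start >= limits.min_duration;
    const bool converged =
        std::fabs(mean - previous) <= limits.epsilon * std::fabs(mean);
    if (enough_iterations && enough_time && converged)
      break;
  }
  return {n, mean};
}
```

Without the minimums, a loop like this could stop after two nearly identical samples; the iteration and time floors keep a lucky early convergence from ending the run.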
2024-07-22 | [libc] Fix invalid format specifier in benchmark | Joseph Huber | 1 | -16/+11
Summary: This value is a uint32_t but is printed as a uint64_t, leading to invalid offsets when done on AMDGPU due to its packed format extending past the buffer.
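The class of bug described here can be avoided by matching the conversion specifier to the argument width. The sketch below uses the standard `PRIu32` macro with host `snprintf`. `format_value` is a hypothetical helper, and the actual fix in the benchmark may look different.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 32-bit value with a matching specifier.
// Using a 64-bit specifier such as PRIu64 for this argument would make the
// formatter read more bytes than were passed.
std::string format_value(uint32_t value) {
  char buffer[32];
  const int written =
      std::snprintf(buffer, sizeof(buffer), "%" PRIu32, value);
  assert(written > 0 && static_cast<std::size_t>(written) < sizeof(buffer));
  return std::string(buffer, static_cast<std::size_t>(written));
}
```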
2024-07-21 | [libc] Add N Threads Benchmark Helper (#99834) | jameshu15869 | 1 | -4/+1
This PR adds a `BENCHMARK_N_THREADS()` helper to register benchmarks with a specific number of threads. It replaces the flags used originally, allowing any number of threads.
2024-07-21 | [libc] Improve Benchmark UI (#99796) | jameshu15869 | 1 | -10/+47
This PR changes the output to resemble Google Benchmark, for example:
```
Running Suite: LlvmLibcIsAlNumGpuBenchmark
Benchmark            |  Cycles |     Min |     Max | Iterations |  Time (ns) |    Stddev |   Threads |
-----------------------------------------------------------------------------------------------------
IsAlnum              |      92 |      76 |     482 |         23 |      86500 |        76 |        64 |
IsAlnumSingleThread  |      87 |      76 |     302 |         20 |      72000 |        49 |         1 |
IsAlnumSingleWave    |      87 |      76 |     302 |         20 |      72000 |        49 |        32 |
IsAlnumCapital       |      89 |      76 |     299 |         17 |      78500 |        52 |        64 |
IsAlnumNotAlnum      |      87 |      76 |     303 |         20 |      76000 |        49 |        64 |
```
2024-07-18 | [libc] Add Multithreaded GPU Benchmarks (#98964) | jameshu15869 | 1 | -2/+7
This PR runs benchmarks on 32 threads (a single warp on NVPTX) by default and adds the option of single-threaded benchmarks. A benchmark can be restricted to a single thread with the `SINGLE_THREADED_BENCHMARK()` macro. I chose to use a flag here so that other options could be added in the future.
2024-07-15 | [libc] Use Atomics in GPU Benchmarks (#98842) | jameshu15869 | 1 | -41/+94
This PR replaces the old array-based reduction of benchmark results with atomics. This should help us implement single-threaded benchmarks.
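The general shape of an atomics-based reduction is sketched below with `std::atomic`. This is a host-side illustration under assumptions: the struct and field names are hypothetical, and the GPU code relies on libc's own atomic wrappers rather than the standard library. Each thread calls `record()` directly, so no per-thread result array and no second reduction pass are needed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical accumulator; names are illustrative and not taken from
// LibcGpuBenchmark.cpp.
struct AtomicBenchmarkResult {
  std::atomic<uint64_t> cycles_sum{0};
  std::atomic<uint64_t> min_cycles{UINT64_MAX};
  std::atomic<uint64_t> max_cycles{0};
  std::atomic<uint32_t> samples{0};

  // Safe to call from many threads at once: every update is either a single
  // atomic add or a compare-and-swap loop.
  void record(uint64_t cycles) {
    cycles_sum.fetch_add(cycles, std::memory_order_relaxed);
    samples.fetch_add(1, std::memory_order_relaxed);

    // Atomic minimum: retry until our value is stored or no longer smaller.
    uint64_t seen = min_cycles.load(std::memory_order_relaxed);
    while (cycles < seen &&
           !min_cycles.compare_exchange_weak(seen, cycles,
                                             std::memory_order_relaxed)) {
    }

    // Atomic maximum, same pattern with the comparison flipped.
    seen = max_cycles.load(std::memory_order_relaxed);
    while (cycles > seen &&
           !max_cycles.compare_exchange_weak(seen, cycles,
                                             std::memory_order_relaxed)) {
    }
  }

  // Integer mean of all recorded samples.
  uint64_t mean_cycles() const {
    const uint32_t n = samples.load(std::memory_order_relaxed);
    assert(n > 0 && "no samples recorded");
    return cycles_sum.load(std::memory_order_relaxed) / n;
  }
};
```

Because the accumulator works the same way with one caller as with many, a single-threaded benchmark needs no special reduction path.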
2024-07-12 | [libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98597) | Petr Hosek | 1 | -2/+3
This is a part of #97655.
2024-07-12 | Revert "[libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration" (#98593) | Mehdi Amini | 1 | -3/+2
Reverts llvm/llvm-project#98075 because the bots are broken.
2024-07-11 | [libc] Migrate to using LIBC_NAMESPACE_DECL for namespace declaration (#98075) | Petr Hosek | 1 | -2/+3
This is a part of #97655.
2024-07-11 | [libc] Correctly Run Multiple Benchmarks in the Same File (#98467) | jameshu15869 | 1 | -4/+3
There was previously an issue where registering multiple benchmarks in the same file would only give the results for the last benchmark to run. This PR fixes the issue. @jhuber6
2024-07-06 | [libc] Fix Cppcheck Issues (#96999) | jameshu15869 | 1 | -13/+12
This PR fixes linting issues discovered by `cppcheck`. Fixes: https://github.com/llvm/llvm-project/issues/96863
2024-06-26 | [libc] NVPTX Profiling (#92009) | jameshu15869 | 1 | -0/+140
This PR adds microbenchmarking infrastructure for NVPTX. `nvlink` cannot perform LTO, so `libc` functions cannot be inlined, and the resulting function call overhead is not adjusted for during microbenchmarking.