author: Pankaj Dwivedi <pankajkumar.divedi@amd.com> 2025-10-09 12:44:56 +0530
committer: GitHub <noreply@github.com> 2025-10-09 12:44:56 +0530
commit: 53aad35208d00c8382b62b1d23005938aea77469
tree: 0aecd6ac6ac2a8027b0a823725dcdefd35ba7b1c /clang/lib/CodeGen/CodeGenFunction.cpp
parent: b32710a56b4742ec393a2a5df97346e9668e9887
[AMDGPU] Introduce "amdgpu-uniform-intrinsic-combine" pass to combine uniform AMDGPU lane Intrinsics. (#116953)
This pass optimizes AMDGPU lane intrinsics by exploiting the uniformity of their arguments. It uses the UniformityInfo analysis to identify intrinsic calls whose arguments are uniform across all lanes, then simplifies those calls and eliminates the redundant computation. The transformations reduce unnecessary operations in the IR without altering the program's semantics.

For background, see PR #99878.
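
To illustrate why a uniform argument makes a lane intrinsic redundant, here is a toy C++ model. It is not the pass's implementation and does not use the LLVM API. It assumes a read-first-lane-style broadcast as a representative lane operation (the commit message does not list which intrinsics are handled), and the names `Wave`, `kLanes`, `isUniform`, `readFirstLane`, and `combineReadFirstLane` are all hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy wave width; real hardware wave sizes differ (assumption for brevity).
constexpr std::size_t kLanes = 8;

// One value per lane of a simulated wave.
using Wave = std::array<std::uint32_t, kLanes>;

// Returns true when every lane holds the same value.
bool isUniform(const Wave &v) {
  return std::all_of(v.begin(), v.end(),
                     [&v](std::uint32_t x) { return x == v[0]; });
}

// Toy model of a read-first-lane broadcast: every lane receives lane 0's value.
Wave readFirstLane(const Wave &v) {
  Wave out;
  out.fill(v[0]);
  return out;
}

// Toy "combine": when the operand is uniform, the broadcast would return the
// operand unchanged, so the call is skipped and the operand is used directly.
// `simplified` reports whether the shortcut was taken.
Wave combineReadFirstLane(const Wave &v, bool &simplified) {
  simplified = isUniform(v);
  return simplified ? v : readFirstLane(v);
}
```

For a uniform input, `combineReadFirstLane` returns the operand itself and produces the same result as `readFirstLane`; for a divergent input it falls back to the broadcast. The real pass would make this decision statically from UniformityInfo rather than by inspecting lane values at run time.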
Diffstat (limited to 'clang/lib/CodeGen/CodeGenFunction.cpp')
0 files changed, 0 insertions, 0 deletions