path: root/clang/lib/Frontend/CompilerInvocation.cpp
author: Princeton Ferro <pferro@nvidia.com> 2025-08-06 21:45:21 -0700
committer: GitHub <noreply@github.com> 2025-08-06 21:45:21 -0700
commit: 9a592d9a849dacf02ff571c81f2b3a805e9d13e5
tree: 358b20e399465bf7888b4471ad2b73ca5595ae81 /clang/lib/Frontend/CompilerInvocation.cpp
parent: a04142f11f926d09059614a6170eff35a4ea6ff6
[NVPTX] lower VECREDUCE min/max to 3-input on sm_100+ (#136253)
Add support for the 3-input fmaxnum/fminnum/fmaximum/fminimum operations introduced in PTX 8.8 for sm_100+:
- Use a tree reduction when 3-input operations are supported and the reduction has the `reassoc` flag.
- If not on sm_100+/PTX 8.8, fall back to 2-input operations and use the default shuffle reduction.
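The tree reduction described above can be illustrated on the host with a minimal sketch. This is not the NVPTX lowering code from the patch: `max3` and `reduceMaxTree3` are hypothetical names, and `max3` merely stands in for a 3-input max instruction. The sketch only shows how combining up to three values per step regroups the operands (the regrouping the commit message permits only when the reduction carries the `reassoc` flag) and how many combining operations result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a 3-input max instruction.
static float max3(float a, float b, float c) {
  return std::max(a, std::max(b, c));
}

// Illustrative tree reduction: each round combines adjacent values in
// groups of three, using a 2-input max for a leftover pair and carrying
// a single leftover value to the next round unchanged.
// If `ops` is non-null, it receives the number of combining operations.
float reduceMaxTree3(std::vector<float> v, std::size_t *ops = nullptr) {
  assert(!v.empty() && "reduction needs at least one element");
  std::size_t count = 0;
  while (v.size() > 1) {
    std::vector<float> next;
    std::size_t i = 0;
    for (; i + 2 < v.size(); i += 3) {
      next.push_back(max3(v[i], v[i + 1], v[i + 2]));
      ++count;
    }
    if (i + 1 < v.size()) {
      // Two values left over: one 2-input operation.
      next.push_back(std::max(v[i], v[i + 1]));
      ++count;
    } else if (i < v.size()) {
      // One value left over: pass it through.
      next.push_back(v[i]);
    }
    v = std::move(next);
  }
  if (ops)
    *ops = count;
  return v[0];
}
```

In this sketch an 8-element input takes 4 combining operations (three in the first round, one in the second), whereas reducing 8 elements with 2-input operations alone always takes 7.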
Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')
0 files changed, 0 insertions, 0 deletions