author | Princeton Ferro <pferro@nvidia.com> | 2025-08-06 21:45:21 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-08-06 21:45:21 -0700 |
commit | 9a592d9a849dacf02ff571c81f2b3a805e9d13e5 (patch) | |
tree | 358b20e399465bf7888b4471ad2b73ca5595ae81 /clang/lib/Frontend/CompilerInvocation.cpp | |
parent | a04142f11f926d09059614a6170eff35a4ea6ff6 (diff) | |
[NVPTX] lower VECREDUCE min/max to 3-input on sm_100+ (#136253)
Add support for 3-input fmaxnum/fminnum/fmaximum/fminimum introduced in
PTX 8.8 for sm_100+:
- Use a tree reduction when 3-input operations are supported and the
reduction has the `reassoc` flag.
- If not on sm_100+/PTX 8.8, fall back to 2-input operations and use the
default shuffle reduction.
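
The tree reduction described above can be illustrated with a small host-side sketch. This is not the NVPTX lowering code from the commit. `max3` and `treeReduceMax3` are hypothetical names, and `std::max` stands in for the 3-input instruction without modelling the NaN semantics of fmaxnum or fmaximum. The sketch only shows how grouping operands three at a time regroups the reduction, which is presumably why the `reassoc` flag is required.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a 3-input max instruction.
// std::max does not model fmaxnum/fmaximum NaN handling.
float max3(float a, float b, float c) {
    return std::max(a, std::max(b, c));
}

// Tree reduction: every level combines groups of three elements with one
// 3-input operation. A leftover pair uses one 2-input operation, and a
// leftover single element is carried to the next level unchanged.
// opCount receives the number of max operations issued.
float treeReduceMax3(std::vector<float> vals, int &opCount) {
    assert(!vals.empty());
    opCount = 0;
    while (vals.size() > 1) {
        std::vector<float> next;
        std::size_t i = 0;
        for (; i + 3 <= vals.size(); i += 3) {
            next.push_back(max3(vals[i], vals[i + 1], vals[i + 2]));
            ++opCount;
        }
        const std::size_t left = vals.size() - i;
        if (left == 2) {
            next.push_back(std::max(vals[i], vals[i + 1]));
            ++opCount;
        } else if (left == 1) {
            next.push_back(vals[i]);
        }
        vals = std::move(next);
    }
    return vals[0];
}
```

For an 8-element vector this sketch issues four operations (three 3-input, one 2-input), where a chain of 2-input operations would need seven. The actual lowering may group the operands differently.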