author | David Sherwood <david.sherwood@arm.com> | 2021-07-07 13:18:20 +0100 |
---|---|---|
committer | David Sherwood <david.sherwood@arm.com> | 2021-07-26 10:26:06 +0100 |
commit | 0aff1798b5721d5f95d16f465b99d357012bb8d1 (patch) | |
tree | 33ef05a5ac939f76d568dfae922f3d6ccc8f540d /llvm/lib/CodeGen/MachineFunction.cpp | |
parent | f924a3d47492b7b586ccfd1333ca086a7e2d88b2 (diff) | |

[Analysis] Add simple cost model for strict (in-order) reductions

I have added a new FastMathFlags parameter to getArithmeticReductionCost
to indicate what type of reduction we are performing:

  1. Tree-wise. This is the typical fast-math reduction, which
  repeatedly splits a vector into halves and adds the two halves
  together until we get a scalar result. This is the default
  behaviour for integers, whereas for floating point we only do this
  if reassociation is allowed.
  2. Ordered. This now allows us to estimate the cost of performing
  a strict vector reduction by treating it as a series of scalar
  operations in lane order. This is the case when FP reassociation
  is not permitted. For scalable vectors this is more difficult
  because at compile time we do not know how many lanes there are,
  and so we assume the worst case, i.e. the maximum vscale value.
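
The two reduction styles above can be sketched in standalone C++. This is only an illustration under stated assumptions, not the code added by this patch: the `UnitCosts` struct, its unit values, and every function name below are hypothetical, and a real target would take its costs from its own tables. The sketch shows why the two orders can give different floating-point results, and how an ordered cost grows linearly with the lane count while a tree-wise cost grows with its logarithm.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical unit costs, chosen only for illustration. They are not taken
// from any real target's cost tables.
struct UnitCosts {
  unsigned ScalarOp = 1; // one scalar add
  unsigned VectorOp = 1; // one vector add, at any width
  unsigned Shuffle = 1;  // moving the high half of a vector into the low half
  unsigned Extract = 1;  // extracting a single lane
};

// Ordered (strict) reduction: accumulate lane by lane, in lane order.
float orderedSum(const std::vector<float> &V, float Start) {
  float Acc = Start;
  for (float X : V)
    Acc += X;
  return Acc;
}

// Tree-wise reduction: repeatedly add the high half onto the low half.
// Assumes V is non-empty and its size is a power of two.
float treeSum(std::vector<float> V) {
  for (std::size_t Half = V.size() / 2; Half >= 1; Half /= 2)
    for (std::size_t I = 0; I < Half; ++I)
      V[I] += V[I + Half];
  return V[0];
}

// Tree-wise cost: one shuffle and one vector op per halving step, plus one
// final extract of lane 0. Assumes NumLanes is a power of two.
unsigned treeReductionCost(unsigned NumLanes, const UnitCosts &C) {
  unsigned Cost = 0;
  for (unsigned Width = NumLanes; Width > 1; Width /= 2)
    Cost += C.Shuffle + C.VectorOp;
  return Cost + C.Extract;
}

// Ordered cost: one extract and one scalar op per lane.
unsigned orderedReductionCost(unsigned NumLanes, const UnitCosts &C) {
  return NumLanes * (C.Extract + C.ScalarOp);
}

// Scalable vectors have MinLanes * vscale lanes, with vscale unknown at
// compile time, so this sketch assumes the largest vscale the target allows.
unsigned orderedScalableReductionCost(unsigned MinLanes, unsigned MaxVScale,
                                      const UnitCosts &C) {
  return orderedReductionCost(MinLanes * MaxVScale, C);
}
```

With IEEE-754 single-precision arithmetic, `orderedSum({1e8f, 1.0f, -1e8f, 1.0f}, 0.0f)` yields 1 while `treeSum` on the same input yields 2, which is why the tree-wise form is only legal for floating point when reassociation is allowed. With the unit costs above, a 4-lane reduction costs 5 tree-wise and 8 ordered, and a scalable ordered reduction with 4 minimum lanes and an assumed maximum vscale of 16 costs 128.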

I have also fixed getTypeBasedIntrinsicInstrCost to pass in the
FastMathFlags, which meant updating some X86 tests that previously
assumed the vector.reduce.fadd/fmul intrinsics were always 'fast'.

New tests have been added here:

  Analysis/CostModel/AArch64/reduce-fadd.ll
  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll
  Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll

Differential Revision: https://reviews.llvm.org/D105432