author: Simon Tatham <simon.tatham@arm.com> 2025-02-04 08:57:54 +0000
committer: GitHub <noreply@github.com> 2025-02-04 08:57:54 +0000
commit: b53da77c505a2d35452e161c844712afbc11f6a7
tree: b37ea1cc07580d4611445dce32d33f6332c7e941
parent: c06d0ff806b72b1cfbca6306a2bc4f5f2922b01b
[libc] Alternative algorithm for decimal FP printf (#123643)
The existing options for bin→dec float conversion are all based on the Ryū algorithm, which generates 9 output digits at a time using a table lookup. For users who can't afford the space cost of the table, the table-lookup subroutine is replaced with one that computes the needed table entry on demand, but the algorithm is otherwise unmodified.

The performance problem with computing table entries on demand is that now you need to calculate a power of 10 for each 9 digits you output. But if you're calculating a custom power of 10 anyway, it's easier to just compute one, and multiply the _whole_ mantissa by it.

This patch adds a header file alongside `float_dec_converter.h`, which replaces the whole Ryū system instead of just the table-lookup routine, implementing this alternative simpler algorithm.

The result is accurate enough to satisfy (minimally) the accuracy demands of IEEE 754-2019 even in 128-bit long double. The new float128 test cases demonstrate this by testing the cases closest to the 39-digit rounding boundary.

In my tests of generating 39 output digits (the maximum number supported by this algorithm) this code is also both faster and smaller than the USE_DYADIC_FLOAT version of the existing Ryū code.
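
For illustration only, here is a minimal sketch of the "compute one power of 10 and multiply the whole mantissa by it" idea. It is not the code from this patch: it is written in Python, uses exact arbitrary-precision integers where the real header would have to work in fixed-width arithmetic, and handles only positive finite doubles. The function name `to_decimal_digits` and its return shape are hypothetical.

```python
import math

def to_decimal_digits(x: float, ndigits: int) -> tuple[str, int]:
    """Return (digits, e10) such that x ~= d.ddd... * 10**e10, using ndigits digits."""
    if not (math.isfinite(x) and x > 0.0 and ndigits >= 1):
        raise ValueError("sketch handles positive finite inputs only")

    # Split x exactly into an integer mantissa and a power of two.
    frac, exp2 = math.frexp(x)        # x == frac * 2**exp2, with 0.5 <= frac < 1
    mant = int(frac * (1 << 53))      # exact: a double carries 53 mantissa bits
    exp2 -= 53                        # now x == mant * 2**exp2

    e10 = math.floor(math.log10(x))   # estimate; corrected below if off by one
    while True:
        # A single power of 10 for the whole conversion, chosen so that the
        # scaled value has exactly ndigits digits before the decimal point.
        scale = ndigits - 1 - e10
        num, den = mant, 1
        if scale >= 0:
            num *= 10 ** scale
        else:
            den *= 10 ** -scale
        if exp2 >= 0:
            num <<= exp2
        else:
            den <<= -exp2

        # Multiply-and-round in one step: round num/den to nearest, ties to even.
        q, r = divmod(num, den)
        if 2 * r > den or (2 * r == den and q % 2 == 1):
            q += 1

        if q >= 10 ** ndigits:          # too many digits: exponent was too small
            e10 += 1
        elif q < 10 ** (ndigits - 1):   # too few digits: exponent was too large
            e10 -= 1
        else:
            return str(q), e10
```

Calling `to_decimal_digits(0.1, 20)` returns the first 20 significant digits of the exact binary value of `0.1` together with its decimal exponent. Because the arithmetic here is exact, the sketch says nothing about how much working precision a fixed-width implementation needs near the 39-digit rounding boundary; that is the part the patch's float128 tests exercise.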
Diffstat (limited to 'llvm/lib/CodeGen/MachineCopyPropagation.cpp')
0 files changed, 0 insertions, 0 deletions