author | Simon Tatham <simon.tatham@arm.com> | 2025-02-04 08:57:54 +0000 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-02-04 08:57:54 +0000 |
commit | b53da77c505a2d35452e161c844712afbc11f6a7 (patch) | |
tree | b37ea1cc07580d4611445dce32d33f6332c7e941 /llvm/lib/CodeGen/MachineCopyPropagation.cpp | |
parent | c06d0ff806b72b1cfbca6306a2bc4f5f2922b01b (diff) | |
[libc] Alternative algorithm for decimal FP printf (#123643)

The existing options for bin→dec float conversion are all based on the
Ryū algorithm, which generates 9 output digits at a time using a table
lookup. For users who can't afford the space cost of the table, the
table-lookup subroutine is replaced with one that computes the needed
table entry on demand, but the algorithm is otherwise unmodified.

The performance problem with computing table entries on demand is that
now you need to calculate a power of 10 for each 9 digits you output.
But if you're calculating a custom power of 10 anyway, it's easier to
just compute one, and multiply the _whole_ mantissa by it.

This patch adds a header file alongside `float_dec_converter.h`, which
replaces the whole Ryū system instead of just the table-lookup routine,
implementing this alternative simpler algorithm. The result is accurate
enough to satisfy (minimally) the accuracy demands of IEEE 754-2019 even
in 128-bit long double. The new float128 test cases demonstrate this by
testing the cases closest to the 39-digit rounding boundary.

In my tests of generating 39 output digits (the maximum number supported
by this algorithm) this code is also both faster and smaller than the
USE_DYADIC_FLOAT version of the existing Ryū code.
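
The "compute one power of 10 and multiply the whole mantissa by it" idea
can be illustrated with a minimal sketch. This is not the code from the
patch: `to_fixed_sketch` is a hypothetical name, the input range is
deliberately restricted so that everything fits exactly in 128 bits, and
`unsigned __int128` is a GCC/Clang extension assumed to be available.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of LLVM libc: formats a positive finite
// double with `frac_digits` digits after the decimal point.
// Assumed limits for this sketch: 0 <= frac_digits <= 18, x < 2^53,
// and x large enough that the binary shift below stays under 128 bits.
std::string to_fixed_sketch(double x, int frac_digits) {
  assert(std::isfinite(x) && x > 0.0);
  assert(frac_digits >= 0 && frac_digits <= 18);

  // Decompose x exactly as mantissa / 2^shift, mantissa a 53-bit integer.
  int exp2 = 0;
  double frac = std::frexp(x, &exp2);  // x == frac * 2^exp2, 0.5 <= frac < 1
  auto mantissa = static_cast<std::uint64_t>(std::ldexp(frac, 53));
  int shift = 53 - exp2;
  assert(shift >= 0 && shift < 128);

  // Compute the single power of 10 that is needed (fits in 64 bits).
  std::uint64_t pow10 = 1;
  for (int i = 0; i < frac_digits; ++i)
    pow10 *= 10;

  // Multiply the whole mantissa by it in one step (below 2^113, so exact).
  using u128 = unsigned __int128;
  u128 product = static_cast<u128>(mantissa) * pow10;

  // Divide by 2^shift, rounding to nearest with ties going to even.
  u128 quotient = product >> shift;
  if (shift > 0) {
    u128 remainder = product & ((static_cast<u128>(1) << shift) - 1);
    u128 half = static_cast<u128>(1) << (shift - 1);
    if (remainder > half || (remainder == half && (quotient & 1) != 0))
      ++quotient;
  }

  // Convert the scaled integer to decimal digits, least significant first.
  std::string digits;
  do {
    digits.insert(digits.begin(),
                  static_cast<char>('0' + static_cast<int>(quotient % 10)));
    quotient /= 10;
  } while (quotient != 0);

  // Left-pad with zeros so there is at least one digit before the point.
  std::size_t n = static_cast<std::size_t>(frac_digits);
  if (digits.size() < n + 1)
    digits = std::string(n + 1 - digits.size(), '0') + digits;
  if (n > 0)
    digits.insert(digits.size() - n, ".");
  return digits;
}
```

For example, `to_fixed_sketch(0.1, 18)` yields `0.100000000000000006`.
Handling 39 digits and 128-bit long double, as the patch does, needs wider
arithmetic than the exact 128-bit product used here.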