author | Alexander Pivovarov <pivovaa@amazon.com> | 2024-07-17 23:33:52 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-07-17 23:33:52 -0700 |
commit | f36331770267501e157ac34afc3ca7d7a0bfb52c (patch) | |
tree | a9a571da0068065aa4df78767ba3c574a0991ef2 /clang/lib/CodeGen/CodeGenModule.h | |
parent | 1e6672af2497042d5dad0236c2ad9e61f879ac07 (diff) | |
[APFloat] Add support for f8E4M3 IEEE 754 type (#97179)
This PR adds the `f8E4M3` type to APFloat.
The `f8E4M3` type follows IEEE 754 conventions:
```c
f8E4M3 (IEEE 754)
- Exponent bias: 7
- Maximum stored exponent value: 14 (binary 1110)
- Maximum unbiased exponent value: 14 - 7 = 7
- Minimum stored exponent value: 1 (binary 0001)
- Minimum unbiased exponent value: 1 - 7 = -6
- Precision specifies the total number of bits used for the significand (mantissa),
including the implicit leading integer bit = 3 + 1 = 4
- Follows IEEE 754 conventions for representation of special values
- Has Positive and Negative zero
- Has Positive and Negative infinity
- Has NaNs
Additional details:
- Max exp (unbiased): 7
- Min exp (unbiased): -6
- Infinities (+/-): S.1111.000
- Zeros (+/-): S.0000.000
- NaNs: S.1111.{001, 010, 011, 100, 101, 110, 111}
- Max normal number: S.1110.111 = +/-2^(7) x (1 + 0.875) = +/-240
- Min normal number: S.0001.000 = +/-2^(-6)
- Max subnormal number: S.0000.111 = +/-2^(-6) x 0.875 = +/-2^(-9) x 7
- Min subnormal number: S.0000.001 = +/-2^(-6) x 0.125 = +/-2^(-9)
```
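The bit layout above can be checked with a small decoder. The following is a minimal sketch in Python, not the APFloat implementation: the function name `decode_f8e4m3` and the constants are hypothetical and chosen only for illustration. It assumes the byte is laid out as S.EEEE.MMM, as in the listing above.

```python
import math

# Illustrative constants taken from the format description above;
# these names are assumptions, not LLVM identifiers.
EXP_BITS = 4
MAN_BITS = 3
BIAS = 7


def decode_f8e4m3(byte: int) -> float:
    """Decode one f8E4M3 (IEEE 754 style) byte, laid out as S.EEEE.MMM."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")

    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> MAN_BITS) & ((1 << EXP_BITS) - 1)
    man = byte & ((1 << MAN_BITS) - 1)

    if exp == 0b1111:
        # All-ones exponent: infinity if the mantissa is zero, NaN otherwise.
        return sign * math.inf if man == 0 else math.nan

    if exp == 0:
        # Zero or subnormal: no implicit leading 1, exponent fixed at 1 - bias = -6.
        return sign * (man / 8) * 2.0 ** (1 - BIAS)

    # Normal number: implicit leading 1 plus the 3-bit fraction.
    return sign * (1 + man / 8) * 2.0 ** (exp - BIAS)
```

For example, `decode_f8e4m3(0b0_1110_111)` evaluates to `240.0`, which matches the maximum normal number listed above, and `decode_f8e4m3(0b0_0000_001)` evaluates to `2.0 ** -9`, the minimum subnormal.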
Related PRs:
- [PR-97118](https://github.com/llvm/llvm-project/pull/97118) Add f8E4M3
IEEE 754 type to mlir