riscv-gnu-toolchain/llvm.git

author	Sergey Kozub <skozub@nvidia.com>	2024-09-16 21:09:27 +0200
committer	GitHub <noreply@github.com>	2024-09-16 21:09:27 +0200
commit	73d83f20c9734a3fe004f2607606b64ab20998f0 (patch)
tree	c779440ea22edd1024762fcd936ea4e86e489df1 /llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp
parent	09e3a360581dc36d0820d3fb6da9bd7cfed87b5d (diff)

[MLIR] Add f6E2M3FN type (#107999)

This PR adds `f6E2M3FN` type to mlir.

`f6E2M3FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 6-bit floating point number with bit layout S1E2M3. Unlike IEEE-754 types, there are no infinity or NaN values.

```c
f6E2M3FN
- Exponent bias: 1
- Maximum stored exponent value: 3 (binary 11)
- Maximum unbiased exponent value: 3 - 1 = 2
- Minimum stored exponent value: 1 (binary 01)
- Minimum unbiased exponent value: 1 - 1 = 0
- Has Positive and Negative zero
- Doesn't have infinity
- Doesn't have NaNs

Additional details:
- Zeros (+/-): S.00.000
- Max normal number: S.11.111 = ±2^(2) x (1 + 0.875) = ±7.5
- Min normal number: S.01.000 = ±2^(0) = ±1.0
- Max subnormal number: S.00.111 = ±2^(0) x 0.875 = ±0.875
- Min subnormal number: S.00.001 = ±2^(0) x 0.125 = ±0.125
```

Related PRs:
- [PR-94735](https://github.com/llvm/llvm-project/pull/94735) [APFloat] Add APFloat support for FP6 data types
- [PR-105573](https://github.com/llvm/llvm-project/pull/105573) [MLIR] Add f6E3M2FN type - was used as a template for this PR
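
The bit layout described above can be checked with a small decoder. The following is a minimal Python sketch, not code from this PR or from MLIR/APFloat; the function name `decode_f6e2m3fn` is hypothetical, and the sketch assumes the sign bit is the most significant of the six bits (S.EE.MMM), as the listing's notation suggests.

```python
def decode_f6e2m3fn(bits: int) -> float:
    """Decode a 6-bit S1E2M3 pattern (hypothetical helper, illustration only)."""
    if not 0 <= bits <= 0x3F:
        raise ValueError("expected a 6-bit value in the range 0..63")

    sign = -1.0 if (bits >> 5) & 0b1 else 1.0   # 1 sign bit
    exponent = (bits >> 3) & 0b11               # 2 exponent bits, bias 1
    mantissa = bits & 0b111                     # 3 mantissa bits

    if exponent == 0:
        # Zero and subnormals: no implicit leading 1, scale is 2^(1 - bias) = 2^0.
        return sign * (mantissa / 8.0)

    # Normal numbers: implicit leading 1. The all-ones exponent is an ordinary
    # exponent here because the format has no infinity and no NaN encodings.
    return sign * 2.0 ** (exponent - 1) * (1.0 + mantissa / 8.0)


if __name__ == "__main__":
    # Print every non-negative value of the format.
    for pattern in range(32):
        print(f"{pattern:06b} -> {decode_f6e2m3fn(pattern)}")
```

Because every bit pattern maps to a finite value, the decoder needs no special cases beyond the subnormal branch.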

Diffstat (limited to 'llvm/lib/Target/WebAssembly/Disassembler/WebAssemblyDisassembler.cpp')

0 files changed, 0 insertions, 0 deletions

