First load SoftPosit.jl ...
using SoftPosit
p0 = Posit16(1.23)
Float64(p0)
p1 = Posit8(50)
Float64(p1)
(Note that 64 is the closest number in the Posit8,0 system to 50!)
p1_2 = Posit8_2(50)
Float64(p1_2)
In the Posit8,2 system there is also 48, thanks to two exponent bits!
You can also initialise a posit number with a given bit pattern. In Julia, a binary literal needs to be preceded by 0b
p2 = Posit8(0b10100101)
A posit number is always printed in its hexadecimal encoding, which you can also use directly
p3 = Posit32(0x6bb73333)
But note that the off-standard types Posit8_2, Posit16_2, Posit24_2 (and likewise, though less well supported, Posit8_1, Posit16_1, Posit24_1) are internally stored as 32 bits, with zeros padding the remaining bits.
Posit8_2(0x7f)
Posit16_2(0x7f00)
Sometimes it's really useful to understand the bit encoding of posits. The bitstring function returns the bit pattern as a string
bitstring(p3)
To split the bit string into sign, regime, exponent and fraction bits, use
bitstring(p3," ")
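To make the sign, regime, exponent and fraction fields concrete, here is a minimal pure-Julia sketch of how a posit bit pattern could be decoded by hand. The function `decode_posit` and its arguments `n` (total bits) and `es` (exponent bits) are hypothetical names for illustration and not part of SoftPosit.jl; the sketch only assumes the usual posit layout and returns NaN for not a real.

```julia
# Hypothetical helper (not SoftPosit.jl API): decode an n-bit posit with
# es exponent bits, given as an unsigned integer, into a Float64.
function decode_posit(bits::Unsigned, n::Int, es::Int)
    mask = (UInt64(1) << n) - 1
    b = UInt64(bits) & mask
    b == 0 && return 0.0
    b == (UInt64(1) << (n - 1)) && return NaN   # not a real
    negative = (b >> (n - 1)) == 1
    if negative
        b = (~b + 1) & mask                     # two's complement
    end
    # Regime: a run of identical bits right after the sign bit.
    i = n - 2
    lead = (b >> i) & 1
    run = 0
    while i >= 0 && ((b >> i) & 1) == lead
        run += 1
        i -= 1
    end
    k = lead == 1 ? run - 1 : -run
    i -= 1                                      # skip the terminating bit
    # Exponent: up to es bits, missing bits count as zero.
    e = 0
    for _ in 1:es
        e = (e << 1) | (i >= 0 ? Int((b >> i) & 1) : 0)
        i -= 1
    end
    # Fraction: whatever bits remain.
    nfrac = max(i + 1, 0)
    frac = Int(b & ((UInt64(1) << nfrac) - 1))
    value = (1 + frac / 2.0^nfrac) * 2.0^(k * 2^es + e)
    return negative ? -value : value
end

# The two largest 8-bit posits without exponent bits are 32 and 64,
# so 50 has no closer neighbour than 64 in that system.
println((decode_posit(0x7e, 8, 0), decode_posit(0x7f, 8, 0)))
# With two exponent bits the pattern 0b01100110 encodes 48.
println(decode_posit(0b01100110, 8, 2))
```

The same pattern means different values depending on `es`, which is why the number of exponent bits has to be part of the type.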
Conversion back to Float64 works as simply as
Float64(p3)
And also on arrays
v = Posit32.(randn(5))
Float64.(v)
You can also round Posits
p4 = Posit8(1.5)
Float64(round(p4))
Which also happens when you convert back to Int
Int(p4)
Posit16(p4)
Posit32(p4)
Posit16_2(p4)
z = Complex{Posit32}(1+im)
abs(z)
Only a few promotions are implemented. We do not want to promote automatically between floats and posits, and would rather see them used separately and with care. Promotion is defined for Ints and Bool:
x = Posit8(2)
2x
false*x, true*x
Posit one and zero are defined
one(Posit16), zero(Posit16)
But also -1 and not a real (sometimes called complex infinity), which together with one and zero span the posit circle
minusone(Posit16), notareal(Posit16)
The smallest and largest positive numbers in a given posit system are returned by the functions floatmin and floatmax, whose names are now slightly incorrect
v = floatmin(Posit16), floatmax(Posit16)
Float64.(v)
You can use Julia's eps function to return the machine epsilon for a given posit system when called with the DataType Posit8, Posit16, etc.
Float64.((eps(Posit8), eps(Posit16), eps(Posit16_2), eps(Posit32)))
Which is the same as calling the eps function with 1. However, the relative error increases away from one
Float64.((eps(one(Posit8)), eps(Posit8(2.0)), eps(Posit8(4.0))))
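Why the error grows away from one can be sketched without the package. Assuming an 8-bit posit without exponent bits, a value 2^k with 0 <= k <= 5 spends k+2 bits on the regime, which leaves 5-k fraction bits. The helper names below are hypothetical and only illustrate the gap to the next larger posit; they make no claim about how SoftPosit.jl computes eps.

```julia
# Illustrative helpers (not SoftPosit.jl API), valid for 8-bit posits
# with no exponent bits and x = 2^k, 0 <= k <= 5.
fraction_bits(k) = 8 - 1 - (k + 2)              # sign bit, then k+2 regime bits
spacing(k) = 2.0^k * 2.0^(-fraction_bits(k))    # gap to the next larger posit

for k in 0:5
    x = 2.0^k
    println("x = ", x, "  gap = ", spacing(k), "  relative gap = ", spacing(k) / x)
end
```

Each doubling of x costs one fraction bit, so the relative gap doubles as well.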
sqrt is implemented in the C library. sin, cos, tan, exp and log are supported too, but they rely on a back-and-forth conversion to Float64.
sqrt(p3)
sin(p3), cos(p3), tan(p3), exp(p3), log(p3), log2(p3), log10(p3)
Please note that these operations, because they are carried out with floats, are not error-free rounded. See the following example
sin(π)
Which is many bit patterns away from zero.
Posit32(sin(π))
Mostly a problem of the underlying execution of sin(::Float64)
sin(Posit32(1π))
There are also functions that tell you about the sign of a given posit number, encoded as posit -1, 0 or 1
sign(floatmax(Posit8)), sign(zero(Posit8)), sign(minusone(Posit8))
Note that for posits there is only one infinity, which is just called not a real. Its sign is therefore 0 (and not ±1 as for floats with plus infinity and negative infinity)
sign(notareal(Posit8))
In contrast, the signbit function returns the actual bit as Boolean
signbit(one(Posit8)), signbit(zero(Posit8)), signbit(Posit8(-1.25)), signbit(notareal(Posit8))
Sometimes useful are the nextfloat and prevfloat functions, which are now incorrectly named for posits, but still
p1 = Posit8(10)
p0 = prevfloat(p1)
p2 = nextfloat(p1)
Float64.((p0,p1,p2))
That means the next smaller representable number from 10 is 8 and the next bigger one is 12 in the 8-bit posit system without exponent bits. nextfloat and prevfloat have a wrap-around behaviour, which means that after not a real follows -floatmax, just like on the posit circle
Float64(nextfloat(notareal(Posit8)))
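One way to picture this wrap-around is on the raw bit patterns: posits are ordered like two's complement integers, so stepping to the neighbouring posit corresponds to adding or subtracting one from the pattern. The sketch below works on plain UInt8 values only; `next_pattern` and `prev_pattern` are hypothetical names, and whether SoftPosit.jl implements nextfloat this way internally is an assumption.

```julia
# Illustrative only: step through 8-bit posit patterns as plain UInt8 values.
next_pattern(b::UInt8) = b + 0x01   # UInt8 arithmetic wraps modulo 256
prev_pattern(b::UInt8) = b - 0x01

ten = 0x79   # 0 11110 01: the pattern for 10 without exponent bits
println((prev_pattern(ten), ten, next_pattern(ten)))   # patterns for 8, 10, 12

nar = 0x80   # not a real
println(next_pattern(nar))    # 0x81, the most negative posit (-floatmax)
println(next_pattern(0x7f))   # floatmax steps onto not a real
```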
The typical arithmetic operations +,-,*,/ are supported
p0,p1,p2,p3,p4 = Posit16.([-1,0,2,10.0,22])
p0+p1-p2*p3/p4
And also powers with positive integers
p = Posit32(2)
p^2, p^3, p^4
Comparisons are intuitive too
p0 == p0, p0 > p1, p1 < p2, p3 >= p3
Division by zero is the only way you can trigger not a real to occur in an operation
p = one(Posit8)/zero(Posit8)
p, isfinite(p)
Many other libraries should work with posits too, as long as they are written in a type-stable way in pure Julia. For example, solving a linear equation system
A = Posit32.(randn(3,3))
b = Posit32.(randn(3))
Float64.(A\b)
Quick check that this actually works correctly
Float64.(A)\Float64.(b)
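What "type-stable, pure Julia" means in practice can be sketched with a small generic function. The name `horner` is made up for illustration; the function uses only `zero`, `*` and `+` of the element type, so it runs unchanged for Float32 and Rational here. That it would run equally well with a posit type is an untested assumption.

```julia
# Hypothetical example of type-generic code: evaluate a polynomial with
# coefficients c[1] + c[2]*x + c[3]*x^2 + ... using Horner's scheme.
function horner(coeffs::AbstractVector{T}, x::T) where {T}
    acc = zero(T)                       # stay in the element type throughout
    for c in Iterators.reverse(coeffs)
        acc = acc * x + c
    end
    return acc
end

println(horner(Float32[1, 2, 3], 2f0))          # 1 + 2*2 + 3*4 in Float32
println(horner([1//1, 2//1, 3//1], 2//1))       # same polynomial in Rational
```

No literal floats or hard-coded conversions appear in the function body, which is what lets a new number type pass through without being promoted.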
Quires are the posit way of generalising fused operations to an exact dot product. Exact in the sense that the rounding error only occurs at the very end, when converting back to posits. For Posit8 the quire has 32 bits, for Posit16 Quire16 is 128 bits long, and Quire32 has 512 bits. All quire operations therefore have the form of a fused multiply-add (fma) or a fused multiply-subtract (fms).
Quires are initialised with (same for Quire16, Quire32)
q = zero(Quire8)
Now adding the number 12x12 = 144, which in Posit8 would cause a saturation at 64 (floatmax)
q = fma(q,Posit8(12),Posit8(12))
and subsequently subtracting 14x10=140 yields the correct result of 4.
q = fms(q,Posit8(14),Posit8(10))
p = Posit8(q)
Float64(p)
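The benefit of rounding only once can be sketched in plain Julia. The function `saturate` below is a deliberately simplified stand-in for rounding to Posit8 without exponent bits: it models only the saturation at ±64, which is all this particular example needs, and it is not how SoftPosit.jl rounds in general. An ordinary Int plays the role of the quire.

```julia
# Simplified stand-in for Posit8 rounding (assumption: saturation only).
saturate(x) = clamp(x, -64, 64)

# Rounding after every operation: both products saturate at 64
# before the subtraction, so the difference is lost.
naive = saturate(saturate(12 * 12) - saturate(14 * 10))

# Quire-style: accumulate exactly, round once at the end.
acc = 0
acc += 12 * 12      # plays the role of fma(q, 12, 12)
acc -= 14 * 10      # plays the role of fms(q, 14, 10)
fused = saturate(acc)

println((naive, fused))
```

Without the exact accumulator the answer collapses to zero, while the deferred rounding recovers 4.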