How to use SoftPosit.jl - some examples

First load SoftPosit.jl ...

In [ ]:
using SoftPosit

1. Conversions

From Float, Int, bit pattern or hexadecimal

In [10]:
p0 = Posit16(1.23)
Float64(p0)
Out[10]:
1.22998046875
In [12]:
p1 = Posit8(50)
Float64(p1)
Out[12]:
64.0

(Note that 64 is the closest number in the Posit8,0 system to 50!)

In [15]:
p1_2 = Posit8_2(50)
Float64(p1_2)
Out[15]:
48.0

The Posit8,2 system can also represent 48, thanks to its two exponent bits. You can also initialise a posit number from a given bit pattern. In Julia a bit pattern needs to be preceded by 0b

In [16]:
p2 = Posit8(0b10100101)
Out[16]:
Posit8(0xa5)

A posit number is always printed in its hexadecimal encoding, which you can also use directly

In [21]:
p3 = Posit32(0x6bb73333)
Out[21]:
Posit32(0x6bb73333)

But note that the non-standard types Posit8_2, Posit16_2, Posit24_2 (and likewise the less well supported Posit8_1, Posit16_1, Posit24_1) are stored internally as 32 bits, with the remaining bits padded with zeros.

In [17]:
Posit8_2(0x7f)
Out[17]:
Posit8_2(0x7f000000)
In [19]:
Posit16_2(0x7f00)
Out[19]:
Posit16_2(0x7f000000)

It is often useful to understand the bit encoding of posits. The bitstring function returns the bit pattern as a string

In [22]:
bitstring(p3)
Out[22]:
"01101011101101110011001100110011"

To split the bit string into sign, regime, exponent and fraction bits, pass a separator as second argument

In [23]:
bitstring(p3," ")
Out[23]:
"0 110 10 11101101110011001100110011"
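
The split above can be turned back into a number by hand. The following is a minimal sketch in plain Julia, not part of SoftPosit.jl: the function name decode_posit is hypothetical, and it assumes the usual posit layout (sign bit, regime run, up to es exponent bits, then fraction bits, with negative numbers stored as two's complement). It returns NaN for not a real only because Float64 has no such value.

```julia
# Decode an nbits-wide posit with es exponent bits, given as an unsigned
# integer bit pattern, into a Float64. Hypothetical helper for illustration.
function decode_posit(bits::Unsigned, nbits::Int, es::Int)
    mask = (UInt64(1) << nbits) - 1
    u = UInt64(bits) & mask
    u == 0 && return 0.0                              # zero
    u == UInt64(1) << (nbits - 1) && return NaN       # not a real
    negative = (u >> (nbits - 1)) == 1
    negative && (u = (~u + 1) & mask)                 # two's complement
    # bits after the sign bit, most significant first
    body = [Int((u >> i) & 1) for i in (nbits - 2):-1:0]
    # regime: run of identical bits, terminated by the opposite bit
    run = 1
    while run < length(body) && body[run + 1] == body[1]
        run += 1
    end
    k = body[1] == 1 ? run - 1 : -run
    rest = body[(run + 2):end]                        # skip the terminating bit
    # exponent: next es bits, missing bits count as zero
    e = 0
    for i in 1:es
        e = 2 * e + (i <= length(rest) ? rest[i] : 0)
    end
    # fraction: remaining bits with an implicit leading one
    f = 1.0
    for (i, b) in enumerate(rest[(es + 1):end])
        f += b * 2.0^(-i)
    end
    value = f * 2.0^(k * 2^es + e)
    return negative ? -value : value
end

decode_posit(0x6bb73333, 32, 2)   # the Posit32 pattern of p3
decode_posit(0x50, 8, 0)          # the Posit8 pattern of 1.5
```

For 0x6bb73333 the regime 110 gives k = 1, the exponent bits 10 give e = 2, so the scale is 2^(1*4 + 2) = 64, which the fraction multiplies up to roughly 123.45.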

Back conversion to Floats

This is as simple as

In [24]:
Float64(p3)
Out[24]:
123.44999980926514

And also on arrays

In [25]:
v = Posit32.(randn(5))
Float64.(v)
Out[25]:
5-element Array{Float64,1}:
 -0.05123545182868838
 -1.7403574511408806 
 -1.1406445801258087 
  0.18416725378483534
 -1.0025449469685555 

Back conversion to Ints

You can also round Posits

In [26]:
p4 = Posit8(1.5)
Out[26]:
Posit8(0x50)
In [27]:
Float64(round(p4))
Out[27]:
2.0

Which also happens when you convert back to Int

In [28]:
Int(p4)
Out[28]:
2

Conversion between posits

In [30]:
Posit16(p4)
Out[30]:
Posit16(0x4800)
In [31]:
Posit32(p4)
Out[31]:
Posit32(0x44000000)
In [32]:
Posit16_2(p4)
Out[32]:
Posit16_2(0x44000000)

Complex numbers

In [33]:
z = Complex{Posit32}(1+im)
Out[33]:
Posit32(0x40000000) + Posit32(0x40000000)im
In [34]:
abs(z)
Out[34]:
Posit32(0x43504f33)

Promotion

Only a few promotions are implemented. Floats and posits are deliberately not promoted to each other automatically, so that the two are kept separate and mixed only with care. Promotion is defined for Ints and Bool:

In [36]:
x = Posit8(2)
2x
Out[36]:
Posit8(0x70)
In [38]:
false*x, true*x
Out[38]:
(Posit8(0x00), Posit8(0x60))

2. Constants

Posit one and zero are defined

In [49]:
one(Posit16), zero(Posit16)
Out[49]:
(Posit16(0x4000), Posit16(0x0000))

So are -1 and not a real (sometimes called complex infinity), which together with 0 and 1 span the posit circle

In [51]:
minusone(Posit16), notareal(Posit16)
Out[51]:
(Posit16(0xc000), Posit16(0x8000))

The smallest and largest positive numbers in a given posit system are returned by floatmin and floatmax, whose names are slightly inaccurate for posits

In [53]:
v = floatmin(Posit16), floatmax(Posit16)
Out[53]:
(Posit16(0x0001), Posit16(0x7fff))
In [66]:
Float64.(v)
Out[66]:
(3.725290298461914e-9, 2.68435456e8)

You can use Julia's eps function to return the machine epsilon of a given posit system when it is called with the type Posit8, Posit16, etc.

In [68]:
Float64.((eps(Posit8), eps(Posit16), eps(Posit16_2), eps(Posit32)))
Out[68]:
(0.03125, 0.000244140625, 0.00048828125, 7.450580596923828e-9)

This is the same as calling eps with the value one. However, the relative error increases away from one

In [69]:
Float64.((eps(one(Posit8)), eps(Posit8(2.0)), eps(Posit8(4.0))))
Out[69]:
(0.03125, 0.125, 0.5)

3. Functions of one argument

sqrt is implemented in the C library. sin, cos, tan, exp and log are supported too, but they rely on a conversion to Float64 and back.

In [39]:
sqrt(p3)
Out[39]:
Posit32(0x5b1c5dc1)
In [61]:
sin(p3), cos(p3), tan(p3), exp(p3), log(p3), log2(p3), log10(p3)
Out[61]:
(Posit32(0xc331bf75), Posit32(0xc668565d), Posit32(0x42ae0833), Posit32(0x7fffffff), Posit32(0x51a1b549), Posit32(0x55e543d1), Posit32(0x485dafd7))

Please note that these float-based operations are not correctly rounded; see the following example

In [44]:
sin(π)
Out[44]:
1.2246467991473532e-16

In Posit32 this is many bit patterns away from zero.

In [45]:
Posit32(sin(π))
Out[45]:
Posit32(0x0001c699)

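This is mostly a property of the underlying sin(::Float64), which is evaluated at a rounded value of π rather than at π itself. The same happens when π is first rounded to Posit32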

In [46]:
sin(Posit32(1π))
Out[46]:
Posit32(0x00710b46)
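
The size of these residuals can be reproduced without posits. The sketch below uses only Base Julia and takes Float32 as a stand-in for a format with fewer fraction bits than Float64; it is an illustration of the rounding effect, not of how SoftPosit.jl computes sin.

```julia
# π rounded to Float64, and more coarsely to Float32
x64 = Float64(π)
x32 = Float64(Float32(π))

# Neither argument is exactly π, so neither result is exactly zero.
# Close to π, sin(x) is approximately π - x, the rounding error of the argument.
r64 = sin(x64)
r32 = sin(x32)

println("sin at Float64(π): ", r64)
println("sin at Float32(π): ", r32)
println("coarser rounding gives the larger residual: ", abs(r32) > abs(r64))
```

The residual therefore measures how far the rounded argument lies from π, which is why it grows when the argument is stored with fewer fraction bits.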

There are also functions that tell you about the sign of a given posit number, encoded as the posit -1, 0 or 1

In [58]:
sign(floatmax(Posit8)), sign(zero(Posit8)), sign(minusone(Posit8))
Out[58]:
(Posit8(0x40), Posit8(0x00), Posit8(0xc0))

Note that for posits there is only one infinity, which is just called not a real. Its sign is therefore neither +1 nor -1 (unlike floats, which have plus and minus infinity); as the output below shows, sign returns not a real itself

In [2]:
sign(notareal(Posit8))
Out[2]:
Posit8(0x80)

In contrast, the signbit function returns the actual bit as Boolean

In [60]:
signbit(one(Posit8)), signbit(zero(Posit8)), signbit(Posit8(-1.25)), signbit(notareal(Posit8))
Out[60]:
(false, false, true, true)

The nextfloat and prevfloat functions are sometimes useful too, although their names are not quite right for posits

In [76]:
p1 = Posit8(10)
p0 = prevfloat(p1)
p2 = nextfloat(p1)
Float64.((p0,p1,p2))
Out[76]:
(8.0, 10.0, 12.0)

That means that in the 8-bit posit system without exponent bits the next smaller representable number from 10 is 8 and the next bigger one is 12. nextfloat and prevfloat wrap around: after not a real follows -floatmax, just like on the posit circle

In [78]:
Float64(nextfloat(notareal(Posit8)))
Out[78]:
-64.0
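
The wrap-around can be pictured on the raw bit patterns. The sketch below is plain Julia and assumes, as in the usual description of the posit circle, that stepping to the next posit corresponds to adding one to the bit pattern and that posits order like two's complement signed integers. The names next_pattern and as_signed are hypothetical helpers, not SoftPosit.jl functions.

```julia
# Step to the next bit pattern; UInt8 addition wraps from 0xff back to 0x00.
next_pattern(u::UInt8) = u + 0x01

# Reading a pattern as a signed integer gives the posit ordering.
as_signed(u::UInt8) = reinterpret(Int8, u)

next_pattern(0x7f)   # 0x80: floatmax is followed by not a real
next_pattern(0x80)   # 0x81: not a real is followed by -floatmax
next_pattern(0xff)   # 0x00: the largest negative number is followed by zero

# -1 (0xc0) < 0 (0x00) < 1 (0x40) when compared as signed integers
as_signed(0xc0) < as_signed(0x00) < as_signed(0x40)   # true
```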

4. Functions of two arguments

The typical arithmetic operations +,-,*,/ are supported

In [82]:
p0,p1,p2,p3,p4 = Posit16.([-1,0,2,10.0,22])
Out[82]:
5-element Array{Posit16,1}:
 Posit16(0xc000)
 Posit16(0x0000)
 Posit16(0x5000)
 Posit16(0x6a00)
 Posit16(0x7180)
In [65]:
p0+p1-p2*p3/p4
Out[65]:
Posit16(0xb174)

And also powers with positive integers

In [81]:
p = Posit32(2)
p^2, p^3, p^4
Out[81]:
(Posit32(0x50000000), Posit32(0x58000000), Posit32(0x60000000))

Comparisons are intuitive too

In [83]:
p0 == p0, p0 > p1, p1 < p2, p3 >= p3
Out[83]:
(true, false, true, true)

Division by zero is the only way you can trigger not a real to occur in an operation

In [85]:
p = one(Posit8)/zero(Posit8)
p, isfinite(p)
Out[85]:
(Posit8(0x80), false)

5. Other libraries / packages

Many other libraries should work with posits too, as long as they are written in a type-stable way in pure Julia. For example solving a linear equation system

In [86]:
A = Posit32.(randn(3,3))
b = Posit32.(randn(3))
Float64.(A\b)
Out[86]:
3-element Array{Float64,1}:
 -0.2591354679316282
  1.5956060588359833
 -1.6174840703606606

Quick check that this actually works correctly

In [87]:
Float64.(A)\Float64.(b)
Out[87]:
3-element Array{Float64,1}:
 -0.25913547220788774
  1.5956060579522418 
 -1.6174840845064522 

6. Quires

Quires are the posit way of generalising fused operations to an exact dot product, exact in the sense that the rounding error only occurs at the very end, when converting back to posits. The quire for Posit8 (Quire8) is 32 bits long, Quire16 is 128 bits long and Quire32 is 512 bits long. All quire operations have the form of a fused multiply-add (fma) or a fused multiply-subtract (fms).

Quires are initialised with (same for Quire16,Quire32)

In [88]:
q = zero(Quire8)
Out[88]:
Quire8(0x00000000)

Now add 12x12 = 144, which in Posit8 would saturate at 64 (floatmax)

In [89]:
q = fma(q,Posit8(12),Posit8(12))
Out[89]:
Quire8(0x00090000)

and subsequently subtracting 14x10=140 yields the correct result of 4.

In [90]:
q = fms(q,Posit8(14),Posit8(10))
Out[90]:
Quire8(0x00004000)
In [91]:
p = Posit8(q)
Out[91]:
Posit8(0x70)
In [92]:
Float64(p)
Out[92]:
4.0
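
Why the single rounding at the end matters can be sketched with a toy model of Posit8 in plain Julia. The helpers posit8_value and round_posit8 are hypothetical and not part of SoftPosit.jl: they enumerate the 127 positive Posit8 values (no exponent bits) and round to the nearest one, with ties simply going to the smaller magnitude, which is a simplification of the real rounding rule.

```julia
# Value of a positive Posit8 (es = 0) bit pattern 0x01..0x7f.
function posit8_value(u::UInt8)
    body = u << 1                          # drop the sign bit, left-align the other 7 bits
    ones_first = body >> 7 == 1
    run = ones_first ? leading_ones(body) : leading_zeros(body)
    k = ones_first ? run - 1 : -run        # regime value
    frac = body << (run + 1)               # fraction bits, left-aligned in 8 bits
    return (1 + frac / 256) * 2.0^k
end

const POSIT8_POSITIVE = [posit8_value(u) for u in 0x01:0x7f]

# Round a real number to the nearest Posit8 value (toy model).
# Large inputs saturate at 64 because that is the nearest table entry.
function round_posit8(x::Real)
    x == 0 && return 0.0
    nearest = POSIT8_POSITIVE[argmin(abs.(POSIT8_POSITIVE .- abs(x)))]
    return sign(x) * nearest
end

# Rounding after every operation: both products saturate at 64.
stepwise = round_posit8(round_posit8(12 * 12) - round_posit8(14 * 10))

# Quire-style: accumulate exactly, round once at the end.
fused = round_posit8(12 * 12 - 14 * 10)

println("rounded after every step: ", stepwise)   # 0.0
println("rounded once at the end:  ", fused)      # 4.0
```

In this model the stepwise result is 64 - 64 = 0, while the exact accumulation recovers 4, matching the quire result above.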