aboutsummaryrefslogtreecommitdiff
path: root/src/zabha.adoc
blob: 26529a58d09808fa0fd437903c3a0f4fbbd3886a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
== "Zabha" Extension for Byte and Halfword Atomic Memory Operations, Version 1.0.0

The A-extension offers atomic memory operation (AMO) instructions for _words_,
_doublewords_, and _quadwords_ (only for `AMOCAS`). The absence of atomic
operations for subword data types necessitates emulation strategies. For bitwise
operations, this emulation can be performed via word-sized bitwise AMO*
instructions. For non-bitwise operations, emulation is achievable using
word-sized `LR`/`SC` instructions.

Several limitations arise from this emulation approach:

. In systems with large-scale or Non-Uniform Memory Access (NUMA)
  configurations, emulation based on `LR`/`SC` introduces issues related to
  scalability and fairness, particularly under conditions of high contention.

. Emulation of narrower AMOs through wider AMO* instructions on non-idempotent
  IO memory regions may result in unintended side effects.

. Utilizing wider AMO* instructions for emulating narrower AMOs risks activating
  extraneous breakpoints or watchpoints.

. In the absence of native support for subword atomics, compilers often resort
  to inlining code sequences to provide the required emulation. This practice
  contributes to an increase in code size, with consequent impacts on system
  performance and memory utilization.

The Zabha extension addresses these limitations by adding support for _byte_ and
_halfword_ atomic memory operations to the RISC-V Unprivileged ISA. The Zabha
extension depends upon the Zaamo standard extension.

=== Byte and Halfword Atomic Memory Operation Instructions

Zabha extension provides the `AMO[ADD|AND|OR|XOR|SWAP|MIN[U]|MAX[U]].[B|H]`
instructions. If Zacas extension is also implemented, Zabha further provides the
`AMOCAS.[B|H]` instructions.

[wavedrom, zabha-ext-wavedrom-reg,svg]
....
{reg: [ 
  {bits:  7, name: 'opcode', attr:['AMO','AMO','AMO','AMO','AMO','AMO','AMO','AMO']},
  {bits:  5, name: 'rd', attr:['dest','dest','dest','dest','dest','dest','dest','dest']},
  {bits:  3, name: 'funct3', attr:['width=0/1','width=0/1','width=0/1','width=0/1','width=0/1','width=0/1','width=0/1','width=0/1']},
  {bits:  5, name: 'rs1', attr:['addr','addr','addr','addr','addr','addr','addr','addr']},
  {bits:  5, name: 'rs2', attr:['src','src','src','src','src','src','src','src']},
  {bits:  1, name: 'rl'},
  {bits:  1, name: 'aq', attr:['ordering','ordering','ordering','ordering','ordering','ordering','ordering','ordering']},
  {bits:  5, name: 'funct5', attr:['AMOSWAP.B/H','AMOADD.B/H','AMOAND.B/H','AMOOR.B/H','AMOXOR.B/H','AMOMAX[U].B/H','AMOMIN[U].B/H','AMOCAS.B/H']},
], config:{lanes: 1, hspace:1024}} 
.... 

Byte and halfword AMOs always sign-extend the value placed in `rd`, and ignore
the stem:[XLEN-1:2^{(width + 3)}] bits of the original value in `rs2`. The
`AMOCAS.[B|H]` instructions similarly ignore the stem:[XLEN-1:2^{(width + 3)}]
bits of the original value in `rd`.

Similar to the AMOs specified in the A extension, the Zabha extension mandates
that the address contained in the `rs1` register must be naturally aligned to
the size of the operand. The same exception options as specified in the A
extension are applicable in cases where the address is not naturally aligned.

Similar to the AMOs specified in the A and Zacas extensions, the AMOs in the
Zabha extension optionally provide release consistency semantics, using the `aq`
and `rl` bits, to help implement multiprocessor synchronization.

[NOTE]
====
Zabha omits _byte_ and _halfword_ support for `LR` and `SC` due to low utility.
====