[AArch64] Add all SME2.1 instructions Assembly/Disassembly

This patch adds a new feature flag:
  sme-f16f16 to represent FEAT_SME-F16F16

This patch adds the following instructions:

SME2.1 standalone instructions:
  MOVAZ (array to vector, four registers): Move and zero four ZA single-vector groups to vector registers.
        (array to vector, two registers): Move and zero two ZA single-vector groups to vector registers.
        (tile to vector, four registers): Move and zero four ZA tile slices to vector registers.
        (tile to vector, single): Move and zero ZA tile slice to vector register.
        (tile to vector, two registers): Move and zero two ZA tile slices to vector registers.
  LUTI2 (Strided four registers): Lookup table read with 2-bit indexes.
        (Strided two registers): Lookup table read with 2-bit indexes.
  LUTI4 (Strided four registers): Lookup table read with 4-bit indexes.
        (Strided two registers): Lookup table read with 4-bit indexes.
  ZERO  (double-vector): Zero ZA double-vector groups.
        (quad-vector): Zero ZA quad-vector groups.
        (single-vector): Zero ZA single-vector groups.

SME2p1 and SME-F16F16 (all instructions operate on half-precision elements):
  FADD: Floating-point add multi-vector to ZA array vector accumulators.
  FSUB: Floating-point subtract multi-vector from ZA array vector accumulators.
  FMLA (multiple and indexed vector): Multi-vector floating-point fused multiply-add by indexed element.
       (multiple and single vector): Multi-vector floating-point fused multiply-add by vector.
       (multiple vectors): Multi-vector floating-point fused multiply-add.
  FMLS (multiple and indexed vector): Multi-vector floating-point fused multiply-subtract by indexed element.
       (multiple and single vector): Multi-vector floating-point fused multiply-subtract by vector.
       (multiple vectors): Multi-vector floating-point fused multiply-subtract.
  FCVT (widening): Multi-vector floating-point convert from half-precision to single-precision (in-order).
  FCVTL: Multi-vector floating-point convert from half-precision to deinterleaved single-precision.
  FMOPA (non-widening): Floating-point outer product and accumulate.
  FMOPS (non-widening): Floating-point outer product and subtract.

SME2p1 and B16B16:
  BFADD: BFloat16 floating-point add multi-vector to ZA array vector accumulators.
  BFSUB: BFloat16 floating-point subtract multi-vector from ZA array vector accumulators.
  BFCLAMP: Multi-vector BFloat16 floating-point clamp to minimum/maximum number.
  BFMLA (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-add by indexed element.
        (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-add by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-add.
  BFMLS (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-subtract by indexed element.
        (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-subtract by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-subtract.
  BFMAX (multiple and single vector): Multi-vector BFloat16 floating-point maximum by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point maximum.
  BFMAXNM (multiple and single vector): Multi-vector BFloat16 floating-point maximum number by vector.
          (multiple vectors): Multi-vector BFloat16 floating-point maximum number.
  BFMIN (multiple and single vector): Multi-vector BFloat16 floating-point minimum by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point minimum.
  BFMINNM (multiple and single vector): Multi-vector BFloat16 floating-point minimum number by vector.
          (multiple vectors): Multi-vector BFloat16 floating-point minimum number.
  BFMOPA (non-widening): BFloat16 floating-point outer product and accumulate.
  BFMOPS (non-widening): BFloat16 floating-point outer product and subtract.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09

Differential Revision: https://reviews.llvm.org/D137571
author: Caroline Concatto <caroline.concatto@arm.com> 2022-11-03 12:18:20 +0000
committer: Caroline Concatto <caroline.concatto@arm.com> 2022-11-14 14:56:16 +0000
commit: 3eacda4547c59c3daa2daf275321c8013eb485cd (patch)
tree: 99a6998d6c4d7a92621cb7626d8b673032288f54 /llvm/unittests/Support/TargetParserTest.cpp
parent: 458ae539dffd0ec2c02d3f4121b65b54bfd655ab (diff)
download: llvm-3eacda4547c59c3daa2daf275321c8013eb485cd.zip
llvm-3eacda4547c59c3daa2daf275321c8013eb485cd.tar.gz
llvm-3eacda4547c59c3daa2daf275321c8013eb485cd.tar.bz2
1 files changed, 3 insertions, 1 deletions
diff --git a/llvm/unittests/Support/TargetParserTest.cpp b/llvm/unittests/Support/TargetParserTest.cpp
index d94005e..8cd6b58 100644
--- a/llvm/unittests/Support/TargetParserTest.cpp
+++ b/llvm/unittests/Support/TargetParserTest.cpp
@@ -1598,7 +1598,7 @@ TEST(TargetParserTest, AArch64ExtensionFeatures) {
       AArch64::AEK_SME,     AArch64::AEK_SMEF64F64, AArch64::AEK_SMEI16I64,
       AArch64::AEK_SME2,    AArch64::AEK_HBC,      AArch64::AEK_MOPS,
       AArch64::AEK_PERFMON, AArch64::AEK_SVE2p1,   AArch64::AEK_SME2p1,
-      AArch64::AEK_B16B16};
+      AArch64::AEK_B16B16,  AArch64::AEK_SMEF16F16};
 
   std::vector<StringRef> Features;
 
@@ -1657,6 +1657,7 @@ TEST(TargetParserTest, AArch64ExtensionFeatures) {
   EXPECT_TRUE(llvm::is_contained(Features, "+sme"));
   EXPECT_TRUE(llvm::is_contained(Features, "+sme-f64f64"));
   EXPECT_TRUE(llvm::is_contained(Features, "+sme-i16i64"));
+  EXPECT_TRUE(llvm::is_contained(Features, "+sme-f16f16"));
   EXPECT_TRUE(llvm::is_contained(Features, "+sme2"));
   EXPECT_TRUE(llvm::is_contained(Features, "+sme2p1"));
   EXPECT_TRUE(llvm::is_contained(Features, "+hbc"));
@@ -1739,6 +1740,7 @@ TEST(TargetParserTest, AArch64ArchExtFeature) {
       {"sme", "nosme", "+sme", "-sme"},
       {"sme-f64f64", "nosme-f64f64", "+sme-f64f64", "-sme-f64f64"},
       {"sme-i16i64", "nosme-i16i64", "+sme-i16i64", "-sme-i16i64"},
+      {"sme-f16f16", "nosme-f16f16", "+sme-f16f16", "-sme-f16f16"},
       {"sme2", "nosme2", "+sme2", "-sme2"},
       {"sme2p1", "nosme2p1", "+sme2p1", "-sme2p1"},
       {"hbc", "nohbc", "+hbc", "-hbc"},
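
The rows visible in the AArch64ArchExtFeature hunk above each list four spellings of one extension: the name, the "no"-prefixed name, and the "+"/"-" feature strings. The sketch below reproduces that pattern for the new sme-f16f16 entry using only the standard library. ExtSpellings and makeSpellings are hypothetical names for illustration and are not LLVM APIs. It also assumes the feature string is the extension name with a sign prefix, which holds for the rows shown here but may not hold for every row of the full table.

```cpp
#include <string>

// Hypothetical illustration type, not part of LLVM: the four spellings that
// one row of the test table lists for an architecture extension.
struct ExtSpellings {
  std::string Name;       // e.g. "sme-f16f16"
  std::string NegName;    // e.g. "nosme-f16f16"
  std::string Feature;    // e.g. "+sme-f16f16"
  std::string NegFeature; // e.g. "-sme-f16f16"
};

// Builds a row under the assumption that the feature string is simply the
// extension name with a '+' or '-' prefix.
inline ExtSpellings makeSpellings(const std::string &Name) {
  return ExtSpellings{Name, "no" + Name, "+" + Name, "-" + Name};
}
```

Calling makeSpellings("sme-f16f16") yields the same four strings as the row added by this patch.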