diff options
author | Richard Henderson <richard.henderson@linaro.org> | 2024-07-01 10:41:45 -0700 |
---|---|---|
committer | Richard Henderson <richard.henderson@linaro.org> | 2024-07-01 10:41:45 -0700 |
commit | c80a339587fe4148292c260716482dd2f86d4476 (patch) | |
tree | d98724b8d23d91da62f2da44905f0dc5ee2cf031 | |
parent | 1152a0414944f03231f3177207d379d58125890e (diff) | |
parent | 58c782de557beb496bfb4c5ade721bbbd2480c72 (diff) | |
download | qemu-c80a339587fe4148292c260716482dd2f86d4476.zip qemu-c80a339587fe4148292c260716482dd2f86d4476.tar.gz qemu-c80a339587fe4148292c260716482dd2f86d4476.tar.bz2 |
Merge tag 'pull-target-arm-20240701' of https://git.linaro.org/people/pmaydell/qemu-arm into staging
target-arm queue:
* tests/avocado: update firmware for sbsa-ref and use all cores
* hw/arm/smmu-common: Replace smmu_iommu_mr with smmu_find_sdev
* arm: Fix VCMLA Dd, Dn, Dm[idx]
* arm: Fix SQDMULH (by element) with Q=0
* arm: Fix FJCVTZS vs flush-to-zero
* arm: More conversion of A64 AdvSIMD to decodetree
* arm: Enable FEAT_Debugv8p8 for -cpu max
* MAINTAINERS: Update family name for Patrick Leis
* hw/arm/xilinx_zynq: Add boot-mode property
* docs/system/arm: Add a doc for zynq board
* hw/misc: In STM32L4x5 EXTI, correct configurable interrupts
* tests/qtest: fix minor issues in STM32L4x5 tests
# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmaC1BMZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3nDOEACCoewjO2FJ4RFXMSmgr0Zf
# jxWliu7osw7oeG4ZNq1+xMiXeW0vyS54eW41TMki1f98N/yK8v55BM8kBBvDvZaz
# R5DUXpN+MtwD9A62md3B2c4mFXHqk1UOGbKi4btbtFj4lS8pV51mPmApzBUr2iTj
# w6dCLciLOt87NWgtLECXsZ3evn+VlTRc+Hmfp1M/C/Rf2Qx3zis/CFHGQsZLGwzG
# 2WhTpU1BKeOfsQa1VbSX6un14d72/JATFZN3rSgMbOEbvsCEeP+rnkzX57ejGyxV
# 4DUx69gEAqS5bOfkQHLwy82WsunD/oIgp+GpYaYgINHzh6UkEsPoymrHAaPgV1Vh
# g0TaBtbv2p89RFY1C2W2Mi4ICQ14a+oIV9FPvDsOE8Wq+wDAy/ZxZs7G6flxqods
# s4JvcMqB3kUNBZaMsFVXTKdqT1PufICS+gx0VsKdKDwXcOHwMS10nTlEOPzqvoBA
# phAsEbjnjWVhf03XTfCus+l5NT96lswCzPcUovb3CitSc2A1KUye3TyzHnxIqmOt
# Owcl+Oiso++cgYzr/BCveTAYKYoRZzVcq5jCl4bBUH/8sLrRDbT0cpFpcMk72eE9
# VhR00kbkDfL3nKrulLsG8FeUlisX5+oGb3G5AdPtU9sqJPJMmBGaF+KniI0wi7VN
# 5teHq08upLMF5JAjiKzZIA==
# =faXD
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 01 Jul 2024 09:06:43 AM PDT
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg: aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
* tag 'pull-target-arm-20240701' of https://git.linaro.org/people/pmaydell/qemu-arm: (29 commits)
tests/qtest: Ensure STM32L4x5 EXTI state is correct at the end of QTests
hw/misc: In STM32L4x5 EXTI, correct configurable interrupts
tests/qtest: Fix STM32L4x5 SYSCFG irq line 15 state assumption
docs/system/arm: Add a doc for zynq board
hw/arm/xilinx_zynq: Add boot-mode property
hw/misc/zynq_slcr: Add boot-mode property
MAINTAINERS: Update my family name
target/arm: Enable FEAT_Debugv8p8 for -cpu max
target/arm: Move initialization of debug ID registers
target/arm: Fix indentation
target/arm: Delete dead code from disas_simd_indexed
target/arm: Convert FCMLA to decodetree
target/arm: Convert FCADD to decodetree
target/arm: Add data argument to do_fp3_vector
target/arm: Convert BFMMLA, SMMLA, UMMLA, USMMLA to decodetree
target/arm: Convert BFMLALB, BFMLALT to decodetree
target/arm: Convert BFDOT to decodetree
target/arm: Convert SUDOT, USDOT to decodetree
target/arm: Convert SDOT, UDOT to decodetree
target/arm: Convert SQRDMLAH, SQRDMLSH to decodetree
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
32 files changed, 967 insertions, 641 deletions
diff --git a/MAINTAINERS b/MAINTAINERS index 19f67dc..6725913 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1033,6 +1033,7 @@ F: hw/adc/zynq-xadc.c F: include/hw/misc/zynq_slcr.h F: include/hw/adc/zynq-xadc.h X: hw/ssi/xilinx_* +F: docs/system/arm/xlnx-zynq.rst Xilinx ZynqMP and Versal M: Alistair Francis <alistair@alistair23.me> @@ -2496,7 +2497,7 @@ F: hw/net/tulip.c F: hw/net/tulip.h pca954x -M: Patrick Venture <venture@google.com> +M: Patrick Leis <venture@google.com> S: Maintained F: hw/i2c/i2c_mux_pca954x.c F: include/hw/i2c/i2c_mux_pca954x.h diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst index 1a06a5f..3ab6e72 100644 --- a/docs/system/arm/emulation.rst +++ b/docs/system/arm/emulation.rst @@ -41,6 +41,7 @@ the following architecture extensions: - FEAT_Debugv8p1 (Debug with VHE) - FEAT_Debugv8p2 (Debug changes for v8.2) - FEAT_Debugv8p4 (Debug changes for v8.4) +- FEAT_Debugv8p8 (Debug changes for v8.8) - FEAT_DotProd (Advanced SIMD dot product instructions) - FEAT_DoubleFault (Double Fault Extension) - FEAT_E0PD (Preventing EL0 access to halves of address maps) diff --git a/docs/system/arm/xlnx-zynq.rst b/docs/system/arm/xlnx-zynq.rst new file mode 100644 index 0000000..ade18a3 --- /dev/null +++ b/docs/system/arm/xlnx-zynq.rst @@ -0,0 +1,47 @@ +Xilinx Zynq board (``xilinx-zynq-a9``) +====================================== +The Zynq 7000 family is based on the AMD SoC architecture. These products +integrate a feature-rich dual or single-core Arm Cortex-A9 MPCore based +processing system (PS) and AMD programmable logic (PL) in a single device. + +More details here: +https://docs.amd.com/r/en-US/ug585-zynq-7000-SoC-TRM/Zynq-7000-SoC-Technical-Reference-Manual + +QEMU xilinx-zynq-a9 board supports following devices: + - A9 MPCORE + - cortex-a9 + - GIC v1 + - Generic timer + - wdt + - OCM 256KB + - SMC SRAM@0xe2000000 64MB + - Zynq SLCR + - SPI x2 + - QSPI + - UART + - TTC x2 + - Gigabit Ethernet Controller x2 + - SD Controller x2 + - XADC + - Arm PrimeCell DMA Controller + - DDR Memory + - USB 2.0 x2 + +Running +""""""" +Direct Linux boot of a generic ARM upstream Linux kernel: + +.. code-block:: bash + + $ qemu-system-aarch64 -M xilinx-zynq-a9 \ + -dtb zynq-zc702.dtb -serial null -serial mon:stdio \ + -display none -m 1024 \ + -initrd rootfs.cpio.gz -kernel zImage + +For configuring the boot-mode provide the following on the command line: + +.. code-block:: bash + + -machine boot-mode=qspi + +Supported values are jtag, sd, qspi, nor. diff --git a/docs/system/target-arm.rst b/docs/system/target-arm.rst index 870d30e..7b99272 100644 --- a/docs/system/target-arm.rst +++ b/docs/system/target-arm.rst @@ -109,6 +109,7 @@ undocumented; you can get a complete list by running arm/virt arm/xenpvh arm/xlnx-versal-virt + arm/xlnx-zynq Emulated CPU architecture support ================================= diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c index 1695d8b..ac153a9 100644 --- a/hw/arm/bcm2835_peripherals.c +++ b/hw/arm/bcm2835_peripherals.c @@ -116,6 +116,10 @@ static void raspi_peripherals_base_init(Object *obj) object_property_add_const_link(OBJECT(&s->fb), "dma-mr", OBJECT(&s->gpu_bus_mr)); + /* OTP */ + object_initialize_child(obj, "bcm2835-otp", &s->otp, + TYPE_BCM2835_OTP); + /* Property channel */ object_initialize_child(obj, "property", &s->property, TYPE_BCM2835_PROPERTY); @@ -128,6 +132,8 @@ static void raspi_peripherals_base_init(Object *obj) OBJECT(&s->fb)); object_property_add_const_link(OBJECT(&s->property), "dma-mr", OBJECT(&s->gpu_bus_mr)); + object_property_add_const_link(OBJECT(&s->property), "otp", + OBJECT(&s->otp)); /* Extended Mass Media Controller */ object_initialize_child(obj, "sdhci", &s->sdhci, TYPE_SYSBUS_SDHCI); @@ -374,6 +380,14 @@ void bcm_soc_peripherals_common_realize(DeviceState *dev, Error **errp) sysbus_connect_irq(SYS_BUS_DEVICE(&s->fb), 0, qdev_get_gpio_in(DEVICE(&s->mboxes), MBOX_CHAN_FB)); + /* OTP */ + if (!sysbus_realize(SYS_BUS_DEVICE(&s->otp), errp)) { + return; + } + + memory_region_add_subregion(&s->peri_mr, OTP_OFFSET, + sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->otp), 0)); + /* Property channel */ if (!sysbus_realize(SYS_BUS_DEVICE(&s->property), errp)) { return; @@ -500,7 +514,6 @@ void bcm_soc_peripherals_common_realize(DeviceState *dev, Error **errp) create_unimp(s, &s->i2s, "bcm2835-i2s", I2S_OFFSET, 0x100); create_unimp(s, &s->smi, "bcm2835-smi", SMI_OFFSET, 0x100); create_unimp(s, &s->bscsl, "bcm2835-spis", BSC_SL_OFFSET, 0x100); - create_unimp(s, &s->otp, "bcm2835-otp", OTP_OFFSET, 0x80); create_unimp(s, &s->dbus, "bcm2835-dbus", DBUS_OFFSET, 0x8000); create_unimp(s, &s->ave0, "bcm2835-ave0", AVE0_OFFSET, 0x8000); create_unimp(s, &s->v3d, "bcm2835-v3d", V3D_OFFSET, 0x1000); diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c index 1ce706b..b6601cc 100644 --- a/hw/arm/smmu-common.c +++ b/hw/arm/smmu-common.c @@ -620,20 +620,16 @@ static const PCIIOMMUOps smmu_ops = { .get_address_space = smmu_find_add_as, }; -IOMMUMemoryRegion *smmu_iommu_mr(SMMUState *s, uint32_t sid) +SMMUDevice *smmu_find_sdev(SMMUState *s, uint32_t sid) { uint8_t bus_n, devfn; SMMUPciBus *smmu_bus; - SMMUDevice *smmu; bus_n = PCI_BUS_NUM(sid); smmu_bus = smmu_find_smmu_pcibus(s, bus_n); if (smmu_bus) { devfn = SMMU_PCI_DEVFN(sid); - smmu = smmu_bus->pbdev[devfn]; - if (smmu) { - return &smmu->iommu; - } + return smmu_bus->pbdev[devfn]; } return NULL; } diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c index 2d1e0d5..445e04d 100644 --- a/hw/arm/smmuv3.c +++ b/hw/arm/smmuv3.c @@ -1218,20 +1218,18 @@ static int smmuv3_cmdq_consume(SMMUv3State *s) case SMMU_CMD_CFGI_STE: { uint32_t sid = CMD_SID(&cmd); - IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid); - SMMUDevice *sdev; + SMMUDevice *sdev = smmu_find_sdev(bs, sid); if (CMD_SSEC(&cmd)) { cmd_error = SMMU_CERROR_ILL; break; } - if (!mr) { + if (!sdev) { break; } trace_smmuv3_cmdq_cfgi_ste(sid); - sdev = container_of(mr, SMMUDevice, iommu); smmuv3_flush_config(sdev); break; @@ -1260,20 +1258,18 @@ static int smmuv3_cmdq_consume(SMMUv3State *s) case SMMU_CMD_CFGI_CD_ALL: { uint32_t sid = CMD_SID(&cmd); - IOMMUMemoryRegion *mr = smmu_iommu_mr(bs, sid); - SMMUDevice *sdev; + SMMUDevice *sdev = smmu_find_sdev(bs, sid); if (CMD_SSEC(&cmd)) { cmd_error = SMMU_CERROR_ILL; break; } - if (!mr) { + if (!sdev) { break; } trace_smmuv3_cmdq_cfgi_cd(sid); - sdev = container_of(mr, SMMUDevice, iommu); smmuv3_flush_config(sdev); break; } diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c index c79661b..3c56b9a 100644 --- a/hw/arm/xilinx_zynq.c +++ b/hw/arm/xilinx_zynq.c @@ -38,6 +38,7 @@ #include "qom/object.h" #include "exec/tswap.h" #include "target/arm/cpu-qom.h" +#include "qapi/visitor.h" #define TYPE_ZYNQ_MACHINE MACHINE_TYPE_NAME("xilinx-zynq-a9") OBJECT_DECLARE_SIMPLE_TYPE(ZynqMachineState, ZYNQ_MACHINE) @@ -90,6 +91,7 @@ struct ZynqMachineState { MachineState parent; Clock *ps_clk; ARMCPU *cpu[ZYNQ_MAX_CPUS]; + uint8_t boot_mode; }; static void zynq_write_board_setup(ARMCPU *cpu, @@ -176,6 +178,27 @@ static inline int zynq_init_spi_flashes(uint32_t base_addr, qemu_irq irq, return unit; } +static void zynq_set_boot_mode(Object *obj, const char *str, + Error **errp) +{ + ZynqMachineState *m = ZYNQ_MACHINE(obj); + uint8_t mode = 0; + + if (!strncasecmp(str, "qspi", 4)) { + mode = 1; + } else if (!strncasecmp(str, "sd", 2)) { + mode = 5; + } else if (!strncasecmp(str, "nor", 3)) { + mode = 2; + } else if (!strncasecmp(str, "jtag", 4)) { + mode = 0; + } else { + error_setg(errp, "%s boot mode not supported", str); + return; + } + m->boot_mode = mode; +} + static void zynq_init(MachineState *machine) { ZynqMachineState *zynq_machine = ZYNQ_MACHINE(machine); @@ -241,6 +264,7 @@ static void zynq_init(MachineState *machine) /* Create slcr, keep a pointer to connect clocks */ slcr = qdev_new("xilinx-zynq_slcr"); qdev_connect_clock_in(slcr, "ps_clk", zynq_machine->ps_clk); + qdev_prop_set_uint8(slcr, "boot-mode", zynq_machine->boot_mode); sysbus_realize_and_unref(SYS_BUS_DEVICE(slcr), &error_fatal); sysbus_mmio_map(SYS_BUS_DEVICE(slcr), 0, 0xF8000000); @@ -373,6 +397,7 @@ static void zynq_machine_class_init(ObjectClass *oc, void *data) NULL }; MachineClass *mc = MACHINE_CLASS(oc); + ObjectProperty *prop; mc->desc = "Xilinx Zynq Platform Baseboard for Cortex-A9"; mc->init = zynq_init; mc->max_cpus = ZYNQ_MAX_CPUS; @@ -380,6 +405,12 @@ static void zynq_machine_class_init(ObjectClass *oc, void *data) mc->ignore_memory_transaction_failures = true; mc->valid_cpu_types = valid_cpu_types; mc->default_ram_id = "zynq.ext_ram"; + prop = object_class_property_add_str(oc, "boot-mode", NULL, + zynq_set_boot_mode); + object_class_property_set_description(oc, "boot-mode", + "Supported boot modes:" + " jtag qspi sd nor"); + object_property_set_default_str(prop, "qspi"); } static const TypeInfo zynq_machine_type = { diff --git a/hw/misc/bcm2835_property.c b/hw/misc/bcm2835_property.c index bdd9a6b..63de3db 100644 --- a/hw/misc/bcm2835_property.c +++ b/hw/misc/bcm2835_property.c @@ -32,6 +32,7 @@ static void bcm2835_property_mbox_push(BCM2835PropertyState *s, uint32_t value) uint32_t tmp; int n; uint32_t offset, length, color; + uint32_t start_num, number, otp_row; /* * Copy the current state of the framebuffer config; we will update @@ -322,6 +323,89 @@ static void bcm2835_property_mbox_push(BCM2835PropertyState *s, uint32_t value) 0); resplen = VCHI_BUSADDR_SIZE; break; + + /* Customer OTP */ + + case RPI_FWREQ_GET_CUSTOMER_OTP: + start_num = ldl_le_phys(&s->dma_as, value + 12); + number = ldl_le_phys(&s->dma_as, value + 16); + + resplen = 8 + 4 * number; + + for (n = start_num; n < start_num + number && + n < BCM2835_OTP_CUSTOMER_OTP_LEN; n++) { + otp_row = bcm2835_otp_get_row(s->otp, + BCM2835_OTP_CUSTOMER_OTP + n); + stl_le_phys(&s->dma_as, + value + 20 + ((n - start_num) << 2), otp_row); + } + break; + case RPI_FWREQ_SET_CUSTOMER_OTP: + start_num = ldl_le_phys(&s->dma_as, value + 12); + number = ldl_le_phys(&s->dma_as, value + 16); + + resplen = 4; + + /* Magic numbers to permanently lock customer OTP */ + if (start_num == BCM2835_OTP_LOCK_NUM1 && + number == BCM2835_OTP_LOCK_NUM2) { + bcm2835_otp_set_row(s->otp, + BCM2835_OTP_ROW_32, + BCM2835_OTP_ROW_32_LOCK); + break; + } + + /* If row 32 has the lock bit, don't allow further writes */ + if (bcm2835_otp_get_row(s->otp, BCM2835_OTP_ROW_32) & + BCM2835_OTP_ROW_32_LOCK) { + break; + } + + for (n = start_num; n < start_num + number && + n < BCM2835_OTP_CUSTOMER_OTP_LEN; n++) { + otp_row = ldl_le_phys(&s->dma_as, + value + 20 + ((n - start_num) << 2)); + bcm2835_otp_set_row(s->otp, + BCM2835_OTP_CUSTOMER_OTP + n, otp_row); + } + break; + + /* Device-specific private key */ + + case RPI_FWREQ_GET_PRIVATE_KEY: + start_num = ldl_le_phys(&s->dma_as, value + 12); + number = ldl_le_phys(&s->dma_as, value + 16); + + resplen = 8 + 4 * number; + + for (n = start_num; n < start_num + number && + n < BCM2835_OTP_PRIVATE_KEY_LEN; n++) { + otp_row = bcm2835_otp_get_row(s->otp, + BCM2835_OTP_PRIVATE_KEY + n); + stl_le_phys(&s->dma_as, + value + 20 + ((n - start_num) << 2), otp_row); + } + break; + case RPI_FWREQ_SET_PRIVATE_KEY: + start_num = ldl_le_phys(&s->dma_as, value + 12); + number = ldl_le_phys(&s->dma_as, value + 16); + + resplen = 4; + + /* If row 32 has the lock bit, don't allow further writes */ + if (bcm2835_otp_get_row(s->otp, BCM2835_OTP_ROW_32) & + BCM2835_OTP_ROW_32_LOCK) { + break; + } + + for (n = start_num; n < start_num + number && + n < BCM2835_OTP_PRIVATE_KEY_LEN; n++) { + otp_row = ldl_le_phys(&s->dma_as, + value + 20 + ((n - start_num) << 2)); + bcm2835_otp_set_row(s->otp, + BCM2835_OTP_PRIVATE_KEY + n, otp_row); + } + break; default: qemu_log_mask(LOG_UNIMP, "bcm2835_property: unhandled tag 0x%08x\n", tag); @@ -449,6 +533,9 @@ static void bcm2835_property_realize(DeviceState *dev, Error **errp) s->dma_mr = MEMORY_REGION(obj); address_space_init(&s->dma_as, s->dma_mr, TYPE_BCM2835_PROPERTY "-memory"); + obj = object_property_get_link(OBJECT(dev), "otp", &error_abort); + s->otp = BCM2835_OTP(obj); + /* TODO: connect to MAC address of USB NIC device, once we emulate it */ qemu_macaddr_default_if_unset(&s->macaddr); diff --git a/hw/misc/stm32l4x5_exti.c b/hw/misc/stm32l4x5_exti.c index 495a000..6a2ec62 100644 --- a/hw/misc/stm32l4x5_exti.c +++ b/hw/misc/stm32l4x5_exti.c @@ -88,6 +88,7 @@ static void stm32l4x5_exti_reset_hold(Object *obj, ResetType type) s->ftsr[bank] = 0x00000000; s->swier[bank] = 0x00000000; s->pr[bank] = 0x00000000; + s->irq_levels[bank] = 0x00000000; } } @@ -102,27 +103,23 @@ static void stm32l4x5_exti_set_irq(void *opaque, int irq, int level) /* Shift the value to enable access in x2 registers. */ irq %= EXTI_MAX_IRQ_PER_BANK; + if (level == extract32(s->irq_levels[bank], irq, 1)) { + /* No change in IRQ line state: do nothing */ + return; + } + s->irq_levels[bank] = deposit32(s->irq_levels[bank], irq, 1, level); + /* If the interrupt is masked, pr won't be raised */ if (!extract32(s->imr[bank], irq, 1)) { return; } - if (((1 << irq) & s->rtsr[bank]) && level) { - /* Rising Edge */ - s->pr[bank] |= 1 << irq; - qemu_irq_pulse(s->irq[oirq]); - } else if (((1 << irq) & s->ftsr[bank]) && !level) { - /* Falling Edge */ + if ((level && extract32(s->rtsr[bank], irq, 1)) || + (!level && extract32(s->ftsr[bank], irq, 1))) { + s->pr[bank] |= 1 << irq; qemu_irq_pulse(s->irq[oirq]); } - /* - * In the following situations : - * - falling edge but rising trigger selected - * - rising edge but falling trigger selected - * - no trigger selected - * No action is required - */ } static uint64_t stm32l4x5_exti_read(void *opaque, hwaddr addr, @@ -255,8 +252,8 @@ static void stm32l4x5_exti_init(Object *obj) static const VMStateDescription vmstate_stm32l4x5_exti = { .name = TYPE_STM32L4X5_EXTI, - .version_id = 1, - .minimum_version_id = 1, + .version_id = 2, + .minimum_version_id = 2, .fields = (VMStateField[]) { VMSTATE_UINT32_ARRAY(imr, Stm32l4x5ExtiState, EXTI_NUM_REGISTER), VMSTATE_UINT32_ARRAY(emr, Stm32l4x5ExtiState, EXTI_NUM_REGISTER), @@ -264,6 +261,7 @@ static const VMStateDescription vmstate_stm32l4x5_exti = { VMSTATE_UINT32_ARRAY(ftsr, Stm32l4x5ExtiState, EXTI_NUM_REGISTER), VMSTATE_UINT32_ARRAY(swier, Stm32l4x5ExtiState, EXTI_NUM_REGISTER), VMSTATE_UINT32_ARRAY(pr, Stm32l4x5ExtiState, EXTI_NUM_REGISTER), + VMSTATE_UINT32_ARRAY(irq_levels, Stm32l4x5ExtiState, EXTI_NUM_REGISTER), VMSTATE_END_OF_LIST() } }; diff --git a/hw/misc/zynq_slcr.c b/hw/misc/zynq_slcr.c index 3412ff0..ad814c3 100644 --- a/hw/misc/zynq_slcr.c +++ b/hw/misc/zynq_slcr.c @@ -24,6 +24,8 @@ #include "hw/registerfields.h" #include "hw/qdev-clock.h" #include "qom/object.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" #ifndef ZYNQ_SLCR_ERR_DEBUG #define ZYNQ_SLCR_ERR_DEBUG 0 @@ -121,6 +123,7 @@ REG32(RST_REASON, 0x250) REG32(REBOOT_STATUS, 0x258) REG32(BOOT_MODE, 0x25c) + FIELD(BOOT_MODE, BOOT_MODE, 0, 4) REG32(APU_CTRL, 0x300) REG32(WDT_CLK_SEL, 0x304) @@ -195,6 +198,7 @@ struct ZynqSLCRState { Clock *ps_clk; Clock *uart0_ref_clk; Clock *uart1_ref_clk; + uint8_t boot_mode; }; /* @@ -371,7 +375,7 @@ static void zynq_slcr_reset_init(Object *obj, ResetType type) s->regs[R_FPGA_RST_CTRL] = 0x01F33F0F; s->regs[R_RST_REASON] = 0x00000040; - s->regs[R_BOOT_MODE] = 0x00000001; + s->regs[R_BOOT_MODE] = s->boot_mode & R_BOOT_MODE_BOOT_MODE_MASK; /* 0x700 - 0x7D4 */ for (i = 0; i < 54; i++) { @@ -588,6 +592,15 @@ static const ClockPortInitArray zynq_slcr_clocks = { QDEV_CLOCK_END }; +static void zynq_slcr_realize(DeviceState *dev, Error **errp) +{ + ZynqSLCRState *s = ZYNQ_SLCR(dev); + + if (s->boot_mode > 0xF) { + error_setg(errp, "Invalid boot mode %d specified", s->boot_mode); + } +} + static void zynq_slcr_init(Object *obj) { ZynqSLCRState *s = ZYNQ_SLCR(obj); @@ -610,15 +623,22 @@ static const VMStateDescription vmstate_zynq_slcr = { } }; +static Property zynq_slcr_props[] = { + DEFINE_PROP_UINT8("boot-mode", ZynqSLCRState, boot_mode, 1), + DEFINE_PROP_END_OF_LIST(), +}; + static void zynq_slcr_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); ResettableClass *rc = RESETTABLE_CLASS(klass); dc->vmsd = &vmstate_zynq_slcr; + dc->realize = zynq_slcr_realize; rc->phases.enter = zynq_slcr_reset_init; rc->phases.hold = zynq_slcr_reset_hold; rc->phases.exit = zynq_slcr_reset_exit; + device_class_set_props(dc, zynq_slcr_props); } static const TypeInfo zynq_slcr_info = { diff --git a/hw/nvram/bcm2835_otp.c b/hw/nvram/bcm2835_otp.c new file mode 100644 index 0000000..c4aed28 --- /dev/null +++ b/hw/nvram/bcm2835_otp.c @@ -0,0 +1,187 @@ +/* + * BCM2835 One-Time Programmable (OTP) Memory + * + * The OTP implementation is mostly a stub except for the OTP rows + * which are accessed directly by other peripherals such as the mailbox. + * + * The OTP registers are unimplemented due to lack of documentation. + * + * Copyright (c) 2024 Rayhan Faizel <rayhan.faizel@gmail.com> + * + * SPDX-License-Identifier: MIT + */ + +#include "qemu/osdep.h" +#include "qemu/log.h" +#include "hw/nvram/bcm2835_otp.h" +#include "migration/vmstate.h" + +/* OTP rows are 1-indexed */ +uint32_t bcm2835_otp_get_row(BCM2835OTPState *s, unsigned int row) +{ + assert(row <= BCM2835_OTP_ROW_COUNT && row >= 1); + + return s->otp_rows[row - 1]; +} + +void bcm2835_otp_set_row(BCM2835OTPState *s, unsigned int row, + uint32_t value) +{ + assert(row <= BCM2835_OTP_ROW_COUNT && row >= 1); + + /* Real OTP rows work as e-fuses */ + s->otp_rows[row - 1] |= value; +} + +static uint64_t bcm2835_otp_read(void *opaque, hwaddr addr, unsigned size) +{ + switch (addr) { + case BCM2835_OTP_BOOTMODE_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_BOOTMODE_REG\n"); + break; + case BCM2835_OTP_CONFIG_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_CONFIG_REG\n"); + break; + case BCM2835_OTP_CTRL_LO_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_CTRL_LO_REG\n"); + break; + case BCM2835_OTP_CTRL_HI_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_CTRL_HI_REG\n"); + break; + case BCM2835_OTP_STATUS_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_STATUS_REG\n"); + break; + case BCM2835_OTP_BITSEL_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_BITSEL_REG\n"); + break; + case BCM2835_OTP_DATA_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_DATA_REG\n"); + break; + case BCM2835_OTP_ADDR_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_ADDR_REG\n"); + break; + case BCM2835_OTP_WRITE_DATA_READ_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_WRITE_DATA_READ_REG\n"); + break; + case BCM2835_OTP_INIT_STATUS_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_INIT_STATUS_REG\n"); + break; + default: + qemu_log_mask(LOG_GUEST_ERROR, + "%s: Bad offset 0x%" HWADDR_PRIx "\n", __func__, addr); + } + + return 0; +} + +static void bcm2835_otp_write(void *opaque, hwaddr addr, + uint64_t value, unsigned int size) +{ + switch (addr) { + case BCM2835_OTP_BOOTMODE_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_BOOTMODE_REG\n"); + break; + case BCM2835_OTP_CONFIG_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_CONFIG_REG\n"); + break; + case BCM2835_OTP_CTRL_LO_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_CTRL_LO_REG\n"); + break; + case BCM2835_OTP_CTRL_HI_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_CTRL_HI_REG\n"); + break; + case BCM2835_OTP_STATUS_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_STATUS_REG\n"); + break; + case BCM2835_OTP_BITSEL_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_BITSEL_REG\n"); + break; + case BCM2835_OTP_DATA_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_DATA_REG\n"); + break; + case BCM2835_OTP_ADDR_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_ADDR_REG\n"); + break; + case BCM2835_OTP_WRITE_DATA_READ_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_WRITE_DATA_READ_REG\n"); + break; + case BCM2835_OTP_INIT_STATUS_REG: + qemu_log_mask(LOG_UNIMP, + "bcm2835_otp: BCM2835_OTP_INIT_STATUS_REG\n"); + break; + default: + qemu_log_mask(LOG_GUEST_ERROR, + "%s: Bad offset 0x%" HWADDR_PRIx "\n", __func__, addr); + } +} + +static const MemoryRegionOps bcm2835_otp_ops = { + .read = bcm2835_otp_read, + .write = bcm2835_otp_write, + .endianness = DEVICE_NATIVE_ENDIAN, + .impl = { + .min_access_size = 4, + .max_access_size = 4, + }, +}; + +static void bcm2835_otp_realize(DeviceState *dev, Error **errp) +{ + BCM2835OTPState *s = BCM2835_OTP(dev); + memory_region_init_io(&s->iomem, OBJECT(dev), &bcm2835_otp_ops, s, + TYPE_BCM2835_OTP, 0x80); + sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->iomem); + + memset(s->otp_rows, 0x00, sizeof(s->otp_rows)); +} + +static const VMStateDescription vmstate_bcm2835_otp = { + .name = TYPE_BCM2835_OTP, + .version_id = 1, + .minimum_version_id = 1, + .fields = (const VMStateField[]) { + VMSTATE_UINT32_ARRAY(otp_rows, BCM2835OTPState, BCM2835_OTP_ROW_COUNT), + VMSTATE_END_OF_LIST() + } +}; + +static void bcm2835_otp_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + + dc->realize = bcm2835_otp_realize; + dc->vmsd = &vmstate_bcm2835_otp; +} + +static const TypeInfo bcm2835_otp_info = { + .name = TYPE_BCM2835_OTP, + .parent = TYPE_SYS_BUS_DEVICE, + .instance_size = sizeof(BCM2835OTPState), + .class_init = bcm2835_otp_class_init, +}; + +static void bcm2835_otp_register_types(void) +{ + type_register_static(&bcm2835_otp_info); +} + +type_init(bcm2835_otp_register_types) diff --git a/hw/nvram/meson.build b/hw/nvram/meson.build index 4996c72..10f3639 100644 --- a/hw/nvram/meson.build +++ b/hw/nvram/meson.build @@ -1,5 +1,6 @@ system_ss.add(files('fw_cfg-interface.c')) system_ss.add(files('fw_cfg.c')) +system_ss.add(when: 'CONFIG_RASPI', if_true: files('bcm2835_otp.c')) system_ss.add(when: 'CONFIG_CHRP_NVRAM', if_true: files('chrp_nvram.c')) system_ss.add(when: 'CONFIG_DS1225Y', if_true: files('ds1225y.c')) system_ss.add(when: 'CONFIG_NMC93XX_EEPROM', if_true: files('eeprom93xx.c')) diff --git a/include/hw/arm/bcm2835_peripherals.h b/include/hw/arm/bcm2835_peripherals.h index 636203b..1eeaeec 100644 --- a/include/hw/arm/bcm2835_peripherals.h +++ b/include/hw/arm/bcm2835_peripherals.h @@ -33,6 +33,7 @@ #include "hw/usb/hcd-dwc2.h" #include "hw/ssi/bcm2835_spi.h" #include "hw/i2c/bcm2835_i2c.h" +#include "hw/nvram/bcm2835_otp.h" #include "hw/misc/unimp.h" #include "qom/object.h" @@ -71,7 +72,7 @@ struct BCMSocPeripheralBaseState { BCM2835SPIState spi[1]; BCM2835I2CState i2c[3]; OrIRQState orgated_i2c_irq; - UnimplementedDeviceState otp; + BCM2835OTPState otp; UnimplementedDeviceState dbus; UnimplementedDeviceState ave0; UnimplementedDeviceState v3d; diff --git a/include/hw/arm/raspberrypi-fw-defs.h b/include/hw/arm/raspberrypi-fw-defs.h index 8b404e0..60b8e5b 100644 --- a/include/hw/arm/raspberrypi-fw-defs.h +++ b/include/hw/arm/raspberrypi-fw-defs.h @@ -56,6 +56,7 @@ enum rpi_firmware_property_tag { RPI_FWREQ_GET_THROTTLED = 0x00030046, RPI_FWREQ_GET_CLOCK_MEASURED = 0x00030047, RPI_FWREQ_NOTIFY_REBOOT = 0x00030048, + RPI_FWREQ_GET_PRIVATE_KEY = 0x00030081, RPI_FWREQ_SET_CLOCK_STATE = 0x00038001, RPI_FWREQ_SET_CLOCK_RATE = 0x00038002, RPI_FWREQ_SET_VOLTAGE = 0x00038003, @@ -73,6 +74,7 @@ enum rpi_firmware_property_tag { RPI_FWREQ_SET_PERIPH_REG = 0x00038045, RPI_FWREQ_GET_POE_HAT_VAL = 0x00030049, RPI_FWREQ_SET_POE_HAT_VAL = 0x00038049, + RPI_FWREQ_SET_PRIVATE_KEY = 0x00038081, RPI_FWREQ_SET_POE_HAT_VAL_OLD = 0x00030050, RPI_FWREQ_NOTIFY_XHCI_RESET = 0x00030058, RPI_FWREQ_GET_REBOOT_FLAGS = 0x00030064, diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h index 5ec2e6c..687b7ca 100644 --- a/include/hw/arm/smmu-common.h +++ b/include/hw/arm/smmu-common.h @@ -182,8 +182,8 @@ int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm, */ SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t iova); -/* Return the iommu mr associated to @sid, or NULL if none */ -IOMMUMemoryRegion *smmu_iommu_mr(SMMUState *s, uint32_t sid); +/* Return the SMMUDevice associated to @sid, or NULL if none */ +SMMUDevice *smmu_find_sdev(SMMUState *s, uint32_t sid); #define SMMU_IOTLB_MAX_SIZE 256 diff --git a/include/hw/misc/bcm2835_property.h b/include/hw/misc/bcm2835_property.h index ba88966..2f93fd0 100644 --- a/include/hw/misc/bcm2835_property.h +++ b/include/hw/misc/bcm2835_property.h @@ -11,6 +11,7 @@ #include "hw/sysbus.h" #include "net/net.h" #include "hw/display/bcm2835_fb.h" +#include "hw/nvram/bcm2835_otp.h" #include "qom/object.h" #define TYPE_BCM2835_PROPERTY "bcm2835-property" @@ -26,6 +27,7 @@ struct BCM2835PropertyState { MemoryRegion iomem; qemu_irq mbox_irq; BCM2835FBState *fbdev; + BCM2835OTPState *otp; MACAddr macaddr; uint32_t board_rev; diff --git a/include/hw/misc/stm32l4x5_exti.h b/include/hw/misc/stm32l4x5_exti.h index be961d2..55f763f 100644 --- a/include/hw/misc/stm32l4x5_exti.h +++ b/include/hw/misc/stm32l4x5_exti.h @@ -45,6 +45,8 @@ struct Stm32l4x5ExtiState { uint32_t swier[EXTI_NUM_REGISTER]; uint32_t pr[EXTI_NUM_REGISTER]; + /* used for edge detection */ + uint32_t irq_levels[EXTI_NUM_REGISTER]; qemu_irq irq[EXTI_NUM_INTERRUPT_OUT_LINES]; }; diff --git a/include/hw/nvram/bcm2835_otp.h b/include/hw/nvram/bcm2835_otp.h new file mode 100644 index 0000000..1df3370 --- /dev/null +++ b/include/hw/nvram/bcm2835_otp.h @@ -0,0 +1,68 @@ +/* + * BCM2835 One-Time Programmable (OTP) Memory + * + * Copyright (c) 2024 Rayhan Faizel <rayhan.faizel@gmail.com> + * + * SPDX-License-Identifier: MIT + */ + +#ifndef BCM2835_OTP_H +#define BCM2835_OTP_H + +#include "hw/sysbus.h" +#include "qom/object.h" + +#define TYPE_BCM2835_OTP "bcm2835-otp" +OBJECT_DECLARE_SIMPLE_TYPE(BCM2835OTPState, BCM2835_OTP) + +#define BCM2835_OTP_ROW_COUNT 66 + +/* https://elinux.org/BCM2835_registers#OTP */ +#define BCM2835_OTP_BOOTMODE_REG 0x00 +#define BCM2835_OTP_CONFIG_REG 0x04 +#define BCM2835_OTP_CTRL_LO_REG 0x08 +#define BCM2835_OTP_CTRL_HI_REG 0x0c +#define BCM2835_OTP_STATUS_REG 0x10 +#define BCM2835_OTP_BITSEL_REG 0x14 +#define BCM2835_OTP_DATA_REG 0x18 +#define BCM2835_OTP_ADDR_REG 0x1c +#define BCM2835_OTP_WRITE_DATA_READ_REG 0x20 +#define BCM2835_OTP_INIT_STATUS_REG 0x24 + + +/* -- Row 32: Undocumented -- */ + +#define BCM2835_OTP_ROW_32 32 + +/* Lock OTP Programming (Customer OTP and private key) */ +#define BCM2835_OTP_ROW_32_LOCK BIT(6) + +/* -- Row 36-43: Customer OTP -- */ + +#define BCM2835_OTP_CUSTOMER_OTP 36 +#define BCM2835_OTP_CUSTOMER_OTP_LEN 8 + +/* Magic numbers to lock programming of customer OTP and private key */ +#define BCM2835_OTP_LOCK_NUM1 0xffffffff +#define BCM2835_OTP_LOCK_NUM2 0xaffe0000 + +/* -- Row 56-63: Device-specific private key -- */ + +#define BCM2835_OTP_PRIVATE_KEY 56 +#define BCM2835_OTP_PRIVATE_KEY_LEN 8 + + +struct BCM2835OTPState { + /* <private> */ + SysBusDevice parent_obj; + + /* <public> */ + MemoryRegion iomem; + uint32_t otp_rows[BCM2835_OTP_ROW_COUNT]; +}; + + +uint32_t bcm2835_otp_get_row(BCM2835OTPState *s, unsigned int row); +void bcm2835_otp_set_row(BCM2835OTPState *s, unsigned int row, uint32_t value); + +#endif diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 3841359..d8eb986 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -2299,6 +2299,8 @@ FIELD(DBGDEVID, DOUBLELOCK, 20, 4) FIELD(DBGDEVID, AUXREGS, 24, 4) FIELD(DBGDEVID, CIDMASK, 28, 4) +FIELD(DBGDEVID1, PCSROFFSET, 0, 4) + FIELD(MVFR0, SIMDREG, 0, 4) FIELD(MVFR0, FPSP, 4, 4) FIELD(MVFR0, FPDP, 8, 4) diff --git a/target/arm/helper.h b/target/arm/helper.h index eca2043..970d059 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -979,6 +979,16 @@ DEF_HELPER_FLAGS_5(neon_sqrdmulh_idx_h, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(neon_sqrdmulh_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqrdmlah_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqrdmlah_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(neon_sqrdmlsh_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(neon_sqrdmlsh_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve2_sqdmulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqdmulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode index 2b7a325..223eac3 100644 --- a/target/arm/tcg/a64.decode +++ b/target/arm/tcg/a64.decode @@ -61,6 +61,7 @@ @qrrr_b . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=0 @qrrr_h . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=1 +@qrrr_s . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=2 @qrrr_sd . q:1 ...... ... rm:5 ...... rn:5 rd:5 &qrrr_e esz=%esz_sd @qrrr_e . q:1 ...... esz:2 . rm:5 ...... rn:5 rd:5 &qrrr_e @qr2r_e . q:1 ...... esz:2 . ..... ...... rm:5 rd:5 &qrrr_e rn=%rd @@ -781,6 +782,8 @@ CMEQ_s 0111 1110 111 ..... 10001 1 ..... ..... @rrr_d SQDMULH_s 0101 1110 ..1 ..... 10110 1 ..... ..... @rrr_e SQRDMULH_s 0111 1110 ..1 ..... 10110 1 ..... ..... @rrr_e +SQRDMLAH_s 0111 1110 ..0 ..... 10000 1 ..... ..... @rrr_e +SQRDMLSH_s 0111 1110 ..0 ..... 10001 1 ..... ..... @rrr_e ### Advanced SIMD scalar pairwise @@ -941,6 +944,23 @@ MLS_v 0.10 1110 ..1 ..... 10010 1 ..... ..... @qrrr_e SQDMULH_v 0.00 1110 ..1 ..... 10110 1 ..... ..... @qrrr_e SQRDMULH_v 0.10 1110 ..1 ..... 10110 1 ..... ..... @qrrr_e +SQRDMLAH_v 0.10 1110 ..0 ..... 10000 1 ..... ..... @qrrr_e +SQRDMLSH_v 0.10 1110 ..0 ..... 10001 1 ..... ..... @qrrr_e + +SDOT_v 0.00 1110 100 ..... 10010 1 ..... ..... @qrrr_s +UDOT_v 0.10 1110 100 ..... 10010 1 ..... ..... @qrrr_s +USDOT_v 0.00 1110 100 ..... 10011 1 ..... ..... @qrrr_s +BFDOT_v 0.10 1110 010 ..... 11111 1 ..... ..... @qrrr_s +BFMLAL_v 0.10 1110 110 ..... 11111 1 ..... ..... @qrrr_h +BFMMLA 0110 1110 010 ..... 11101 1 ..... ..... @rrr_q1e0 +SMMLA 0100 1110 100 ..... 10100 1 ..... ..... @rrr_q1e0 +UMMLA 0110 1110 100 ..... 10100 1 ..... ..... @rrr_q1e0 +USMMLA 0100 1110 100 ..... 10101 1 ..... ..... @rrr_q1e0 + +FCADD_90 0.10 1110 ..0 ..... 11100 1 ..... ..... @qrrr_e +FCADD_270 0.10 1110 ..0 ..... 11110 1 ..... ..... @qrrr_e + +FCMLA_v 0 q:1 10 1110 esz:2 0 rm:5 110 rot:2 1 rn:5 rd:5 ### Advanced SIMD scalar x indexed element @@ -966,6 +986,12 @@ SQDMULH_si 0101 1111 10 .. .... 1100 . 0 ..... ..... @rrx_s SQRDMULH_si 0101 1111 01 .. .... 1101 . 0 ..... ..... @rrx_h SQRDMULH_si 0101 1111 10 . ..... 1101 . 0 ..... ..... @rrx_s +SQRDMLAH_si 0111 1111 01 .. .... 1101 . 0 ..... ..... @rrx_h +SQRDMLAH_si 0111 1111 10 .. .... 1101 . 0 ..... ..... @rrx_s + +SQRDMLSH_si 0111 1111 01 .. .... 1111 . 0 ..... ..... @rrx_h +SQRDMLSH_si 0111 1111 10 .. .... 1111 . 0 ..... ..... @rrx_s + ### Advanced SIMD vector x indexed element FMUL_vi 0.00 1111 00 .. .... 1001 . 0 ..... ..... @qrrx_h @@ -1004,6 +1030,23 @@ SQDMULH_vi 0.00 1111 10 . ..... 1100 . 0 ..... ..... @qrrx_s SQRDMULH_vi 0.00 1111 01 .. .... 1101 . 0 ..... ..... @qrrx_h SQRDMULH_vi 0.00 1111 10 . ..... 1101 . 0 ..... ..... @qrrx_s +SQRDMLAH_vi 0.10 1111 01 .. .... 1101 . 0 ..... ..... @qrrx_h +SQRDMLAH_vi 0.10 1111 10 .. .... 1101 . 0 ..... ..... @qrrx_s + +SQRDMLSH_vi 0.10 1111 01 .. .... 1111 . 0 ..... ..... @qrrx_h +SQRDMLSH_vi 0.10 1111 10 .. .... 1111 . 0 ..... ..... @qrrx_s + +SDOT_vi 0.00 1111 10 .. .... 1110 . 0 ..... ..... @qrrx_s +UDOT_vi 0.10 1111 10 .. .... 1110 . 0 ..... ..... @qrrx_s +SUDOT_vi 0.00 1111 00 .. .... 1111 . 0 ..... ..... @qrrx_s +USDOT_vi 0.00 1111 10 .. .... 1111 . 0 ..... ..... @qrrx_s +BFDOT_vi 0.00 1111 01 .. .... 1111 . 0 ..... ..... @qrrx_s +BFMLAL_vi 0.00 1111 11 .. .... 1111 . 0 ..... ..... @qrrx_h + +FCMLA_vi 0 0 10 1111 01 idx:1 rm:5 0 rot:2 1 0 0 rn:5 rd:5 esz=1 q=0 +FCMLA_vi 0 1 10 1111 01 . rm:5 0 rot:2 1 . 0 rn:5 rd:5 esz=1 idx=%hl q=1 +FCMLA_vi 0 1 10 1111 10 0 rm:5 0 rot:2 1 idx:1 0 rn:5 rd:5 esz=2 q=1 + # Floating-point conditional select FCSEL 0001 1110 .. 1 rm:5 cond:4 11 rn:5 rd:5 esz=%esz_hsd diff --git a/target/arm/tcg/cpu32.c b/target/arm/tcg/cpu32.c index bdd82d9..20c2737 100644 --- a/target/arm/tcg/cpu32.c +++ b/target/arm/tcg/cpu32.c @@ -82,11 +82,39 @@ void aa32_max_features(ARMCPU *cpu) cpu->isar.id_pfr2 = t; t = cpu->isar.id_dfr0; - t = FIELD_DP32(t, ID_DFR0, COPDBG, 9); /* FEAT_Debugv8p4 */ - t = FIELD_DP32(t, ID_DFR0, COPSDBG, 9); /* FEAT_Debugv8p4 */ + t = FIELD_DP32(t, ID_DFR0, COPDBG, 10); /* FEAT_Debugv8p8 */ + t = FIELD_DP32(t, ID_DFR0, COPSDBG, 10); /* FEAT_Debugv8p8 */ t = FIELD_DP32(t, ID_DFR0, PERFMON, 6); /* FEAT_PMUv3p5 */ cpu->isar.id_dfr0 = t; + /* Debug ID registers. */ + + /* Bit[15] is RES1, Bit[13] and Bits[11:0] are RES0. */ + t = 0x00008000; + t = FIELD_DP32(t, DBGDIDR, SE_IMP, 1); + t = FIELD_DP32(t, DBGDIDR, NSUHD_IMP, 1); + t = FIELD_DP32(t, DBGDIDR, VERSION, 10); /* FEAT_Debugv8p8 */ + t = FIELD_DP32(t, DBGDIDR, CTX_CMPS, 1); + t = FIELD_DP32(t, DBGDIDR, BRPS, 5); + t = FIELD_DP32(t, DBGDIDR, WRPS, 3); + cpu->isar.dbgdidr = t; + + t = 0; + t = FIELD_DP32(t, DBGDEVID, PCSAMPLE, 3); + t = FIELD_DP32(t, DBGDEVID, WPADDRMASK, 1); + t = FIELD_DP32(t, DBGDEVID, BPADDRMASK, 15); + t = FIELD_DP32(t, DBGDEVID, VECTORCATCH, 0); + t = FIELD_DP32(t, DBGDEVID, VIRTEXTNS, 1); + t = FIELD_DP32(t, DBGDEVID, DOUBLELOCK, 1); + t = FIELD_DP32(t, DBGDEVID, AUXREGS, 0); + t = FIELD_DP32(t, DBGDEVID, CIDMASK, 0); + cpu->isar.dbgdevid = t; + + /* Bits[31:4] are RES0. */ + t = 0; + t = FIELD_DP32(t, DBGDEVID1, PCSROFFSET, 2); + cpu->isar.dbgdevid1 = t; + t = cpu->isar.id_dfr1; t = FIELD_DP32(t, ID_DFR1, HPMN0, 1); /* FEAT_HPMN0 */ cpu->isar.id_dfr1 = t; @@ -955,9 +983,6 @@ static void arm_max_initfn(Object *obj) cpu->isar.id_isar4 = 0x00011142; cpu->isar.id_isar5 = 0x00011121; cpu->isar.id_isar6 = 0; - cpu->isar.dbgdidr = 0x3516d000; - cpu->isar.dbgdevid = 0x00110f13; - cpu->isar.dbgdevid1 = 0x2; cpu->isar.reset_pmcr_el0 = 0x41013000; cpu->clidr = 0x0a200023; cpu->ccsidr[0] = 0x701fe00a; /* 32KB L1 dcache */ diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c index 0899251..fe232eb 100644 --- a/target/arm/tcg/cpu64.c +++ b/target/arm/tcg/cpu64.c @@ -1167,7 +1167,7 @@ void aarch64_max_tcg_initfn(Object *obj) t = cpu->isar.id_aa64isar2; t = FIELD_DP64(t, ID_AA64ISAR2, MOPS, 1); /* FEAT_MOPS */ - t = FIELD_DP64(t, ID_AA64ISAR2, BC, 1); /* FEAT_HBC */ + t = FIELD_DP64(t, ID_AA64ISAR2, BC, 1); /* FEAT_HBC */ t = FIELD_DP64(t, ID_AA64ISAR2, WFXT, 2); /* FEAT_WFxT */ cpu->isar.id_aa64isar2 = t; @@ -1253,7 +1253,7 @@ void aarch64_max_tcg_initfn(Object *obj) cpu->isar.id_aa64zfr0 = t; t = cpu->isar.id_aa64dfr0; - t = FIELD_DP64(t, ID_AA64DFR0, DEBUGVER, 9); /* FEAT_Debugv8p4 */ + t = FIELD_DP64(t, ID_AA64DFR0, DEBUGVER, 10); /* FEAT_Debugv8p8 */ t = FIELD_DP64(t, ID_AA64DFR0, PMUVER, 6); /* FEAT_PMUv3p5 */ t = FIELD_DP64(t, ID_AA64DFR0, HPMN0, 1); /* FEAT_HPMN0 */ cpu->isar.id_aa64dfr0 = t; diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 93543da..6c07aea 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -5235,6 +5235,43 @@ static const ENVScalar2 f_scalar_sqrdmulh = { }; TRANS(SQRDMULH_s, do_env_scalar2_hs, a, &f_scalar_sqrdmulh) +typedef struct ENVScalar3 { + NeonGenThreeOpEnvFn *gen_hs[2]; +} ENVScalar3; + +static bool do_env_scalar3_hs(DisasContext *s, arg_rrr_e *a, + const ENVScalar3 *f) +{ + TCGv_i32 t0, t1, t2; + + if (a->esz != MO_16 && a->esz != MO_32) { + return false; + } + if (!fp_access_check(s)) { + return true; + } + + t0 = tcg_temp_new_i32(); + t1 = tcg_temp_new_i32(); + t2 = tcg_temp_new_i32(); + read_vec_element_i32(s, t0, a->rn, 0, a->esz); + read_vec_element_i32(s, t1, a->rm, 0, a->esz); + read_vec_element_i32(s, t2, a->rd, 0, a->esz); + f->gen_hs[a->esz - 1](t0, tcg_env, t0, t1, t2); + write_fp_sreg(s, a->rd, t0); + return true; +} + +static const ENVScalar3 f_scalar_sqrdmlah = { + { gen_helper_neon_qrdmlah_s16, gen_helper_neon_qrdmlah_s32 } +}; +TRANS_FEAT(SQRDMLAH_s, aa64_rdm, do_env_scalar3_hs, a, &f_scalar_sqrdmlah) + +static const ENVScalar3 f_scalar_sqrdmlsh = { + { gen_helper_neon_qrdmlsh_s16, gen_helper_neon_qrdmlsh_s32 } +}; +TRANS_FEAT(SQRDMLSH_s, aa64_rdm, do_env_scalar3_hs, a, &f_scalar_sqrdmlsh) + static bool do_cmop_d(DisasContext *s, arg_rrr_e *a, TCGCond cond) { if (fp_access_check(s)) { @@ -5253,7 +5290,7 @@ TRANS(CMHS_s, do_cmop_d, a, TCG_COND_GEU) TRANS(CMEQ_s, do_cmop_d, a, TCG_COND_EQ) TRANS(CMTST_s, do_cmop_d, a, TCG_COND_TSTNE) -static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, +static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, int data, gen_helper_gvec_3_ptr * const fns[3]) { MemOp esz = a->esz; @@ -5276,7 +5313,7 @@ static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a, } if (fp_access_check(s)) { gen_gvec_op3_fpst(s, a->q, a->rd, a->rn, a->rm, - esz == MO_16, 0, fns[esz - 1]); + esz == MO_16, data, fns[esz - 1]); } return true; } @@ -5286,168 +5323,168 @@ static gen_helper_gvec_3_ptr * const f_vector_fadd[3] = { gen_helper_gvec_fadd_s, gen_helper_gvec_fadd_d, }; -TRANS(FADD_v, do_fp3_vector, a, f_vector_fadd) +TRANS(FADD_v, do_fp3_vector, a, 0, f_vector_fadd) static gen_helper_gvec_3_ptr * const f_vector_fsub[3] = { gen_helper_gvec_fsub_h, gen_helper_gvec_fsub_s, gen_helper_gvec_fsub_d, }; -TRANS(FSUB_v, do_fp3_vector, a, f_vector_fsub) +TRANS(FSUB_v, do_fp3_vector, a, 0, f_vector_fsub) static gen_helper_gvec_3_ptr * const f_vector_fdiv[3] = { gen_helper_gvec_fdiv_h, gen_helper_gvec_fdiv_s, gen_helper_gvec_fdiv_d, }; -TRANS(FDIV_v, do_fp3_vector, a, f_vector_fdiv) +TRANS(FDIV_v, do_fp3_vector, a, 0, f_vector_fdiv) static gen_helper_gvec_3_ptr * const f_vector_fmul[3] = { gen_helper_gvec_fmul_h, gen_helper_gvec_fmul_s, gen_helper_gvec_fmul_d, }; -TRANS(FMUL_v, do_fp3_vector, a, f_vector_fmul) +TRANS(FMUL_v, do_fp3_vector, a, 0, f_vector_fmul) static gen_helper_gvec_3_ptr * const f_vector_fmax[3] = { gen_helper_gvec_fmax_h, gen_helper_gvec_fmax_s, gen_helper_gvec_fmax_d, }; -TRANS(FMAX_v, do_fp3_vector, a, f_vector_fmax) +TRANS(FMAX_v, do_fp3_vector, a, 0, f_vector_fmax) static gen_helper_gvec_3_ptr * const f_vector_fmin[3] = { gen_helper_gvec_fmin_h, gen_helper_gvec_fmin_s, gen_helper_gvec_fmin_d, }; -TRANS(FMIN_v, do_fp3_vector, a, f_vector_fmin) +TRANS(FMIN_v, do_fp3_vector, a, 0, f_vector_fmin) static gen_helper_gvec_3_ptr * const f_vector_fmaxnm[3] = { gen_helper_gvec_fmaxnum_h, gen_helper_gvec_fmaxnum_s, gen_helper_gvec_fmaxnum_d, }; -TRANS(FMAXNM_v, do_fp3_vector, a, f_vector_fmaxnm) +TRANS(FMAXNM_v, do_fp3_vector, a, 0, f_vector_fmaxnm) static gen_helper_gvec_3_ptr * const f_vector_fminnm[3] = { gen_helper_gvec_fminnum_h, gen_helper_gvec_fminnum_s, gen_helper_gvec_fminnum_d, }; -TRANS(FMINNM_v, do_fp3_vector, a, f_vector_fminnm) +TRANS(FMINNM_v, do_fp3_vector, a, 0, f_vector_fminnm) static gen_helper_gvec_3_ptr * const f_vector_fmulx[3] = { gen_helper_gvec_fmulx_h, gen_helper_gvec_fmulx_s, gen_helper_gvec_fmulx_d, }; -TRANS(FMULX_v, do_fp3_vector, a, f_vector_fmulx) +TRANS(FMULX_v, do_fp3_vector, a, 0, f_vector_fmulx) static gen_helper_gvec_3_ptr * const f_vector_fmla[3] = { gen_helper_gvec_vfma_h, gen_helper_gvec_vfma_s, gen_helper_gvec_vfma_d, }; -TRANS(FMLA_v, do_fp3_vector, a, f_vector_fmla) +TRANS(FMLA_v, do_fp3_vector, a, 0, f_vector_fmla) static gen_helper_gvec_3_ptr * const f_vector_fmls[3] = { gen_helper_gvec_vfms_h, gen_helper_gvec_vfms_s, gen_helper_gvec_vfms_d, }; -TRANS(FMLS_v, do_fp3_vector, a, f_vector_fmls) +TRANS(FMLS_v, do_fp3_vector, a, 0, f_vector_fmls) static gen_helper_gvec_3_ptr * const f_vector_fcmeq[3] = { gen_helper_gvec_fceq_h, gen_helper_gvec_fceq_s, gen_helper_gvec_fceq_d, }; -TRANS(FCMEQ_v, do_fp3_vector, a, f_vector_fcmeq) +TRANS(FCMEQ_v, do_fp3_vector, a, 0, f_vector_fcmeq) static gen_helper_gvec_3_ptr * const f_vector_fcmge[3] = { gen_helper_gvec_fcge_h, gen_helper_gvec_fcge_s, gen_helper_gvec_fcge_d, }; -TRANS(FCMGE_v, do_fp3_vector, a, f_vector_fcmge) +TRANS(FCMGE_v, do_fp3_vector, a, 0, f_vector_fcmge) static gen_helper_gvec_3_ptr * const f_vector_fcmgt[3] = { gen_helper_gvec_fcgt_h, gen_helper_gvec_fcgt_s, gen_helper_gvec_fcgt_d, }; -TRANS(FCMGT_v, do_fp3_vector, a, f_vector_fcmgt) +TRANS(FCMGT_v, do_fp3_vector, a, 0, f_vector_fcmgt) static gen_helper_gvec_3_ptr * const f_vector_facge[3] = { gen_helper_gvec_facge_h, gen_helper_gvec_facge_s, gen_helper_gvec_facge_d, }; -TRANS(FACGE_v, do_fp3_vector, a, f_vector_facge) +TRANS(FACGE_v, do_fp3_vector, a, 0, f_vector_facge) static gen_helper_gvec_3_ptr * const f_vector_facgt[3] = { gen_helper_gvec_facgt_h, gen_helper_gvec_facgt_s, gen_helper_gvec_facgt_d, }; -TRANS(FACGT_v, do_fp3_vector, a, f_vector_facgt) +TRANS(FACGT_v, do_fp3_vector, a, 0, f_vector_facgt) static gen_helper_gvec_3_ptr * const f_vector_fabd[3] = { gen_helper_gvec_fabd_h, gen_helper_gvec_fabd_s, gen_helper_gvec_fabd_d, }; -TRANS(FABD_v, do_fp3_vector, a, f_vector_fabd) +TRANS(FABD_v, do_fp3_vector, a, 0, f_vector_fabd) static gen_helper_gvec_3_ptr * const f_vector_frecps[3] = { gen_helper_gvec_recps_h, gen_helper_gvec_recps_s, gen_helper_gvec_recps_d, }; -TRANS(FRECPS_v, do_fp3_vector, a, f_vector_frecps) +TRANS(FRECPS_v, do_fp3_vector, a, 0, f_vector_frecps) static gen_helper_gvec_3_ptr * const f_vector_frsqrts[3] = { gen_helper_gvec_rsqrts_h, gen_helper_gvec_rsqrts_s, gen_helper_gvec_rsqrts_d, }; -TRANS(FRSQRTS_v, do_fp3_vector, a, f_vector_frsqrts) +TRANS(FRSQRTS_v, do_fp3_vector, a, 0, f_vector_frsqrts) static gen_helper_gvec_3_ptr * const f_vector_faddp[3] = { gen_helper_gvec_faddp_h, gen_helper_gvec_faddp_s, gen_helper_gvec_faddp_d, }; -TRANS(FADDP_v, do_fp3_vector, a, f_vector_faddp) +TRANS(FADDP_v, do_fp3_vector, a, 0, f_vector_faddp) static gen_helper_gvec_3_ptr * const f_vector_fmaxp[3] = { gen_helper_gvec_fmaxp_h, gen_helper_gvec_fmaxp_s, gen_helper_gvec_fmaxp_d, }; -TRANS(FMAXP_v, do_fp3_vector, a, f_vector_fmaxp) +TRANS(FMAXP_v, do_fp3_vector, a, 0, f_vector_fmaxp) static gen_helper_gvec_3_ptr * const f_vector_fminp[3] = { gen_helper_gvec_fminp_h, gen_helper_gvec_fminp_s, gen_helper_gvec_fminp_d, }; -TRANS(FMINP_v, do_fp3_vector, a, f_vector_fminp) +TRANS(FMINP_v, do_fp3_vector, a, 0, f_vector_fminp) static gen_helper_gvec_3_ptr * const f_vector_fmaxnmp[3] = { gen_helper_gvec_fmaxnump_h, gen_helper_gvec_fmaxnump_s, gen_helper_gvec_fmaxnump_d, }; -TRANS(FMAXNMP_v, do_fp3_vector, a, f_vector_fmaxnmp) +TRANS(FMAXNMP_v, do_fp3_vector, a, 0, f_vector_fmaxnmp) static gen_helper_gvec_3_ptr * const f_vector_fminnmp[3] = { gen_helper_gvec_fminnump_h, gen_helper_gvec_fminnump_s, gen_helper_gvec_fminnump_d, }; -TRANS(FMINNMP_v, do_fp3_vector, a, f_vector_fminnmp) +TRANS(FMINNMP_v, do_fp3_vector, a, 0, f_vector_fminnmp) static bool do_fmlal(DisasContext *s, arg_qrrr_e *a, bool is_s, bool is_2) { @@ -5552,6 +5589,80 @@ TRANS(CMTST_v, do_gvec_fn3, a, gen_gvec_cmtst) TRANS(SQDMULH_v, do_gvec_fn3_no8_no64, a, gen_gvec_sqdmulh_qc) TRANS(SQRDMULH_v, do_gvec_fn3_no8_no64, a, gen_gvec_sqrdmulh_qc) +TRANS_FEAT(SQRDMLAH_v, aa64_rdm, do_gvec_fn3_no8_no64, a, gen_gvec_sqrdmlah_qc) +TRANS_FEAT(SQRDMLSH_v, aa64_rdm, do_gvec_fn3_no8_no64, a, gen_gvec_sqrdmlsh_qc) + +static bool do_dot_vector(DisasContext *s, arg_qrrr_e *a, + gen_helper_gvec_4 *fn) +{ + if (fp_access_check(s)) { + gen_gvec_op4_ool(s, a->q, a->rd, a->rn, a->rm, a->rd, 0, fn); + } + return true; +} + +TRANS_FEAT(SDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_sdot_b) +TRANS_FEAT(UDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_udot_b) +TRANS_FEAT(USDOT_v, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_usdot_b) +TRANS_FEAT(BFDOT_v, aa64_bf16, do_dot_vector, a, gen_helper_gvec_bfdot) +TRANS_FEAT(BFMMLA, aa64_bf16, do_dot_vector, a, gen_helper_gvec_bfmmla) +TRANS_FEAT(SMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_smmla_b) +TRANS_FEAT(UMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_ummla_b) +TRANS_FEAT(USMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_usmmla_b) + +static bool trans_BFMLAL_v(DisasContext *s, arg_qrrr_e *a) +{ + if (!dc_isar_feature(aa64_bf16, s)) { + return false; + } + if (fp_access_check(s)) { + /* Q bit selects BFMLALB vs BFMLALT. */ + gen_gvec_op4_fpst(s, true, a->rd, a->rn, a->rm, a->rd, false, a->q, + gen_helper_gvec_bfmlal); + } + return true; +} + +static gen_helper_gvec_3_ptr * const f_vector_fcadd[3] = { + gen_helper_gvec_fcaddh, + gen_helper_gvec_fcadds, + gen_helper_gvec_fcaddd, +}; +TRANS_FEAT(FCADD_90, aa64_fcma, do_fp3_vector, a, 0, f_vector_fcadd) +TRANS_FEAT(FCADD_270, aa64_fcma, do_fp3_vector, a, 1, f_vector_fcadd) + +static bool trans_FCMLA_v(DisasContext *s, arg_FCMLA_v *a) +{ + gen_helper_gvec_4_ptr *fn; + + if (!dc_isar_feature(aa64_fcma, s)) { + return false; + } + switch (a->esz) { + case MO_64: + if (!a->q) { + return false; + } + fn = gen_helper_gvec_fcmlad; + break; + case MO_32: + fn = gen_helper_gvec_fcmlas; + break; + case MO_16: + if (!dc_isar_feature(aa64_fp16, s)) { + return false; + } + fn = gen_helper_gvec_fcmlah; + break; + default: + return false; + } + if (fp_access_check(s)) { + gen_gvec_op4_fpst(s, a->q, a->rd, a->rn, a->rm, a->rd, + a->esz == MO_16, a->rot, fn); + } + return true; +} /* * Advanced SIMD scalar/vector x indexed element @@ -5681,6 +5792,29 @@ static bool do_env_scalar2_idx_hs(DisasContext *s, arg_rrx_e *a, TRANS(SQDMULH_si, do_env_scalar2_idx_hs, a, &f_scalar_sqdmulh) TRANS(SQRDMULH_si, do_env_scalar2_idx_hs, a, &f_scalar_sqrdmulh) +static bool do_env_scalar3_idx_hs(DisasContext *s, arg_rrx_e *a, + const ENVScalar3 *f) +{ + if (a->esz < MO_16 || a->esz > MO_32) { + return false; + } + if (fp_access_check(s)) { + TCGv_i32 t0 = tcg_temp_new_i32(); + TCGv_i32 t1 = tcg_temp_new_i32(); + TCGv_i32 t2 = tcg_temp_new_i32(); + + read_vec_element_i32(s, t0, a->rn, 0, a->esz); + read_vec_element_i32(s, t1, a->rm, a->idx, a->esz); + read_vec_element_i32(s, t2, a->rd, 0, a->esz); + f->gen_hs[a->esz - 1](t0, tcg_env, t0, t1, t2); + write_fp_sreg(s, a->rd, t0); + } + return true; +} + +TRANS_FEAT(SQRDMLAH_si, aa64_rdm, do_env_scalar3_idx_hs, a, &f_scalar_sqrdmlah) +TRANS_FEAT(SQRDMLSH_si, aa64_rdm, do_env_scalar3_idx_hs, a, &f_scalar_sqrdmlsh) + static bool do_fp3_vector_idx(DisasContext *s, arg_qrrx_e *a, gen_helper_gvec_3_ptr * const fns[3]) { @@ -5838,6 +5972,79 @@ static gen_helper_gvec_4 * const f_vector_idx_sqrdmulh[2] = { }; TRANS(SQRDMULH_vi, do_int3_qc_vector_idx, a, f_vector_idx_sqrdmulh) +static gen_helper_gvec_4 * const f_vector_idx_sqrdmlah[2] = { + gen_helper_neon_sqrdmlah_idx_h, + gen_helper_neon_sqrdmlah_idx_s, +}; +TRANS_FEAT(SQRDMLAH_vi, aa64_rdm, do_int3_qc_vector_idx, a, + f_vector_idx_sqrdmlah) + +static gen_helper_gvec_4 * const f_vector_idx_sqrdmlsh[2] = { + gen_helper_neon_sqrdmlsh_idx_h, + gen_helper_neon_sqrdmlsh_idx_s, +}; +TRANS_FEAT(SQRDMLSH_vi, aa64_rdm, do_int3_qc_vector_idx, a, + f_vector_idx_sqrdmlsh) + +static bool do_dot_vector_idx(DisasContext *s, arg_qrrx_e *a, + gen_helper_gvec_4 *fn) +{ + if (fp_access_check(s)) { + gen_gvec_op4_ool(s, a->q, a->rd, a->rn, a->rm, a->rd, a->idx, fn); + } + return true; +} + +TRANS_FEAT(SDOT_vi, aa64_dp, do_dot_vector_idx, a, gen_helper_gvec_sdot_idx_b) +TRANS_FEAT(UDOT_vi, aa64_dp, do_dot_vector_idx, a, gen_helper_gvec_udot_idx_b) +TRANS_FEAT(SUDOT_vi, aa64_i8mm, do_dot_vector_idx, a, + gen_helper_gvec_sudot_idx_b) +TRANS_FEAT(USDOT_vi, aa64_i8mm, do_dot_vector_idx, a, + gen_helper_gvec_usdot_idx_b) +TRANS_FEAT(BFDOT_vi, aa64_bf16, do_dot_vector_idx, a, + gen_helper_gvec_bfdot_idx) + +static bool trans_BFMLAL_vi(DisasContext *s, arg_qrrx_e *a) +{ + if (!dc_isar_feature(aa64_bf16, s)) { + return false; + } + if (fp_access_check(s)) { + /* Q bit selects BFMLALB vs BFMLALT. */ + gen_gvec_op4_fpst(s, true, a->rd, a->rn, a->rm, a->rd, 0, + (a->idx << 1) | a->q, + gen_helper_gvec_bfmlal_idx); + } + return true; +} + +static bool trans_FCMLA_vi(DisasContext *s, arg_FCMLA_vi *a) +{ + gen_helper_gvec_4_ptr *fn; + + if (!dc_isar_feature(aa64_fcma, s)) { + return false; + } + switch (a->esz) { + case MO_16: + if (!dc_isar_feature(aa64_fp16, s)) { + return false; + } + fn = gen_helper_gvec_fcmlah_idx; + break; + case MO_32: + fn = gen_helper_gvec_fcmlas_idx; + break; + default: + g_assert_not_reached(); + } + if (fp_access_check(s)) { + gen_gvec_op4_fpst(s, a->q, a->rd, a->rn, a->rm, a->rd, + a->esz == MO_16, (a->idx << 2) | a->rot, fn); + } + return true; +} + /* * Advanced SIMD scalar pairwise */ @@ -9536,84 +9743,6 @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn) } } -/* AdvSIMD scalar three same extra - * 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0 - * +-----+---+-----------+------+---+------+---+--------+---+----+----+ - * | 0 1 | U | 1 1 1 1 0 | size | 0 | Rm | 1 | opcode | 1 | Rn | Rd | - * +-----+---+-----------+------+---+------+---+--------+---+----+----+ - */ -static void disas_simd_scalar_three_reg_same_extra(DisasContext *s, - uint32_t insn) -{ - int rd = extract32(insn, 0, 5); - int rn = extract32(insn, 5, 5); - int opcode = extract32(insn, 11, 4); - int rm = extract32(insn, 16, 5); - int size = extract32(insn, 22, 2); - bool u = extract32(insn, 29, 1); - TCGv_i32 ele1, ele2, ele3; - TCGv_i64 res; - bool feature; - - switch (u * 16 + opcode) { - case 0x10: /* SQRDMLAH (vector) */ - case 0x11: /* SQRDMLSH (vector) */ - if (size != 1 && size != 2) { - unallocated_encoding(s); - return; - } - feature = dc_isar_feature(aa64_rdm, s); - break; - default: - unallocated_encoding(s); - return; - } - if (!feature) { - unallocated_encoding(s); - return; - } - if (!fp_access_check(s)) { - return; - } - - /* Do a single operation on the lowest element in the vector. - * We use the standard Neon helpers and rely on 0 OP 0 == 0 - * with no side effects for all these operations. - * OPTME: special-purpose helpers would avoid doing some - * unnecessary work in the helper for the 16 bit cases. - */ - ele1 = tcg_temp_new_i32(); - ele2 = tcg_temp_new_i32(); - ele3 = tcg_temp_new_i32(); - - read_vec_element_i32(s, ele1, rn, 0, size); - read_vec_element_i32(s, ele2, rm, 0, size); - read_vec_element_i32(s, ele3, rd, 0, size); - - switch (opcode) { - case 0x0: /* SQRDMLAH */ - if (size == 1) { - gen_helper_neon_qrdmlah_s16(ele3, tcg_env, ele1, ele2, ele3); - } else { - gen_helper_neon_qrdmlah_s32(ele3, tcg_env, ele1, ele2, ele3); - } - break; - case 0x1: /* SQRDMLSH */ - if (size == 1) { - gen_helper_neon_qrdmlsh_s16(ele3, tcg_env, ele1, ele2, ele3); - } else { - gen_helper_neon_qrdmlsh_s32(ele3, tcg_env, ele1, ele2, ele3); - } - break; - default: - g_assert_not_reached(); - } - - res = tcg_temp_new_i64(); - tcg_gen_extu_i32_i64(res, ele3); - write_fp_dreg(s, rd, res); -} - static void handle_2misc_64(DisasContext *s, int opcode, bool u, TCGv_i64 tcg_rd, TCGv_i64 tcg_rn, TCGv_i32 tcg_rmode, TCGv_ptr tcg_fpstatus) @@ -10873,194 +11002,6 @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn) } } -/* AdvSIMD three same extra - * 31 30 29 28 24 23 22 21 20 16 15 14 11 10 9 5 4 0 - * +---+---+---+-----------+------+---+------+---+--------+---+----+----+ - * | 0 | Q | U | 0 1 1 1 0 | size | 0 | Rm | 1 | opcode | 1 | Rn | Rd | - * +---+---+---+-----------+------+---+------+---+--------+---+----+----+ - */ -static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) -{ - int rd = extract32(insn, 0, 5); - int rn = extract32(insn, 5, 5); - int opcode = extract32(insn, 11, 4); - int rm = extract32(insn, 16, 5); - int size = extract32(insn, 22, 2); - bool u = extract32(insn, 29, 1); - bool is_q = extract32(insn, 30, 1); - bool feature; - int rot; - - switch (u * 16 + opcode) { - case 0x10: /* SQRDMLAH (vector) */ - case 0x11: /* SQRDMLSH (vector) */ - if (size != 1 && size != 2) { - unallocated_encoding(s); - return; - } - feature = dc_isar_feature(aa64_rdm, s); - break; - case 0x02: /* SDOT (vector) */ - case 0x12: /* UDOT (vector) */ - if (size != MO_32) { - unallocated_encoding(s); - return; - } - feature = dc_isar_feature(aa64_dp, s); - break; - case 0x03: /* USDOT */ - if (size != MO_32) { - unallocated_encoding(s); - return; - } - feature = dc_isar_feature(aa64_i8mm, s); - break; - case 0x04: /* SMMLA */ - case 0x14: /* UMMLA */ - case 0x05: /* USMMLA */ - if (!is_q || size != MO_32) { - unallocated_encoding(s); - return; - } - feature = dc_isar_feature(aa64_i8mm, s); - break; - case 0x18: /* FCMLA, #0 */ - case 0x19: /* FCMLA, #90 */ - case 0x1a: /* FCMLA, #180 */ - case 0x1b: /* FCMLA, #270 */ - case 0x1c: /* FCADD, #90 */ - case 0x1e: /* FCADD, #270 */ - if (size == 0 - || (size == 1 && !dc_isar_feature(aa64_fp16, s)) - || (size == 3 && !is_q)) { - unallocated_encoding(s); - return; - } - feature = dc_isar_feature(aa64_fcma, s); - break; - case 0x1d: /* BFMMLA */ - if (size != MO_16 || !is_q) { - unallocated_encoding(s); - return; - } - feature = dc_isar_feature(aa64_bf16, s); - break; - case 0x1f: - switch (size) { - case 1: /* BFDOT */ - case 3: /* BFMLAL{B,T} */ - feature = dc_isar_feature(aa64_bf16, s); - break; - default: - unallocated_encoding(s); - return; - } - break; - default: - unallocated_encoding(s); - return; - } - if (!feature) { - unallocated_encoding(s); - return; - } - if (!fp_access_check(s)) { - return; - } - - switch (opcode) { - case 0x0: /* SQRDMLAH (vector) */ - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size); - return; - - case 0x1: /* SQRDMLSH (vector) */ - gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size); - return; - - case 0x2: /* SDOT / UDOT */ - gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, - u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b); - return; - - case 0x3: /* USDOT */ - gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, gen_helper_gvec_usdot_b); - return; - - case 0x04: /* SMMLA, UMMLA */ - gen_gvec_op4_ool(s, 1, rd, rn, rm, rd, 0, - u ? gen_helper_gvec_ummla_b - : gen_helper_gvec_smmla_b); - return; - case 0x05: /* USMMLA */ - gen_gvec_op4_ool(s, 1, rd, rn, rm, rd, 0, gen_helper_gvec_usmmla_b); - return; - - case 0x8: /* FCMLA, #0 */ - case 0x9: /* FCMLA, #90 */ - case 0xa: /* FCMLA, #180 */ - case 0xb: /* FCMLA, #270 */ - rot = extract32(opcode, 0, 2); - switch (size) { - case 1: - gen_gvec_op4_fpst(s, is_q, rd, rn, rm, rd, true, rot, - gen_helper_gvec_fcmlah); - break; - case 2: - gen_gvec_op4_fpst(s, is_q, rd, rn, rm, rd, false, rot, - gen_helper_gvec_fcmlas); - break; - case 3: - gen_gvec_op4_fpst(s, is_q, rd, rn, rm, rd, false, rot, - gen_helper_gvec_fcmlad); - break; - default: - g_assert_not_reached(); - } - return; - - case 0xc: /* FCADD, #90 */ - case 0xe: /* FCADD, #270 */ - rot = extract32(opcode, 1, 1); - switch (size) { - case 1: - gen_gvec_op3_fpst(s, is_q, rd, rn, rm, size == 1, rot, - gen_helper_gvec_fcaddh); - break; - case 2: - gen_gvec_op3_fpst(s, is_q, rd, rn, rm, size == 1, rot, - gen_helper_gvec_fcadds); - break; - case 3: - gen_gvec_op3_fpst(s, is_q, rd, rn, rm, size == 1, rot, - gen_helper_gvec_fcaddd); - break; - default: - g_assert_not_reached(); - } - return; - - case 0xd: /* BFMMLA */ - gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, gen_helper_gvec_bfmmla); - return; - case 0xf: - switch (size) { - case 1: /* BFDOT */ - gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, gen_helper_gvec_bfdot); - break; - case 3: /* BFMLAL{B,T} */ - gen_gvec_op4_fpst(s, 1, rd, rn, rm, rd, false, is_q, - gen_helper_gvec_bfmlal); - break; - default: - g_assert_not_reached(); - } - return; - - default: - g_assert_not_reached(); - } -} - static void handle_2misc_widening(DisasContext *s, int opcode, bool is_q, int size, int rn, int rd) { @@ -12035,11 +11976,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) int h = extract32(insn, 11, 1); int rn = extract32(insn, 5, 5); int rd = extract32(insn, 0, 5); - bool is_long = false; - int is_fp = 0; - bool is_fp16 = false; int index; - TCGv_ptr fpst; switch (16 * u + opcode) { case 0x02: /* SMLAL, SMLAL2 */ @@ -12052,66 +11989,10 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) unallocated_encoding(s); return; } - is_long = true; break; case 0x03: /* SQDMLAL, SQDMLAL2 */ case 0x07: /* SQDMLSL, SQDMLSL2 */ case 0x0b: /* SQDMULL, SQDMULL2 */ - is_long = true; - break; - case 0x1d: /* SQRDMLAH */ - case 0x1f: /* SQRDMLSH */ - if (!dc_isar_feature(aa64_rdm, s)) { - unallocated_encoding(s); - return; - } - break; - case 0x0e: /* SDOT */ - case 0x1e: /* UDOT */ - if (is_scalar || size != MO_32 || !dc_isar_feature(aa64_dp, s)) { - unallocated_encoding(s); - return; - } - break; - case 0x0f: - switch (size) { - case 0: /* SUDOT */ - case 2: /* USDOT */ - if (is_scalar || !dc_isar_feature(aa64_i8mm, s)) { - unallocated_encoding(s); - return; - } - size = MO_32; - break; - case 1: /* BFDOT */ - if (is_scalar || !dc_isar_feature(aa64_bf16, s)) { - unallocated_encoding(s); - return; - } - size = MO_32; - break; - case 3: /* BFMLAL{B,T} */ - if (is_scalar || !dc_isar_feature(aa64_bf16, s)) { - unallocated_encoding(s); - return; - } - /* can't set is_fp without other incorrect size checks */ - size = MO_16; - break; - default: - unallocated_encoding(s); - return; - } - break; - case 0x11: /* FCMLA #0 */ - case 0x13: /* FCMLA #90 */ - case 0x15: /* FCMLA #180 */ - case 0x17: /* FCMLA #270 */ - if (is_scalar || !dc_isar_feature(aa64_fcma, s)) { - unallocated_encoding(s); - return; - } - is_fp = 2; break; default: case 0x00: /* FMLAL */ @@ -12122,55 +12003,30 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) case 0x09: /* FMUL */ case 0x0c: /* SQDMULH */ case 0x0d: /* SQRDMULH */ + case 0x0e: /* SDOT */ + case 0x0f: /* SUDOT / BFDOT / USDOT / BFMLAL */ case 0x10: /* MLA */ + case 0x11: /* FCMLA #0 */ + case 0x13: /* FCMLA #90 */ case 0x14: /* MLS */ + case 0x15: /* FCMLA #180 */ + case 0x17: /* FCMLA #270 */ case 0x18: /* FMLAL2 */ case 0x19: /* FMULX */ case 0x1c: /* FMLSL2 */ - unallocated_encoding(s); - return; - } - - switch (is_fp) { - case 1: /* normal fp */ - unallocated_encoding(s); /* in decodetree */ - return; - - case 2: /* complex fp */ - /* Each indexable element is a complex pair. */ - size += 1; - switch (size) { - case MO_32: - if (h && !is_q) { - unallocated_encoding(s); - return; - } - is_fp16 = true; - break; - case MO_64: - break; - default: - unallocated_encoding(s); - return; - } - break; - - default: /* integer */ - switch (size) { - case MO_8: - case MO_64: - unallocated_encoding(s); - return; - } - break; - } - if (is_fp16 && !dc_isar_feature(aa64_fp16, s)) { + case 0x1d: /* SQRDMLAH */ + case 0x1e: /* UDOT */ + case 0x1f: /* SQRDMLSH */ unallocated_encoding(s); return; } /* Given MemOp size, adjust register and indexing. */ switch (size) { + case MO_8: + case MO_64: + unallocated_encoding(s); + return; case MO_16: index = h << 2 | l << 1 | m; break; @@ -12178,14 +12034,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) index = h << 1 | l; rm |= m << 4; break; - case MO_64: - if (l || !is_q) { - unallocated_encoding(s); - return; - } - index = h; - rm |= m << 4; - break; default: g_assert_not_reached(); } @@ -12194,170 +12042,8 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) return; } - if (is_fp) { - fpst = fpstatus_ptr(is_fp16 ? FPST_FPCR_F16 : FPST_FPCR); - } else { - fpst = NULL; - } - - switch (16 * u + opcode) { - case 0x0e: /* SDOT */ - case 0x1e: /* UDOT */ - gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index, - u ? gen_helper_gvec_udot_idx_b - : gen_helper_gvec_sdot_idx_b); - return; - case 0x0f: - switch (extract32(insn, 22, 2)) { - case 0: /* SUDOT */ - gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index, - gen_helper_gvec_sudot_idx_b); - return; - case 1: /* BFDOT */ - gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index, - gen_helper_gvec_bfdot_idx); - return; - case 2: /* USDOT */ - gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index, - gen_helper_gvec_usdot_idx_b); - return; - case 3: /* BFMLAL{B,T} */ - gen_gvec_op4_fpst(s, 1, rd, rn, rm, rd, 0, (index << 1) | is_q, - gen_helper_gvec_bfmlal_idx); - return; - } - g_assert_not_reached(); - case 0x11: /* FCMLA #0 */ - case 0x13: /* FCMLA #90 */ - case 0x15: /* FCMLA #180 */ - case 0x17: /* FCMLA #270 */ - { - int rot = extract32(insn, 13, 2); - int data = (index << 2) | rot; - tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), - vec_full_reg_offset(s, rd), fpst, - is_q ? 16 : 8, vec_full_reg_size(s), data, - size == MO_64 - ? gen_helper_gvec_fcmlas_idx - : gen_helper_gvec_fcmlah_idx); - } - return; - } - if (size == 3) { g_assert_not_reached(); - } else if (!is_long) { - /* 32 bit floating point, or 16 or 32 bit integer. - * For the 16 bit scalar case we use the usual Neon helpers and - * rely on the fact that 0 op 0 == 0 with no side effects. - */ - TCGv_i32 tcg_idx = tcg_temp_new_i32(); - int pass, maxpasses; - - if (is_scalar) { - maxpasses = 1; - } else { - maxpasses = is_q ? 4 : 2; - } - - read_vec_element_i32(s, tcg_idx, rm, index, size); - - if (size == 1 && !is_scalar) { - /* The simplest way to handle the 16x16 indexed ops is to duplicate - * the index into both halves of the 32 bit tcg_idx and then use - * the usual Neon helpers. - */ - tcg_gen_deposit_i32(tcg_idx, tcg_idx, tcg_idx, 16, 16); - } - - for (pass = 0; pass < maxpasses; pass++) { - TCGv_i32 tcg_op = tcg_temp_new_i32(); - TCGv_i32 tcg_res = tcg_temp_new_i32(); - - read_vec_element_i32(s, tcg_op, rn, pass, is_scalar ? size : MO_32); - - switch (16 * u + opcode) { - case 0x10: /* MLA */ - case 0x14: /* MLS */ - { - static NeonGenTwoOpFn * const fns[2][2] = { - { gen_helper_neon_add_u16, gen_helper_neon_sub_u16 }, - { tcg_gen_add_i32, tcg_gen_sub_i32 }, - }; - NeonGenTwoOpFn *genfn; - bool is_sub = opcode == 0x4; - - if (size == 1) { - gen_helper_neon_mul_u16(tcg_res, tcg_op, tcg_idx); - } else { - tcg_gen_mul_i32(tcg_res, tcg_op, tcg_idx); - } - if (opcode == 0x8) { - break; - } - read_vec_element_i32(s, tcg_op, rd, pass, MO_32); - genfn = fns[size - 1][is_sub]; - genfn(tcg_res, tcg_op, tcg_res); - break; - } - case 0x0c: /* SQDMULH */ - if (size == 1) { - gen_helper_neon_qdmulh_s16(tcg_res, tcg_env, - tcg_op, tcg_idx); - } else { - gen_helper_neon_qdmulh_s32(tcg_res, tcg_env, - tcg_op, tcg_idx); - } - break; - case 0x0d: /* SQRDMULH */ - if (size == 1) { - gen_helper_neon_qrdmulh_s16(tcg_res, tcg_env, - tcg_op, tcg_idx); - } else { - gen_helper_neon_qrdmulh_s32(tcg_res, tcg_env, - tcg_op, tcg_idx); - } - break; - case 0x1d: /* SQRDMLAH */ - read_vec_element_i32(s, tcg_res, rd, pass, - is_scalar ? size : MO_32); - if (size == 1) { - gen_helper_neon_qrdmlah_s16(tcg_res, tcg_env, - tcg_op, tcg_idx, tcg_res); - } else { - gen_helper_neon_qrdmlah_s32(tcg_res, tcg_env, - tcg_op, tcg_idx, tcg_res); - } - break; - case 0x1f: /* SQRDMLSH */ - read_vec_element_i32(s, tcg_res, rd, pass, - is_scalar ? size : MO_32); - if (size == 1) { - gen_helper_neon_qrdmlsh_s16(tcg_res, tcg_env, - tcg_op, tcg_idx, tcg_res); - } else { - gen_helper_neon_qrdmlsh_s32(tcg_res, tcg_env, - tcg_op, tcg_idx, tcg_res); - } - break; - default: - case 0x01: /* FMLA */ - case 0x05: /* FMLS */ - case 0x09: /* FMUL */ - case 0x19: /* FMULX */ - g_assert_not_reached(); - } - - if (is_scalar) { - write_fp_sreg(s, rd, tcg_res); - } else { - write_vec_element_i32(s, tcg_res, rd, pass, MO_32); - } - } - - clear_vec_high(s, is_q, rd); } else { /* long ops: 16x16->32 or 32x32->64 */ TCGv_i64 tcg_res[2]; @@ -12527,7 +12213,6 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) */ static const AArch64DecodeTable data_proc_simd[] = { /* pattern , mask , fn */ - { 0x0e008400, 0x9f208400, disas_simd_three_reg_same_extra }, { 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff }, { 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc }, { 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes }, @@ -12538,7 +12223,6 @@ static const AArch64DecodeTable data_proc_simd[] = { { 0x0e000000, 0xbf208c00, disas_simd_tb }, { 0x0e000800, 0xbf208c00, disas_simd_zip_trn }, { 0x2e000000, 0xbf208400, disas_simd_ext }, - { 0x5e008400, 0xdf208400, disas_simd_scalar_three_reg_same_extra }, { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff }, { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc }, { 0x5f000000, 0xdf000400, disas_simd_indexed }, /* scalar indexed */ diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index b05922b..98604d1 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -317,10 +317,12 @@ void HELPER(neon_sqdmulh_idx_h)(void *vd, void *vn, void *vm, intptr_t i, j, opr_sz = simd_oprsz(desc); int idx = simd_data(desc); int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + intptr_t elements = opr_sz / 2; + intptr_t eltspersegment = MIN(16 / 2, elements); - for (i = 0; i < opr_sz / 2; i += 16 / 2) { + for (i = 0; i < elements; i += 16 / 2) { int16_t mm = m[i]; - for (j = 0; j < 16 / 2; ++j) { + for (j = 0; j < eltspersegment; ++j) { d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, false, vq); } } @@ -333,16 +335,54 @@ void HELPER(neon_sqrdmulh_idx_h)(void *vd, void *vn, void *vm, intptr_t i, j, opr_sz = simd_oprsz(desc); int idx = simd_data(desc); int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + intptr_t elements = opr_sz / 2; + intptr_t eltspersegment = MIN(16 / 2, elements); - for (i = 0; i < opr_sz / 2; i += 16 / 2) { + for (i = 0; i < elements; i += 16 / 2) { int16_t mm = m[i]; - for (j = 0; j < 16 / 2; ++j) { + for (j = 0; j < eltspersegment; ++j) { d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, true, vq); } } clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(neon_sqrdmlah_idx_h)(void *vd, void *vn, void *vm, + void *vq, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + intptr_t elements = opr_sz / 2; + intptr_t eltspersegment = MIN(16 / 2, elements); + + for (i = 0; i < elements; i += 16 / 2) { + int16_t mm = m[i]; + for (j = 0; j < eltspersegment; ++j) { + d[i + j] = do_sqrdmlah_h(n[i + j], mm, d[i + j], false, true, vq); + } + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(neon_sqrdmlsh_idx_h)(void *vd, void *vn, void *vm, + void *vq, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + intptr_t elements = opr_sz / 2; + intptr_t eltspersegment = MIN(16 / 2, elements); + + for (i = 0; i < elements; i += 16 / 2) { + int16_t mm = m[i]; + for (j = 0; j < eltspersegment; ++j) { + d[i + j] = do_sqrdmlah_h(n[i + j], mm, d[i + j], true, true, vq); + } + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(sve2_sqrdmlah_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { @@ -512,10 +552,12 @@ void HELPER(neon_sqdmulh_idx_s)(void *vd, void *vn, void *vm, intptr_t i, j, opr_sz = simd_oprsz(desc); int idx = simd_data(desc); int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + intptr_t elements = opr_sz / 4; + intptr_t eltspersegment = MIN(16 / 4, elements); - for (i = 0; i < opr_sz / 4; i += 16 / 4) { + for (i = 0; i < elements; i += 16 / 4) { int32_t mm = m[i]; - for (j = 0; j < 16 / 4; ++j) { + for (j = 0; j < eltspersegment; ++j) { d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, false, vq); } } @@ -528,16 +570,54 @@ void HELPER(neon_sqrdmulh_idx_s)(void *vd, void *vn, void *vm, intptr_t i, j, opr_sz = simd_oprsz(desc); int idx = simd_data(desc); int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + intptr_t elements = opr_sz / 4; + intptr_t eltspersegment = MIN(16 / 4, elements); - for (i = 0; i < opr_sz / 4; i += 16 / 4) { + for (i = 0; i < elements; i += 16 / 4) { int32_t mm = m[i]; - for (j = 0; j < 16 / 4; ++j) { + for (j = 0; j < eltspersegment; ++j) { d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, true, vq); } } clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(neon_sqrdmlah_idx_s)(void *vd, void *vn, void *vm, + void *vq, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + intptr_t elements = opr_sz / 4; + intptr_t eltspersegment = MIN(16 / 4, elements); + + for (i = 0; i < elements; i += 16 / 4) { + int32_t mm = m[i]; + for (j = 0; j < eltspersegment; ++j) { + d[i + j] = do_sqrdmlah_s(n[i + j], mm, d[i + j], false, true, vq); + } + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(neon_sqrdmlsh_idx_s)(void *vd, void *vn, void *vm, + void *vq, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + intptr_t elements = opr_sz / 4; + intptr_t eltspersegment = MIN(16 / 4, elements); + + for (i = 0; i < elements; i += 16 / 4) { + int32_t mm = m[i]; + for (j = 0; j < eltspersegment; ++j) { + d[i + j] = do_sqrdmlah_s(n[i + j], mm, d[i + j], true, true, vq); + } + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(sve2_sqrdmlah_s)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { @@ -907,7 +987,7 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, void *va, intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; intptr_t elements = opr_sz / sizeof(float16); - intptr_t eltspersegment = 16 / sizeof(float16); + intptr_t eltspersegment = MIN(16 / sizeof(float16), elements); intptr_t i, j; /* Shift boolean to the sign bit so we can xor to negate. */ @@ -969,7 +1049,7 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, void *va, intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; intptr_t elements = opr_sz / sizeof(float32); - intptr_t eltspersegment = 16 / sizeof(float32); + intptr_t eltspersegment = MIN(16 / sizeof(float32), elements); intptr_t i, j; /* Shift boolean to the sign bit so we can xor to negate. */ diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c index ce26b8a..50d7042 100644 --- a/target/arm/vfp_helper.c +++ b/target/arm/vfp_helper.c @@ -1091,8 +1091,8 @@ const FloatRoundMode arm_rmode_to_sf_map[] = { uint64_t HELPER(fjcvtzs)(float64 value, void *vstatus) { float_status *status = vstatus; - uint32_t inexact, frac; - uint32_t e_old, e_new; + uint32_t frac, e_old, e_new; + bool inexact; e_old = get_float_exception_flags(status); set_float_exception_flags(0, status); @@ -1100,13 +1100,13 @@ uint64_t HELPER(fjcvtzs)(float64 value, void *vstatus) e_new = get_float_exception_flags(status); set_float_exception_flags(e_old | e_new, status); - if (value == float64_chs(float64_zero)) { - /* While not inexact for IEEE FP, -0.0 is inexact for JavaScript. */ - inexact = 1; - } else { - /* Normal inexact or overflow or NaN */ - inexact = e_new & (float_flag_inexact | float_flag_invalid); - } + /* Normal inexact, denormal with flush-to-zero, or overflow or NaN */ + inexact = e_new & (float_flag_inexact | + float_flag_input_denormal | + float_flag_invalid); + + /* While not inexact for IEEE FP, -0.0 is inexact for JavaScript. */ + inexact |= value == float64_chs(float64_zero); /* Pack the result and the env->ZF representation of Z together. */ return deposit64(frac, 32, 32, inexact); diff --git a/tests/avocado/machine_aarch64_sbsaref.py b/tests/avocado/machine_aarch64_sbsaref.py index 6bb82f2..e920bbf 100644 --- a/tests/avocado/machine_aarch64_sbsaref.py +++ b/tests/avocado/machine_aarch64_sbsaref.py @@ -37,18 +37,18 @@ class Aarch64SbsarefMachine(QemuSystemTest): Used components: - - Trusted Firmware 2.11.0 - - Tianocore EDK2 stable202405 - - Tianocore EDK2-platforms commit 4bbd0ed + - Trusted Firmware v2.11.0 + - Tianocore EDK2 4d4f569924 + - Tianocore EDK2-platforms 3f08401 """ # Secure BootRom (TF-A code) fs0_xz_url = ( "https://artifacts.codelinaro.org/artifactory/linaro-419-sbsa-ref/" - "20240528-140808/edk2/SBSA_FLASH0.fd.xz" + "20240619-148232/edk2/SBSA_FLASH0.fd.xz" ) - fs0_xz_hash = "fa6004900b67172914c908b78557fec4d36a5f784f4c3dd08f49adb75e1892a9" + fs0_xz_hash = "0c954842a590988f526984de22e21ae0ab9cb351a0c99a8a58e928f0c7359cf7" tar_xz_path = self.fetch_asset(fs0_xz_url, asset_hash=fs0_xz_hash, algorithm='sha256') archive.extract(tar_xz_path, self.workdir) @@ -57,9 +57,9 @@ class Aarch64SbsarefMachine(QemuSystemTest): # Non-secure rom (UEFI and EFI variables) fs1_xz_url = ( "https://artifacts.codelinaro.org/artifactory/linaro-419-sbsa-ref/" - "20240528-140808/edk2/SBSA_FLASH1.fd.xz" + "20240619-148232/edk2/SBSA_FLASH1.fd.xz" ) - fs1_xz_hash = "5f3747d4000bc416d9641e33ff4ac60c3cc8cb74ca51b6e932e58531c62eb6f7" + fs1_xz_hash = "c6ec39374c4d79bb9e9cdeeb6db44732d90bb4a334cec92002b3f4b9cac4b5ee" tar_xz_path = self.fetch_asset(fs1_xz_url, asset_hash=fs1_xz_hash, algorithm='sha256') archive.extract(tar_xz_path, self.workdir) @@ -75,8 +75,6 @@ class Aarch64SbsarefMachine(QemuSystemTest): f"if=pflash,file={fs0_path},format=raw", "-drive", f"if=pflash,file={fs1_path},format=raw", - "-smp", - "1", "-machine", "sbsa-ref", ) diff --git a/tests/qtest/stm32l4x5_exti-test.c b/tests/qtest/stm32l4x5_exti-test.c index 7092860..7e39c99 100644 --- a/tests/qtest/stm32l4x5_exti-test.c +++ b/tests/qtest/stm32l4x5_exti-test.c @@ -448,6 +448,9 @@ static void test_masked_interrupt(void) g_assert_cmphex(exti_readl(EXTI_PR1), ==, 0x00000000); /* Check that the interrupt isn't pending in NVIC */ g_assert_false(check_nvic_pending(EXTI1_IRQ)); + + /* Clean EXTI */ + exti_set_irq(1, 0); } static void test_interrupt(void) @@ -498,6 +501,9 @@ static void test_interrupt(void) /* Clean NVIC */ unpend_nvic_irq(EXTI1_IRQ); g_assert_false(check_nvic_pending(EXTI1_IRQ)); + + /* Clean EXTI */ + exti_set_irq(1, 0); } static void test_orred_interrupts(void) @@ -531,6 +537,8 @@ static void test_orred_interrupts(void) unpend_nvic_irq(EXTI5_9_IRQ); g_assert_false(check_nvic_pending(EXTI5_9_IRQ)); + + exti_set_irq(i, 0); } } diff --git a/tests/qtest/stm32l4x5_syscfg-test.c b/tests/qtest/stm32l4x5_syscfg-test.c index 506ca08..258417c 100644 --- a/tests/qtest/stm32l4x5_syscfg-test.c +++ b/tests/qtest/stm32l4x5_syscfg-test.c @@ -221,10 +221,10 @@ static void test_interrupt(void) g_assert_true(get_irq(1)); /* Clean the test */ - syscfg_writel(SYSCFG_EXTICR1, 0x00000000); syscfg_set_irq(0, 0); - syscfg_set_irq(15, 0); + /* irq 15 is high at reset because GPIOA15 is high at reset */ syscfg_set_irq(17, 0); + syscfg_writel(SYSCFG_EXTICR1, 0x00000000); } static void test_irq_pin_multiplexer(void) @@ -237,21 +237,21 @@ static void test_irq_pin_multiplexer(void) syscfg_set_irq(0, 1); - /* Check that irq 0 was set and irq 15 wasn't */ + /* Check that irq 0 was set and irq 2 wasn't */ g_assert_true(get_irq(0)); - g_assert_false(get_irq(15)); + g_assert_false(get_irq(2)); /* Clean the test */ syscfg_set_irq(0, 0); - syscfg_set_irq(15, 1); + syscfg_set_irq(2, 1); - /* Check that irq 15 was set and irq 0 wasn't */ - g_assert_true(get_irq(15)); + /* Check that irq 2 was set and irq 0 wasn't */ + g_assert_true(get_irq(2)); g_assert_false(get_irq(0)); /* Clean the test */ - syscfg_set_irq(15, 0); + syscfg_set_irq(2, 0); } static void test_irq_gpio_multiplexer(void) diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target index 70d728a..4ecbca6 100644 --- a/tests/tcg/aarch64/Makefile.target +++ b/tests/tcg/aarch64/Makefile.target @@ -41,8 +41,9 @@ endif # Pauth Tests ifneq ($(CROSS_CC_HAS_ARMV8_3),) -AARCH64_TESTS += pauth-1 pauth-2 pauth-4 pauth-5 +AARCH64_TESTS += pauth-1 pauth-2 pauth-4 pauth-5 test-2375 pauth-%: CFLAGS += -march=armv8.3-a +test-2375: CFLAGS += -march=armv8.3-a run-pauth-1: QEMU_OPTS += -cpu max run-pauth-2: QEMU_OPTS += -cpu max # Choose a cpu with FEAT_Pauth but without FEAT_FPAC for pauth-[45]. diff --git a/tests/tcg/aarch64/test-2375.c b/tests/tcg/aarch64/test-2375.c new file mode 100644 index 0000000..84c7e7d --- /dev/null +++ b/tests/tcg/aarch64/test-2375.c @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* Copyright (c) 2024 Linaro Ltd */ +/* See https://gitlab.com/qemu-project/qemu/-/issues/2375 */ + +#include <assert.h> + +int main(void) +{ + int r, z; + + asm("msr fpcr, %2\n\t" + "fjcvtzs %w0, %d3\n\t" + "cset %1, eq" + : "=r"(r), "=r"(z) + : "r"(0x01000000L), /* FZ = 1 */ + "w"(0xfcff00L)); /* denormal */ + + assert(r == 0); + assert(z == 0); + return 0; +} |