path: root/target/ppc/fpu_helper.c
Age | Commit message | Author | Files | Lines
2024-07-26target/ppc: Move VSX fp compare insns to decodetree.Chinmay Rath1-8/+8
Move the following instructions to the decodetree specification: xvcmp{eq, gt, ge, ne}{s, d}p : XX3-form. The changes were verified by checking that the TCG ops generated for those instructions, captured with the '-d in_asm,op' flag, remain the same. Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-07-26target/ppc: Move VSX arithmetic and max/min insns to decodetree.Chinmay Rath1-22/+22
Move the following instructions to the decodetree specification: x{s, v}{add, sub, mul, div}{s, d}p : XX3-form; xs{max, min}dp, xv{max, min}{s, d}p : XX3-form. The changes were verified by checking that the TCG ops generated by those instructions, captured with the '-d in_asm,op' flag, remain the same. Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24target/ppc: Move floating-point arithmetic instructions to decodetree.Chinmay Rath1-19/+19
This patch moves the below instructions to decodetree specification : f{add, sub, mul, div, re, rsqrte, madd, msub, nmadd, nmsub}[s][.] : A-form ft{div, sqrt} : X-form With this patch, all the floating-point arithmetic instructions have been moved to decodetree. The changes were verified by validating that the tcg ops generated by those instructions remain the same, which were captured with the '-d in_asm,op' flag. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2024-05-24target/ppc: Merge various fpu helpersChinmay Rath1-159/+62
This patch merges the definitions of the following set of fpu helper methods, which are similar, using macros : 1. f{add, sub, mul, div}(s) 2. fre(s) 3. frsqrte(s) Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
2023-11-21target/ppc: Fix bugs in VSX_CVT_FP_TO_INT and VSX_CVT_FP_TO_INT2 macrosJohn Platts1-4/+8
The patch below fixes a bug in the VSX_CVT_FP_TO_INT and VSX_CVT_FP_TO_INT2 macros in target/ppc/fpu_helper.c where a non-NaN floating point value from the source vector is incorrectly converted to 0, 0x80000000, or 0x8000000000000000 instead of the expected value if a preceding source floating point value from the same source vector was a NaN. The bug in the VSX_CVT_FP_TO_INT and VSX_CVT_FP_TO_INT2 macros in target/ppc/fpu_helper.c was introduced with commit c3f24257e3c0. This patch also adds a new vsx_f2i_nan test in tests/tcg/ppc64 that checks that the VSX xvcvspsxws, xvcvspuxws, xvcvspsxds, xvcvspuxds, xvcvdpsxws, xvcvdpuxws, xvcvdpsxds, and xvcvdpuxds instructions correctly convert non-NaN floating point values to integer values if the source vector contains NaN floating point values. Fixes: c3f24257e3c0 ("target/ppc: Clear fpstatus flags on helpers missing it") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1941 Signed-off-by: John Platts <john_platts@hotmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
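The failure mode described in this commit can be modelled outside QEMU. What follows is a minimal sketch under stated assumptions, not the actual macro: `FpStatus`, `FLAG_INVALID`, `to_int32_rtz`, `cvt_sticky` and `cvt_per_element` are hypothetical names, and the "invalid" result is simplified to a single constant. It shows how a status flag that stays set after a NaN element can corrupt later elements, and how checking a fresh flag per element while accumulating the flags separately avoids that.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

enum { FLAG_INVALID = 1 };               /* hypothetical stand-in for an invalid-operation flag */
typedef struct { int flags; } FpStatus;  /* hypothetical stand-in for a float status block */

/* Convert with truncation; raise FLAG_INVALID for NaN or out-of-range input. */
static int32_t to_int32_rtz(float f, FpStatus *st)
{
    if (f != f || f >= 2147483648.0f || f < -2147483648.0f) {
        st->flags |= FLAG_INVALID;
        return 0;                        /* placeholder, replaced by the caller */
    }
    return (int32_t)f;
}

/* Buggy pattern: one status for the whole vector, so the flag is sticky. */
static void cvt_sticky(const float src[4], int32_t dst[4])
{
    FpStatus st = { 0 };
    for (int i = 0; i < 4; i++) {
        dst[i] = to_int32_rtz(src[i], &st);
        if (st.flags & FLAG_INVALID) {   /* also true for elements after a NaN */
            dst[i] = INT32_MIN;          /* simplified invalid result */
        }
    }
}

/* Fixed pattern: fresh flags per element, accumulated for the final report. */
static int cvt_per_element(const float src[4], int32_t dst[4])
{
    int all_flags = 0;
    for (int i = 0; i < 4; i++) {
        FpStatus st = { 0 };
        dst[i] = to_int32_rtz(src[i], &st);
        if (st.flags & FLAG_INVALID) {
            dst[i] = INT32_MIN;
        }
        all_flags |= st.flags;
    }
    return all_flags;
}
```

With a source vector whose first element is a NaN, `cvt_sticky` overwrites every later element, while `cvt_per_element` converts them normally and still reports the invalid flag once for the whole vector.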
2023-05-28target/ppc: Merge COMPUTE_CLASS and COMPUTE_FPRFRichard Henderson1-56/+22
Instead of computing an artificial "class" bitmask then converting that to the fprf value, compute the final value from the start. Reorder the tests to check the most likely cases first. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230523202507.688859-1-richard.henderson@linaro.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
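As an illustration of computing the final value directly, here is a minimal sketch that classifies a double from its bit pattern and returns a 5-bit FPRF code in one step, testing the normal case first. The function name `fprf_of` is hypothetical, the codes follow the Power ISA FPRF table as I understand it, and returning 0 for a signalling NaN is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the 5-bit FPRF code (C, FL, FG, FE, FU) for a double. */
static unsigned fprf_of(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);

    int neg = (int)(bits >> 63);
    uint64_t exp = (bits >> 52) & 0x7ff;
    uint64_t frac = bits & 0x000fffffffffffffULL;

    if (exp != 0 && exp != 0x7ff) {          /* normal: the most likely case */
        return neg ? 0x08 : 0x04;
    }
    if (exp == 0) {
        if (frac == 0) {                     /* zero */
            return neg ? 0x12 : 0x02;
        }
        return neg ? 0x18 : 0x14;            /* denormal */
    }
    if (frac == 0) {                         /* infinity */
        return neg ? 0x09 : 0x05;
    }
    /* NaN: quiet if the top fraction bit is set; 0 for SNaN is an assumption. */
    return (frac >> 51) ? 0x11 : 0x00;
}
```

The point of the design is that no intermediate "class" bitmask exists: each branch returns the final encoding, and the ordering puts the cheapest, most common test first.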
2022-10-28target/ppc: Moved XSTSTDC[QDS]P to decodetreeLucas Mateus Castro (alqotel)1-80/+34
Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of the decoding out of the helper: previously DCMX, XB and BF were calculated in the helper with the help of cpu_env; now that part is done in decodetree with the rest.

xvtstdcsp:
rept  loop   master       patch
8     12500  1,85393600   1,94683600 (+5.0%)
25    4000   1,78779800   1,92479000 (+7.7%)
100   1000   2,12775000   2,28895500 (+7.6%)
500   200    2,99655300   3,23102900 (+7.8%)
2500  40     6,89082200   7,44827500 (+8.1%)
8000  12     17,50585500  18,95152100 (+8.3%)

xvtstdcdp:
rept  loop   master       patch
8     12500  1,39043100   1,33539800 (-4.0%)
25    4000   1,35731800   1,37347800 (+1.2%)
100   1000   1,51514800   1,56053000 (+3.0%)
500   200    2,21014400   2,47906000 (+12.2%)
2500  40     5,39488200   6,68766700 (+24.0%)
8000  12     13,98623900  18,17661900 (+30.0%)

xvtstdcdp:
rept  loop   master       patch
8     12500  1,35123800   1,34455800 (-0.5%)
25    4000   1,36441200   1,36759600 (+0.2%)
100   1000   1,49763500   1,54138400 (+2.9%)
500   200    2,19020200   2,46196400 (+12.4%)
2500  40     5,39265700   6,68147900 (+23.9%)
8000  12     14,04163600  18,19669600 (+29.6%)

Because some values are now decoded outside the helper and passed to it as arguments, the helper takes more arguments and more TCG ops are needed to load them. I suspect that explains the slow-down in the tests with a high REPT but low LOOP.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-12-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-10-28target/ppc: Moved XVTSTDC[DS]P to decodetreeLucas Mateus Castro (alqotel)1-2/+37
Moved XVTSTDCSP and XVTSTDCDP to decodetree and restructured the helper to be simpler and do all decoding in the decodetree (so XB, XT and DCMX are all calculated outside the helper).

Note: the tests for this patch are slightly different. Each measurement is the sum over these instructions with all possible immediates, and the instructions are repeated 10 times.

xvtstdcsp:
rept  loop   master       patch
8     12500  2,76402100   2,70699100 (-2.1%)
25    4000   2,64867100   2,67884100 (+1.1%)
100   1000   2,73806300   2,78701000 (+1.8%)
500   200    3,44666500   3,61027600 (+4.7%)
2500  40     5,85790200   6,47475500 (+10.5%)
8000  12     15,22102100  17,46062900 (+14.7%)

xvtstdcdp:
rept  loop   master       patch
8     12500  2,11818000   1,61065300 (-24.0%)
25    4000   2,04573400   1,60132200 (-21.7%)
100   1000   2,13834100   1,69988100 (-20.5%)
500   200    2,73977000   2,48631700 (-9.3%)
2500  40     5,05067000   5,25914100 (+4.1%)
8000  12     14,60507800  15,93704900 (+9.1%)

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20221019125040.48028-11-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-09-20target/ppc: Clear fpstatus flags on helpers missing itVíctor Colombo1-11/+26
In ppc emulation, exception flags are not cleared at the end of an instruction. Instead, the next instruction is responsible for clearing them before its emulation. However, some helpers are not doing so, which causes previously set exception flags to be reused and leads to incorrect values being set in FPSCR. Fix this by clearing fp_status before doing the instruction's 'real' work in the following helpers, which were missing this behavior:
- VSX_CVT_INT_TO_FP_VECTOR
- VSX_CVT_FP_TO_FP
- VSX_CVT_FP_TO_INT_VECTOR
- VSX_CVT_FP_TO_INT2
- VSX_CVT_FP_TO_INT
- VSX_CVT_FP_TO_FP_HP
- VSX_CVT_FP_TO_FP_VECTOR
- VSX_CMP
- VSX_ROUND
- xscvqpdp
- xscvdpsp[n]
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20220906125523.38765-9-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-09-20target/ppc: Zero second doubleword for VSX madd instructionsVíctor Colombo1-1/+1
In 205eb5a89e we updated most VSX instructions to zero the second doubleword, as is requested by PowerISA since v3.1. However, VSX_MADD helper was left behind unchanged, while it is also affected and should be fixed as well. This patch applies the fix for MADD instructions. Fixes: 205eb5a89e ("target/ppc: Change VSX instructions behavior to fill with zeros") Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20220906125523.38765-6-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-09-20target/ppc: Merge fsqrt and fsqrts helpersVíctor Colombo1-22/+13
These two helpers are almost identical, differing only in the softfloat operation they call. Merge them into one using a macro. Also, take this opportunity to capitalize the helper name as we moved the instruction to decodetree in a previous patch. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220905123746.54659-4-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-08-31target/ppc: Bugfix FP when OE/UE are setLucas Mateus Castro (alqotel)1-2/+0
When an overflow exception occurs and OE is set, the intermediate result should be adjusted (by subtracting from the exponent) to avoid rounding to inf. The same applies to an underflow exception and UE (but adding to the exponent). To do this, set fp_status.rebias_overflow when OE is set and fp_status.rebias_underflow when UE is set, as the FPU will recalculate on an overflow/underflow if the corresponding rebias* field is set. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220805141522.412864-3-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-06-20target/ppc: fix unreachable code in fpu_helper.cDaniel Henrique Barboza1-1/+1
Commit c29018cc7395 added an env->fpscr OR operation using a ternary that checks if 'error' is not zero:

    env->fpscr |= error ? FP_FEX : 0;

However, in the current body of do_fpscr_check_status(), 'error' is guaranteed to be non-zero at that point. The result is that Coverity is less than pleased:

    Control flow issues (DEADCODE) Execution cannot reach the expression "0ULL" inside this statement: "env->fpscr |= (error ? 1073...".

Remove the ternary and always make env->fpscr |= FP_FEX.

Cc: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Cc: Richard Henderson <richard.henderson@linaro.org> Fixes: Coverity CID 1489442 Fixes: c29018cc7395 ("target/ppc: Implemented xvf*ger*") Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Message-Id: <20220602191048.137511-1-danielhb413@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implemented [pm]xvbf16ger2*Lucas Mateus Castro (alqotel)1-0/+40
Implement the following PowerISA v3.1 instructions:
xvbf16ger2: VSX Vector bfloat16 GER (rank-2 update)
xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate
xvbf16ger2np: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate
xvbf16ger2pn: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate
xvbf16ger2pp: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate
pmxvbf16ger2: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update)
pmxvbf16ger2nn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate
pmxvbf16ger2np: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate
pmxvbf16ger2pn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate
pmxvbf16ger2pp: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-8-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implemented xvf16ger*Lucas Mateus Castro (alqotel)1-0/+95
Implement the following PowerISA v3.1 instructions:
xvf16ger2: VSX Vector 16-bit Floating-Point GER (rank-2 update)
xvf16ger2nn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate
xvf16ger2np: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate
xvf16ger2pn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate
xvf16ger2pp: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-6-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Implemented xvf*ger*Lucas Mateus Castro (alqotel)1-2/+192
Implement the following PowerISA v3.1 instructions:
xvf32ger: VSX Vector 32-bit Floating-Point GER (rank-1 update)
xvf32gernn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate
xvf32gernp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate
xvf32gerpn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate
xvf32gerpp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate
xvf64ger: VSX Vector 64-bit Floating-Point GER (rank-1 update)
xvf64gernn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate
xvf64gernp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate
xvf64gerpn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate
xvf64gerpp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-5-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: declare xvxsigsp helper with call flagsMatheus Ferst1-1/+1
Move xvxsigsp to decodetree, declare helper_xvxsigsp with TCG_CALL_NO_RWG, and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-8-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: declare xscvspdpn helper with call flagsMatheus Ferst1-1/+1
Move xscvspdpn to decodetree, declare helper_xscvspdpn with TCG_CALL_NO_RWG_SE and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-7-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Use TCG_CALL_NO_RWG_SE in fsel helperMatheus Ferst1-8/+7
fsel doesn't change FPSCR and CR1 is handled by gen_set_cr1_from_fpscr, so helper_fsel doesn't need the env argument and can be declared with TCG_CALL_NO_RWG_SE. We also take this opportunity to move the insn to decodetree. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-6-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Rename sfprf to sfifprf where it's also used as set fi flagVíctor Colombo1-56/+56
The FI bit fix used the sfprf flag as a flag for the set_fi parameter in do_float_check_status where applicable. This patch renames that flag to sfifprf to reflect this dual usage. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Message-Id: <20220517161522.36132-4-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Fix FPSCR.FI changing in float_overflow_excp()Víctor Colombo1-6/+7
This patch fixes another not-so-clear situation in Power ISA regarding the inexact bits in FPSCR. The ISA states that:

"""
When Overflow Exception is disabled (OE=0) and an Overflow Exception occurs, the following actions are taken:
...
2. Inexact Exception is set
XX <- 1
...
FI is set to 1
...
"""

However, when tested on Power 9 hardware, some instructions that trigger an OX don't set the FI bit:

xvcvdpsp(0x4050533fcdb7b95ff8d561c40bf90996) = FI: CLEARED -> CLEARED
xvnmsubmsp(0xf3c0c1fc8f3230, 0xbeaab9c5) = FI: CLEARED -> CLEARED
(just a few examples. Other instructions are also affected)

The root cause for this seems to be that only instructions that list the bit FI in the "Special Registers Altered" should modify it.

QEMU is, today, not working like the hardware:

xvcvdpsp(0x4050533fcdb7b95ff8d561c40bf90996) = FI: CLEARED -> SET
xvnmsubmsp(0xf3c0c1fc8f3230, 0xbeaab9c5) = FI: CLEARED -> SET
(all tests assume FI is cleared beforehand)

Fix this by making float_overflow_excp() return float_flag_inexact if it should update the inexact flags.

Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Message-Id: <20220517161522.36132-3-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-05-26target/ppc: Fix FPSCR.FI bit being cleared when it shouldn'tVíctor Colombo1-58/+64
According to Power ISA, the FI bit in FPSCR is non-sticky. This means that if an instruction is said to modify the FI bit, then it should be set or cleared depending on the result of the instruction. Otherwise, it should be kept as it was before.

However, the following inconsistency was found when comparing results from the hardware (tested on both a Power 9 processor and in Power 10 Mambo):

(FI bit is set before the execution of the instruction)
Hardware: xscmpeqdp(0xff..ff, 0xff..ff) = FI: SET -> SET
QEMU: xscmpeqdp(0xff..ff, 0xff..ff) = FI: SET -> CLEARED

As the FI bit is non-sticky, and xscmpeqdp does not list it as a field that is changed by the instruction, it should not be changed after its execution. This is happening to multiple instructions in the vsx implementations. If the ISA does not list the FI bit as altered for a particular instruction, then it should be kept as it was before the instruction. QEMU is not following this behavior. Affected instructions include:
- xv* (all vsx-vector instructions);
- xscmp*, xsmax*, xsmin*;
- xstdivdp and similar;
(to identify the affected instructions, just search in the ISA for the instructions that do not list FI in "Special Registers Altered")

Most instructions use the function do_float_check_status() to commit changes in the inexact flag. So the fix is to add a parameter to it that will control if the bit FI should be changed or not. All users of do_float_check_status() are then modified to provide this argument, controlling if that specific instruction changes bit FI or not.

Some macro helpers are responsible for both instructions that change and instructions that aren't supposed to change FI. This seems to always overlap with the sfprf flag. So, reuse this flag for this purpose when applicable.
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517161522.36132-2-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
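The non-sticky FI rule described in this commit can be sketched as a small pure function. This is an illustration only, not QEMU's do_float_check_status(): the function name `commit_inexact` and the mask value chosen for `FP_FI` are assumptions made for this sketch. The `set_fi` parameter plays the role the commit describes: it says whether the instruction lists FI among the registers it alters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FP_FI (1u << 17)   /* hypothetical mask value chosen for this sketch */

/*
 * Commit the inexact status of one instruction into an FPSCR image.
 * If the instruction does not alter FI, the old FI value is preserved;
 * otherwise FI is set or cleared from this instruction's result alone.
 */
static uint32_t commit_inexact(uint32_t fpscr, bool inexact, bool set_fi)
{
    if (!set_fi) {
        return fpscr;                    /* FI is left exactly as it was */
    }
    return inexact ? (fpscr | FP_FI) : (fpscr & ~FP_FI);
}
```

For the xscmpeqdp case quoted above, the call would be `commit_inexact(fpscr, false, false)`, which leaves a previously set FI bit set, matching the hardware result.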
2022-05-05target/ppc: Remove fpscr_* macros from cpu.hVíctor Colombo1-14/+14
fpscr_* defined macros are hiding the usage of *env behind them. Substitute the usage of these macros with `env->fpscr & FP_*` to make the code cleaner. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Message-Id: <20220504210541.115256-2-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-20target/ppc: implement xscvqp[su]qzMatheus Ferst1-0/+21
Implement the following PowerISA v3.1 instructions: xscvqpsqz: VSX Scalar Convert with round to zero Quad-Precision to Signed Quadword xscvqpuqz: VSX Scalar Convert with round to zero Quad-Precision to Unsigned Quadword Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220330175932.6995-9-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-04-20target/ppc: implement xscv[su]qqpMatheus Ferst1-0/+12
Implement the following PowerISA v3.1 instructions: xscvsqqp: VSX Scalar Convert with round Signed Quadword to Quad-Precision xscvuqqp: VSX Scalar Convert with round Unsigned Quadword to Quad-Precision format Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220330175932.6995-8-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
2022-03-20target/ppc: Replicate Double->Single-Precision resultLucas Coutinho1-4/+44
Power ISA v3.1 formalizes the previously undefined result in words 1 and 3 to be a copy of the result in words 0 and 2. This affects: xvcvsxdsp, xvcvuxdsp, xvcvdpsp. And the previously undefined result in word 1 to be a copy of the result in word 0. This affects: xscvdpsp. Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br> Message-Id: <20220316200427.3410437-1-lucas.coutinho@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-20target/ppc: Replicate double->int32 result for some vector insnsRichard Henderson1-6/+39
Power ISA v3.1 formalizes the previously undefined result in words 1 and 3 to be a copy of the result in words 0 and 2. This affects: xscvdpsxws, xscvdpuxws, xvcvdpsxws, xvcvdpuxws. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/852 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> [ clg: checkpatch fixes ] Message-Id: <20220315053934.377519-1-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
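The word replication described here can be pictured with a minimal sketch. The `Vsr` type and the `pack_replicated` function are hypothetical names for illustration, with `w[0]` standing for word 0 of the target register; the conversion itself is left out and only the placement of the two 32-bit results is shown.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t w[4]; } Vsr;   /* hypothetical: w[0] is word 0 */

/*
 * Place two 32-bit results (one per source doubleword) into a 128-bit
 * register image, copying word 0 into word 1 and word 2 into word 3.
 */
static Vsr pack_replicated(uint32_t r0, uint32_t r1)
{
    Vsr t;
    t.w[0] = r0;
    t.w[1] = r0;   /* copy of word 0 */
    t.w[2] = r1;
    t.w[3] = r1;   /* copy of word 2 */
    return t;
}
```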
2022-03-05target/ppc: Add missing helper_reset_fpstatus to helper_XVCVSPBF16Víctor Colombo1-0/+2
Fixes: 3909ff1fac ("target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions") Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220304175156.2012315-8-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-05target/ppc: Add missing helper_reset_fpstatus to VSX_MAX_MINCVíctor Colombo1-0/+2
Fixes: da499405aa ("target/ppc: Refactor VSX_MAX_MINC helper") Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220304175156.2012315-7-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-05target/ppc: change xs[n]madd[am]sp to use float64r32_muladdMatheus Ferst1-38/+20
Change VSX Scalar Multiply-Add/Subtract Type-A/M Single Precision helpers to use float64r32_muladd. This method should correctly handle all rounding modes, so the workaround for float_round_nearest_even can be dropped. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220304165417.1981159-3-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructionsVíctor Colombo1-0/+18
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220225210936.1749575-47-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xs{max,min}cqpVíctor Colombo1-0/+2
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-46-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Refactor VSX_MAX_MINC helperVíctor Colombo1-24/+17
Refactor xs{max,min}cdp VSX_MAX_MINC helper to prepare for xs{max,min}cqp implementation. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-45-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3Víctor Colombo1-4/+4
Also, fixes these instructions not being capitalized. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-44-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Move xscmp{eq,ge,gt}dp to decodetreeVíctor Colombo1-3/+3
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-43-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Implement xscmp{eq,ge,gt}qpVíctor Colombo1-0/+3
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-42-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Refactor VSX_SCALAR_CMP_DPVíctor Colombo1-35/+29
Refactor VSX_SCALAR_CMP_DP, changing its name to VSX_SCALAR_CMP and prepare the helper to be used for quadword comparisons. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220225210936.1749575-41-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: Remove xscmpnedp instructionVíctor Colombo1-1/+0
xscmpnedp was added in ISA v3.0 but removed in v3.0B. This patch removes this instruction as it was not in the final version of v3.0. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Acked-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-40-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o]Matheus Ferst1-0/+42
Implement the following PowerISA v3.0 instructions: xsmaddqp[o]: VSX Scalar Multiply-Add Quad-Precision [using round to Odd] xsmsubqp[o]: VSX Scalar Multiply-Subtract Quad-Precision [using round to Odd] xsnmaddqp[o]: VSX Scalar Negative Multiply-Add Quad-Precision [using round to Odd] xsnmsubqp[o]: VSX Scalar Negative Multiply-Subtract Quad-Precision [using round to Odd] Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-38-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetreeMatheus Ferst1-11/+12
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-37-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-03-02target/ppc: move xxperm/xxpermr to decodetreeMatheus Ferst1-21/+0
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20220225210936.1749575-31-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-02-09target/ppc: Change VSX instructions behavior to fill with zerosVíctor Colombo1-13/+13
ISA v3.1 changed the behavior of some VSX instructions by changing what the other words/doubleword in the result should contain when the result is only one word/doubleword. For example, xsmaxdp operates on doubleword 0 and also saves the result in doubleword 0. Before, the second doubleword of the result was undefined according to the ISA, but now it is stated that it should be zeroed. Even though the result was undefined before, hardware implementing these instructions already filled these fields with 0s. Changing every ISA version in QEMU to this behavior makes the results match what happens in hardware. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220204181944.65063-1-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04target/ppc: do not silence snan in xscvspdpnMatheus Ferst1-4/+1
The non-signalling versions of the VSX scalar convert to shorter/longer precision insns don't silence SNaNs in the hardware. To better match this behavior, use the non-arithmetic conversion of helper_todouble instead of float32_to_float64. A test is added to prevent future regressions. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211228120310.1957990-1-matheus.ferst@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17target/ppc: move xscvqpdp to decodetreeMatheus Ferst1-7/+3
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211213120958.24443-5-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17target/ppc: Fix xs{max, min}[cj]dp to use VSX registersVictor Colombo1-2/+2
The PPC instructions xsmaxcdp, xsmincdp, xsmaxjdp, and xsminjdp are using vector registers when they should be using VSX ones. This happens because the instructions are using GEN_VSX_HELPER_R3, which adds 32 to the register numbers, effectively making them vector registers. This patch fixes it by changing these instructions to use GEN_VSX_HELPER_X3. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Victor Colombo <victor.colombo@eldorado.org.br> Message-Id: <20211213120958.24443-2-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17target/ppc: Use helper_todouble/tosingle in helper_xststdcspRichard Henderson1-11/+10
When computing the predicate "is this value currently formatted for single precision", we do not want to round the value according to the current rounding mode, nor perform a floating-point equality. We want to see if the N bits that make up single-precision are the only ones set within the register, and then a bitwise equality. Fixes a bug in which a single-precision NaN is considered !SP, because float64_eq(nan, nan) is always false. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-35-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
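The difference between a floating-point equality and a bitwise equality in this predicate can be shown with a small sketch. It is not the QEMU code: the function names are hypothetical, and plain C casts stand in for helper_tosingle/helper_todouble, which is a loud simplification because C casts are arithmetic conversions (they round, and may quiet a signalling NaN). The sketch also assumes the platform preserves a quiet NaN's bit pattern across the double to float to double round trip, as common IEEE 754 hardware does.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <string.h>

/* Value comparison after a round trip: always false for NaN inputs. */
static bool is_sp_by_value(double d)
{
    double back = (double)(float)d;
    return back == d;
}

/* Bitwise comparison after a round trip: a NaN compares equal to itself. */
static bool is_sp_by_bits(double d)
{
    double back = (double)(float)d;
    return memcmp(&back, &d, sizeof d) == 0;
}
```

For ordinary values the two predicates agree (1.5 is single-precision formatted, 0.1 is not); they diverge only for NaN, which is the case the commit message calls out.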
2021-12-17target/ppc: Update fres to new flags and float64r32Richard Henderson1-10/+10
There is no double-rounding bug here, because the result is merely an estimate to within 1 part in 256, but perform the operation with float64r32_div for consistency. Use float_flag_invalid_snan instead of recomputing the snan-ness of the operand. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-34-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17target/ppc: Add helper for frsqrtesRichard Henderson1-0/+19
There is no double-rounding bug here, because the result is merely an estimate to within 1 part in 32, but perform the operation with float64r32_div for consistency. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-33-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17target/ppc: Add helper for fmulsRichard Henderson1-0/+12
Use float64r32_mul. Fixes a double-rounding issue with performing the computation in float64 and then rounding afterward. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-32-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2021-12-17target/ppc: Add helpers for fadds, fsubs, fdivsRichard Henderson1-0/+40
Use float64r32_{add,sub,div}. Fixes a double-rounding issue with performing the computation in float64 and then rounding afterward. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-31-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
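The double-rounding issue mentioned in this and the fmuls commit can be reproduced in plain C. The sketch below is an illustration, not how QEMU's float64r32_* operations are implemented: both function names are hypothetical, and the second function swaps in a different, well-known technique (an error-free TwoSum followed by a round-to-odd adjustment) purely to show what a single rounding yields. It assumes IEEE 754 doubles, round-to-nearest-even, double expressions evaluated in double precision, no fast-math style optimizations, and no overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Naive: round the exact sum to double, then round again to float. */
static float add_two_roundings(double a, double b)
{
    return (float)(a + b);
}

/* Sketch of a single effective rounding to float via round-to-odd. */
static float add_one_rounding(double a, double b)
{
    double s = a + b;                        /* sum rounded to nearest double */
    double bv = s - a;
    double e = (a - (s - bv)) + (b - bv);    /* TwoSum: exact rounding error */

    if (e != 0.0) {                          /* the double sum was inexact */
        uint64_t bits;
        memcpy(&bits, &s, sizeof bits);
        if ((bits & 1) == 0) {
            /* Step to the neighbouring double on the side of the exact sum;
             * that neighbour has an odd last bit (round to odd). */
            if ((e > 0.0) == (s > 0.0)) {
                bits++;                      /* exact sum is larger in magnitude */
            } else {
                bits--;                      /* exact sum is smaller in magnitude */
            }
            memcpy(&s, &bits, sizeof s);
        }
    }
    return (float)s;                         /* final rounding to float */
}
```

With a = 1 + 2^-24 (exactly halfway between two adjacent floats) and b = 2^-80, the exact sum lies just above the halfway point. The naive version first rounds the sum back to 1 + 2^-24, then resolves the tie to even and returns 1.0f; the single-rounding version returns the next float up, 1 + 2^-23.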