From 13f44099bcc64ddb50a6dbd462bf79b258dfd02c Mon Sep 17 00:00:00 2001 From: Tamar Christina Date: Fri, 8 Jul 2022 07:37:20 +0100 Subject: middle-end: Use subregs to expand COMPLEX_EXPR to set the lowpart. When lowering COMPLEX_EXPR we currently emit two VEC_EXTRACTs. One for the lowpart and one for the highpart. The problem with this is that in RTL the lvalue of the RTX is the only thing tying the two instructions together. This means that e.g. combine is unable to try to combine the two instructions for setting the lowpart and highpart. For ISAs that have bit extract instructions we can eliminate one of the extracts if, and only if we're setting the entire complex number. This change changes the expand code when we're setting the entire complex number to generate a subreg for the lowpart instead of a vec_extract. This allows us to optimize sequences such as: _Complex int f(int a, int b) { _Complex int t = a + b * 1i; return t; } from: f: bfi x2, x0, 0, 32 bfi x2, x1, 32, 32 mov x0, x2 ret into: f: bfi x0, x1, 32, 32 ret I have also confirmed the codegen for x86_64 did not change. gcc/ChangeLog: * expmed.cc (store_bit_field_1): Add parameter that indicates if value is still undefined and if so emit a subreg move instead. (store_integral_bit_field): Likewise. (store_bit_field): Likewise. * expr.h (write_complex_part): Likewise. * expmed.h (store_bit_field): Add new parameter. * builtins.cc (expand_ifn_atomic_compare_exchange_into_call): Use new parameter. (expand_ifn_atomic_compare_exchange): Likewise. * calls.cc (store_unaligned_arguments_into_pseudos): Likewise. * emit-rtl.cc (validate_subreg): Likewise. * expr.cc (emit_group_store): Likewise. (copy_blkmode_from_reg): Likewise. (copy_blkmode_to_reg): Likewise. (clear_storage_hints): Likewise. (write_complex_part): Likewise. (emit_move_complex_parts): Likewise. (expand_assignment): Likewise. (store_expr): Likewise. (store_field): Likewise. (expand_expr_real_2): Likewise. * ifcvt.cc (noce_emit_move_insn): Likewise. * internal-fn.cc (expand_arith_set_overflow): Likewise. (expand_arith_overflow_result_store): Likewise. (expand_addsub_overflow): Likewise. (expand_neg_overflow): Likewise. (expand_mul_overflow): Likewise. (expand_arith_overflow): Likewise. gcc/testsuite/ChangeLog: * g++.target/aarch64/complex-init.C: New test. --- gcc/ifcvt.cc | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) (limited to 'gcc/ifcvt.cc') diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc index 25aff38..2e8ab39 100644 --- a/gcc/ifcvt.cc +++ b/gcc/ifcvt.cc @@ -999,7 +999,8 @@ noce_emit_move_insn (rtx x, rtx y) } gcc_assert (start < (MEM_P (op) ? BITS_PER_UNIT : BITS_PER_WORD)); - store_bit_field (op, size, start, 0, 0, GET_MODE (x), y, false); + store_bit_field (op, size, start, 0, 0, GET_MODE (x), y, false, + false); return; } @@ -1056,7 +1057,7 @@ noce_emit_move_insn (rtx x, rtx y) outmode = GET_MODE (outer); bitpos = SUBREG_BYTE (outer) * BITS_PER_UNIT; store_bit_field (inner, GET_MODE_BITSIZE (outmode), bitpos, - 0, 0, outmode, y, false); + 0, 0, outmode, y, false, false); } /* Return the CC reg if it is used in COND. */ -- cgit v1.1