author | Wilco Dijkstra <wdijkstr@arm.com> | 2018-05-30 10:31:21 +0000 |
---|---|---|
committer | Wilco Dijkstra <wilco@gcc.gnu.org> | 2018-05-30 10:31:21 +0000 |
commit | 2eb2847ec54a3262f303f47697c5e5cbe3cc089d (patch) | |
tree | c82f3633a43f74e9269c0bd423729416b7d5048e /gcc | |
parent | 30522cdb1462ff8892d01429de3d73e1b5c7e919 (diff) | |
[AArch64] Fix aarch64_ira_change_pseudo_allocno_class
A recent commit removing '*' from the md files caused a large regression in
h264ref. It turns out aarch64_ira_change_pseudo_allocno_class is no longer
effective after the SVE changes, and the combination results in the regression.
This patch fixes it by explicitly checking for a subset of GENERAL_REGS and
FP_REGS. Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
gcc/
* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
Check for subset of GENERAL_REGS and FP_REGS.
* config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase cost of
r=w alternative.
From-SVN: r260951
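The "Increase cost of r=w alternative" entry refers to the `?` constraint modifier. GCC's machine-description documentation describes `?` as making an alternative one unit more costly for each `?` it contains. The following is a minimal sketch of that tie-break only, not the real IRA/LRA cost computation; the base costs and the helper names `count_disparage` and `pick_alternative` are hypothetical.

```c
#include <stddef.h>

/* Count the '?' modifiers in one constraint alternative, e.g. "?r" -> 1.  */
static int
count_disparage (const char *alt)
{
  int n = 0;
  for (; *alt != '\0'; alt++)
    if (*alt == '?')
      n++;
  return n;
}

/* Return the index of the cheapest of N alternatives (N >= 1).
   BASE_COST[i] is a hypothetical cost of alternative i before modifiers;
   each '?' adds one unit.  Ties go to the earliest alternative, which is
   an assumption of this toy model.  */
static size_t
pick_alternative (const char *const *alts, const int *base_cost, size_t n)
{
  size_t best = 0;
  int best_cost = base_cost[0] + count_disparage (alts[0]);

  for (size_t i = 1; i < n; i++)
    {
      int cost = base_cost[i] + count_disparage (alts[i]);
      if (cost < best_cost)
        {
          best = i;
          best_cost = cost;
        }
    }
  return best;
}
```

With equal hypothetical base costs for the `r` and `w` alternatives, `pick_alternative` selects `r` for the alternatives `r`, `w`, `Utv`, and `w` once the first alternative is written `?r`.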
Diffstat (limited to 'gcc')
-rw-r--r-- | gcc/ChangeLog | 7 |
-rw-r--r-- | gcc/config/aarch64/aarch64-simd.md | 2 |
-rw-r--r-- | gcc/config/aarch64/aarch64.c | 27 |
3 files changed, 23 insertions, 13 deletions
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 51958e8..ef0a71e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2018-05-30 Wilco Dijkstra <wdijkstr@arm.com>
+
+	* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
+	Check for subset of GENERAL_REGS and FP_REGS.
+	* config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase cost of
+	r=w alternative.
+
 2018-05-30 Richard Sandiford <richard.sandiford@linaro.org>
 
 	* alias.c (adjust_offset_for_component_ref): Use poly_int_tree_p
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index daaefda..9623869 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3022,7 +3022,7 @@
 ;; is guaranteed so upper bits should be considered undefined.
 ;; RTL uses GCC vector extension indices throughout so flip only for assembly.
 (define_insn "aarch64_get_lane<mode>"
-  [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=r, w, Utv")
+  [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv")
         (vec_select:<VEL>
           (match_operand:VALL_F16 1 "register_operand" "w, w, w")
           (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index afc9185..1fbde46 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1087,16 +1087,17 @@ aarch64_err_no_fpadvsimd (machine_mode mode, const char *msg)
 }
 
 /* Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS.
-   The register allocator chooses ALL_REGS if FP_REGS and GENERAL_REGS have
-   the same cost even if ALL_REGS has a much larger cost. ALL_REGS is also
-   used if the cost of both FP_REGS and GENERAL_REGS is lower than the memory
-   cost (in this case the best class is the lowest cost one). Using ALL_REGS
-   irrespectively of its cost results in bad allocations with many redundant
-   int<->FP moves which are expensive on various cores.
-   To avoid this we don't allow ALL_REGS as the allocno class, but force a
-   decision between FP_REGS and GENERAL_REGS. We use the allocno class if it
-   isn't ALL_REGS. Similarly, use the best class if it isn't ALL_REGS.
-   Otherwise set the allocno class depending on the mode.
+   The register allocator chooses POINTER_AND_FP_REGS if FP_REGS and
+   GENERAL_REGS have the same cost - even if POINTER_AND_FP_REGS has a much
+   higher cost. POINTER_AND_FP_REGS is also used if the cost of both FP_REGS
+   and GENERAL_REGS is lower than the memory cost (in this case the best class
+   is the lowest cost one). Using POINTER_AND_FP_REGS irrespectively of its
+   cost results in bad allocations with many redundant int<->FP moves which
+   are expensive on various cores.
+   To avoid this we don't allow POINTER_AND_FP_REGS as the allocno class, but
+   force a decision between FP_REGS and GENERAL_REGS. We use the allocno class
+   if it isn't POINTER_AND_FP_REGS. Similarly, use the best class if it isn't
+   POINTER_AND_FP_REGS. Otherwise set the allocno class depending on the mode.
    The result of this is that it is no longer inefficient to have a higher
    memory move cost than the register move cost.
 */
@@ -1107,10 +1108,12 @@ aarch64_ira_change_pseudo_allocno_class (int regno, reg_class_t allocno_class,
 {
   machine_mode mode;
 
-  if (allocno_class != ALL_REGS)
+  if (reg_class_subset_p (allocno_class, GENERAL_REGS)
+      || reg_class_subset_p (allocno_class, FP_REGS))
     return allocno_class;
 
-  if (best_class != ALL_REGS)
+  if (reg_class_subset_p (best_class, GENERAL_REGS)
+      || reg_class_subset_p (best_class, FP_REGS))
     return best_class;
 
   mode = PSEUDO_REGNO_MODE (regno);
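The effect of replacing the `!= ALL_REGS` tests with subset tests can be seen in a small standalone model. The sketch below is not GCC code: register classes are modelled as bitmasks over four hypothetical register groups, `toy_subset_p` stands in for `reg_class_subset_p`, and the mode-based fallback is reduced to an `fp_mode` flag. Mapping floating-point or vector modes to FP_REGS is an assumption here; the comment in the patch only says the class depends on the mode.

```c
#include <stdbool.h>

/* Toy model: a register class is a bitmask over four hypothetical register
   groups.  This is NOT GCC's representation; it only mimics the containment
   relations between the classes named in the patch.  */
typedef unsigned toy_class;

enum
{
  TOY_GENERAL_REGS = 1u << 0,                            /* general regs */
  TOY_POINTER_REGS = TOY_GENERAL_REGS | (1u << 1),       /* general + sp */
  TOY_FP_REGS = 1u << 2,                                 /* FP/SIMD regs */
  TOY_POINTER_AND_FP_REGS = TOY_POINTER_REGS | TOY_FP_REGS,
  TOY_ALL_REGS = TOY_POINTER_AND_FP_REGS | (1u << 3)     /* + predicates */
};

/* True if every register group in A is also in B.  */
static bool
toy_subset_p (toy_class a, toy_class b)
{
  return (a & ~b) == 0;
}

/* Old logic: only the exact value ALL_REGS is rejected, so any other
   mixed class (e.g. POINTER_AND_FP_REGS) is passed through unchanged.  */
static toy_class
old_choice (toy_class allocno_class, toy_class best_class, bool fp_mode)
{
  if (allocno_class != TOY_ALL_REGS)
    return allocno_class;
  if (best_class != TOY_ALL_REGS)
    return best_class;
  return fp_mode ? TOY_FP_REGS : TOY_GENERAL_REGS;
}

/* New logic: a class is accepted only if it lies entirely within
   GENERAL_REGS or entirely within FP_REGS; every mixed class falls
   through to the mode-based decision.  */
static toy_class
new_choice (toy_class allocno_class, toy_class best_class, bool fp_mode)
{
  if (toy_subset_p (allocno_class, TOY_GENERAL_REGS)
      || toy_subset_p (allocno_class, TOY_FP_REGS))
    return allocno_class;
  if (toy_subset_p (best_class, TOY_GENERAL_REGS)
      || toy_subset_p (best_class, TOY_FP_REGS))
    return best_class;
  return fp_mode ? TOY_FP_REGS : TOY_GENERAL_REGS;
}
```

When both arguments are `TOY_POINTER_AND_FP_REGS`, `old_choice` returns the mixed class unchanged, whereas `new_choice` falls through to the mode-based decision. This mirrors the commit message's point that the hook was no longer effective once the allocator stopped proposing exactly ALL_REGS.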