author     Michael Meissner <meissner@linux.vnet.ibm.com>  2014-11-17 22:32:26 +0000
committer  Michael Meissner <meissner@gcc.gnu.org>         2014-11-17 22:32:26 +0000
commit     25adc5d044047b926181ebcaf6284b39df8ee306 (patch)
tree       5839553437c95da1cf37623bc65b65a198077cf9 /gcc
parent     5a4e7cade9b48522942a62b1064a4bd6b02f95e0 (diff)
rs6000.c (RELOAD_REG_AND_M16): Add support for Altivec style vector loads that ignore the bottom 3 bits of the address.

[gcc]
2014-11-17  Michael Meissner  <meissner@linux.vnet.ibm.com>
            Ulrich Weigand  <Ulrich.Weigand@de.ibm.com>

        * config/rs6000/rs6000.c (RELOAD_REG_AND_M16): Add support for
        Altivec style vector loads that ignore the bottom 3 bits of the
        address.
        (rs6000_debug_addr_mask): New function to print the addr_mask
        values if debugging.
        (rs6000_debug_print_mode): Call rs6000_debug_addr_mask to print
        out addr_mask.
        (rs6000_setup_reg_addr_masks): Add support for Altivec style
        vector loads that ignore the bottom 3 bits of the address.  Allow
        pre-increment and pre-decrement on floating point, even if the
        -mupper-regs-{sf,df} options were used.
        (rs6000_init_hard_regno_mode_ok): Rework DFmode support if
        -mupper-regs-df.  Add support for -mupper-regs-sf.  Rearrange code
        placement for direct move support.
        (rs6000_option_override_internal): Add checks for -mupper-regs-df
        requiring -mvsx, and -mupper-regs-sf requiring -mpower8-vector.
        If -mupper-regs, set both -mupper-regs-sf and -mupper-regs-df,
        depending on the underlying cpu.
        (rs6000_secondary_reload_fail): Add ATTRIBUTE_NORETURN.
        (rs6000_secondary_reload_toc_costs): Helper function to identify
        costs of a TOC load for secondary reload support.
        (rs6000_secondary_reload_memory): Helper function for secondary
        reload, to determine if a particular memory operation is directly
        handled by the hardware, or if it needs support from secondary
        reload to create a valid address.
        (rs6000_secondary_reload): Rework code, to be clearer.  If the
        appropriate -mupper-regs-{sf,df} is used, use FPR registers to
        reload scalar values, since the FPR registers have D-form
        addressing.  Move most of the code handling memory to the function
        rs6000_secondary_reload_memory, and use the reg_addr structure to
        determine what type of address modes are supported.  Print more
        debug information if -mdebug=addr.
        (rs6000_secondary_reload_inner): Rework entire function to be more
        general.  Use the reg_addr bits to determine what type of
        addressing is supported.
        (rs6000_preferred_reload_class): Rework.  Move constant handling
        into a single place.  Prefer using FLOAT_REGS for scalar floating
        point.
        (rs6000_secondary_reload_class): Use a FPR register to move a
        value from an Altivec register to a GPR, and vice versa.  Move VSX
        handling above traditional floating point.

        * config/rs6000/rs6000.md (mov<mode>_hardfloat, FMOVE32 case):
        Delete some spaces in the constraints.
        (DF->DF move peephole2): Disable if -mupper-regs-{sf,df} to
        allow using FPR registers to load/store an Altivec register for
        scalar floating point types.
        (SF->SF move peephole2): Likewise.
        (DFmode splitter): Add a define_split to move floating point
        constants to the constant pool before register allocation.
        Normally constants are put into the pool immediately, but
        -ffast-math delays putting them into the constant pool for the
        reciprocal approximation support.
        (SFmode splitter): Likewise.

        * config/rs6000/rs6000.opt (-mupper-regs-df): Make option public.
        (-mupper-regs-sf): Likewise.

        * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define
        __UPPER_REGS_DF__ if -mupper-regs-df.  Define __UPPER_REGS_SF__ if
        -mupper-regs-sf.
        (-mupper-regs): New combination option that sets -mupper-regs-sf
        and -mupper-regs-df by default if the cpu supports the instructions.

        * doc/invoke.texi (RS/6000 and PowerPC Options): Document
        -mupper-regs, -mupper-regs-sf, and -mupper-regs-df.

        * config/rs6000/predicates.md (memory_fp_constant): New predicate
        to return true if the operand is a floating point constant that
        must be put into the constant pool, before register allocation
        occurs.

        * config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Enable
        -mupper-regs-df by default.
        (ISA_2_7_MASKS_SERVER): Enable -mupper-regs-sf by default.
        (POWERPC_MASKS): Add -mupper-regs-{sf,df} as options set by the
        various -mcpu=... options.
        (power7 cpu): Enable -mupper-regs-df by default.

        * doc/invoke.texi (RS/6000 and PowerPC Options): Document
        -mupper-regs.

[gcc/testsuite]
2014-11-17  Michael Meissner  <meissner@linux.vnet.ibm.com>

        * gcc.target/powerpc/p8vector-ldst.c: Rewrite to use 40 live
        floating point variables instead of using asm to test allocating
        values to the Altivec registers.
        * gcc.target/powerpc/upper-regs-sf.c: New -mupper-regs-sf and
        -mupper-regs-df tests.
        * gcc.target/powerpc/upper-regs-df.c: Likewise.

Co-Authored-By: Ulrich Weigand <uweigand@de.ibm.com>

From-SVN: r217679
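In short, the patch lets the register allocator keep scalar SFmode/DFmode values in the "upper" VSX registers (the traditional Altivec registers). A hedged sketch of the kind of code that benefits, in the spirit of the rewritten p8vector-ldst.c test (abbreviated here; the real test uses 40 live floating point variables to overflow the 32 FPRs):

/* Compile with a patched GCC, e.g.  gcc -O2 -mcpu=power8 -mupper-regs.
   With enough simultaneously-live doubles, the allocator can place the
   overflow in Altivec registers instead of spilling to the stack.
   Illustrative only, not the actual test case.  */
double
many_live (const double *p)
{
  double a = p[0], b = p[1], c = p[2], d = p[3];
  double e = p[4], f = p[5], g = p[6], h = p[7];
  /* ...the real test declares roughly 40 of these...  */
  double x = a * b + c * d;
  double y = e * f + g * h;
  return (a + b + c + d) * (e + f + g + h) + x * y;
}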
Diffstat (limited to 'gcc')
-rw-r--r--  gcc/ChangeLog                                     |   85
-rw-r--r--  gcc/config/rs6000/predicates.md                   |   21
-rw-r--r--  gcc/config/rs6000/rs6000-c.c                      |    4
-rw-r--r--  gcc/config/rs6000/rs6000-cpus.def                 |   10
-rw-r--r--  gcc/config/rs6000/rs6000.c                        | 1124
-rw-r--r--  gcc/config/rs6000/rs6000.md                       |   23
-rw-r--r--  gcc/config/rs6000/rs6000.opt                      |   10
-rw-r--r--  gcc/doc/invoke.texi                               |   37
-rw-r--r--  gcc/testsuite/ChangeLog                           |   12
-rw-r--r--  gcc/testsuite/gcc.target/powerpc/p8vector-ldst.c  |  631
-rw-r--r--  gcc/testsuite/gcc.target/powerpc/upper-regs-df.c  |  726
-rw-r--r--  gcc/testsuite/gcc.target/powerpc/upper-regs-sf.c  |  726
12 files changed, 2941 insertions, 468 deletions
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 36bf41f..bd856b5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,88 @@
+2014-11-17 Michael Meissner <meissner@linux.vnet.ibm.com>
+ Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
+
+ * config/rs6000/rs6000.c (RELOAD_REG_AND_M16): Add support for
+ Altivec style vector loads that ignore the bottom 3 bits of the
+ address.
+ (rs6000_debug_addr_mask): New function to print the addr_mask
+ values if debugging.
+ (rs6000_debug_print_mode): Call rs6000_debug_addr_mask to print
+ out addr_mask.
+ (rs6000_setup_reg_addr_masks): Add support for Altivec style
+ vector loads that ignore the bottom 3 bits of the address. Allow
+ pre-increment and pre-decrement on floating point, even if the
+ -mupper-regs-{sf,df} options were used.
+ (rs6000_init_hard_regno_mode_ok): Rework DFmode support if
+ -mupper-regs-df. Add support for -mupper-regs-sf. Rearrange code
+ placement for direct move support.
+ (rs6000_option_override_internal): Add checks for -mupper-regs-df
+ requiring -mvsx, and -mupper-regs-sf requiring -mpower8-vector.
+ If -mupper-regs, set both -mupper-regs-sf and -mupper-regs-df,
+ depending on the underlying cpu.
+ (rs6000_secondary_reload_fail): Add ATTRIBUTE_NORETURN.
+ (rs6000_secondary_reload_toc_costs): Helper function to identify
+ costs of a TOC load for secondary reload support.
+ (rs6000_secondary_reload_memory): Helper function for secondary
+ reload, to determine if a particular memory operation is directly
+ handled by the hardware, or if it needs support from secondary
+ reload to create a valid address.
+ (rs6000_secondary_reload): Rework code, to be clearer. If the
+ appropriate -mupper-regs-{sf,df} is used, use FPR registers to
+ reload scalar values, since the FPR registers have D-form
+ addressing. Move most of the code handling memory to the function
+ rs6000_secondary_reload_memory, and use the reg_addr structure to
+ determine what type of address modes are supported. Print more
+ debug information if -mdebug=addr.
+ (rs6000_secondary_reload_inner): Rework entire function to be more
+ general. Use the reg_addr bits to determine what type of
+ addressing is supported.
+ (rs6000_preferred_reload_class): Rework. Move constant handling
+ into a single place. Prefer using FLOAT_REGS for scalar floating
+ point.
+ (rs6000_secondary_reload_class): Use a FPR register to move a
+ value from an Altivec register to a GPR, and vice versa. Move VSX
+ handling above traditional floating point.
+
+ * config/rs6000/rs6000.md (mov<mode>_hardfloat, FMOVE32 case):
+ Delete some spaces in the constraints.
+ (DF->DF move peephole2): Disable if -mupper-regs-{sf,df} to
+ allow using FPR registers to load/store an Altivec register for
+ scalar floating point types.
+ (SF->SF move peephole2): Likewise.
+ (DFmode splitter): Add a define_split to move floating point
+ constants to the constant pool before register allocation.
+ Normally constants are put into the pool immediately, but
+ -ffast-math delays putting them into the constant pool for the
+ reciprocal approximation support.
+ (SFmode splitter): Likewise.
+
+ * config/rs6000/rs6000.opt (-mupper-regs-df): Make option public.
+ (-mupper-regs-sf): Likewise.
+
+ * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define
+ __UPPER_REGS_DF__ if -mupper-regs-df. Define __UPPER_REGS_SF__ if
+ -mupper-regs-sf.
+ (-mupper-regs): New combination option that sets -mupper-regs-sf
+ and -mupper-regs-df by default if the cpu supports the instructions.
+
+ * doc/invoke.texi (RS/6000 and PowerPC Options): Document
+ -mupper-regs, -mupper-regs-sf, and -mupper-regs-df.
+
+ * config/rs6000/predicates.md (memory_fp_constant): New predicate
+ to return true if the operand is a floating point constant that
+ must be put into the constant pool, before register allocation
+ occurs.
+
+ * config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Enable
+ -mupper-regs-df by default.
+ (ISA_2_7_MASKS_SERVER): Enable -mupper-regs-sf by default.
+ (POWERPC_MASKS): Add -mupper-regs-{sf,df} as options set by the
+ various -mcpu=... options.
+ (power7 cpu): Enable -mupper-regs-df by default.
+
+ * doc/invoke.texi (RS/6000 and PowerPC Options): Document
+ -mupper-regs.
+
2014-11-17 Zhouyi Zhou <yizhouzhou@ict.ac.cn>
* ira-conflicts.c (build_conflict_bit_table): Add the current
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 8abac7e..de7fa4e 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -521,6 +521,27 @@
}
})
+;; Return 1 if the operand must be loaded from memory. This is used by a
+;; define_split to ensure constants get pushed to the constant pool before
+;; reload. If -ffast-math is used, easy_fp_constant will allow move insns to
+;; have constants in order not to interfere with reciprocal estimation. However,
+;; with -mupper-regs support, these constants must be moved to the constant
+;; pool before register allocation.
+
+(define_predicate "memory_fp_constant"
+ (match_code "const_double")
+{
+ if (TARGET_VSX && op == CONST0_RTX (mode))
+ return 0;
+
+ if (!TARGET_HARD_FLOAT || !TARGET_FPRS
+ || (mode == SFmode && !TARGET_SINGLE_FLOAT)
+ || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
+ return 0;
+
+ return 1;
+})
+
;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a
;; vector register without using memory.
(define_predicate "easy_vector_constant"
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 8fa791e..0deeaf1 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -380,6 +380,10 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags,
rs6000_define_or_undefine_macro (define_p, "__QUAD_MEMORY_ATOMIC__");
if ((flags & OPTION_MASK_CRYPTO) != 0)
rs6000_define_or_undefine_macro (define_p, "__CRYPTO__");
+ if ((flags & OPTION_MASK_UPPER_REGS_DF) != 0)
+ rs6000_define_or_undefine_macro (define_p, "__UPPER_REGS_DF__");
+ if ((flags & OPTION_MASK_UPPER_REGS_SF) != 0)
+ rs6000_define_or_undefine_macro (define_p, "__UPPER_REGS_SF__");
/* options from the builtin masks. */
if ((bu_mask & RS6000_BTM_SPE) != 0)
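The two new predefines give user code a way to detect the feature. A minimal probe, assuming a compiler built with this patch:

/* upper-regs-probe.c: e.g.  gcc -O2 -mcpu=power8 -mupper-regs upper-regs-probe.c
   and run, or inspect the preprocessor output with -E -dM.  */
#include <stdio.h>

int
main (void)
{
#ifdef __UPPER_REGS_DF__
  puts ("__UPPER_REGS_DF__ defined: DFmode allowed in upper VSX registers");
#endif
#ifdef __UPPER_REGS_SF__
  puts ("__UPPER_REGS_SF__ defined: SFmode allowed in upper VSX registers");
#endif
  return 0;
}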
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index b17fd0d..c1a7649 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -44,7 +44,8 @@
#define ISA_2_6_MASKS_SERVER (ISA_2_5_MASKS_SERVER \
| OPTION_MASK_POPCNTD \
| OPTION_MASK_ALTIVEC \
- | OPTION_MASK_VSX)
+ | OPTION_MASK_VSX \
+ | OPTION_MASK_UPPER_REGS_DF)
/* For now, don't provide an embedded version of ISA 2.07. */
#define ISA_2_7_MASKS_SERVER (ISA_2_6_MASKS_SERVER \
@@ -54,7 +55,8 @@
| OPTION_MASK_DIRECT_MOVE \
| OPTION_MASK_HTM \
| OPTION_MASK_QUAD_MEMORY \
- | OPTION_MASK_QUAD_MEMORY_ATOMIC)
+ | OPTION_MASK_QUAD_MEMORY_ATOMIC \
+ | OPTION_MASK_UPPER_REGS_SF)
#define POWERPC_7400_MASK (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
@@ -94,6 +96,8 @@
| OPTION_MASK_RECIP_PRECISION \
| OPTION_MASK_SOFT_FLOAT \
| OPTION_MASK_STRICT_ALIGN_OPTIONAL \
+ | OPTION_MASK_UPPER_REGS_DF \
+ | OPTION_MASK_UPPER_REGS_SF \
| OPTION_MASK_VSX \
| OPTION_MASK_VSX_TIMODE)
@@ -184,7 +188,7 @@ RS6000_CPU ("power6x", PROCESSOR_POWER6, MASK_POWERPC64 | MASK_PPC_GPOPT
RS6000_CPU ("power7", PROCESSOR_POWER7, /* Don't add MASK_ISEL by default */
POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
| MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
- | MASK_VSX | MASK_RECIP_PRECISION)
+ | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF)
RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, MASK_PPC_GFXOPT | MASK_POWERPC64)
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 336dd43..4f66840 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -394,6 +394,7 @@ typedef unsigned char addr_mask_type;
#define RELOAD_REG_OFFSET 0x08 /* Reg+offset addressing. */
#define RELOAD_REG_PRE_INCDEC 0x10 /* PRE_INC/PRE_DEC valid. */
#define RELOAD_REG_PRE_MODIFY 0x20 /* PRE_MODIFY valid. */
+#define RELOAD_REG_AND_M16 0x40 /* AND -16 addressing. */
/* Register type masks based on the type, of valid addressing modes. */
struct rs6000_reg_addr {
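These bits compose into one capability byte per (mode, reload register class) pair. A tiny standalone sketch of how a mask is built and queried; the 0x08/0x40 values match the #defines above, while RELOAD_REG_VALID is assumed to be 0x01 since its #define falls outside the quoted hunk:

/* Sketch only; compiles as a standalone program.  */
#include <stdio.h>

typedef unsigned char addr_mask_type;

#define RELOAD_REG_VALID   0x01   /* assumed value */
#define RELOAD_REG_OFFSET  0x08
#define RELOAD_REG_AND_M16 0x40

int
main (void)
{
  /* What rs6000_setup_reg_addr_masks would build for a VMX register
     class and a 16-byte mode: valid, plus lvx/stvx-style (reg & -16).  */
  addr_mask_type addr_mask = RELOAD_REG_VALID | RELOAD_REG_AND_M16;

  if ((addr_mask & RELOAD_REG_AND_M16) != 0)
    printf ("AND -16 addressing allowed (mask = 0x%02x)\n", addr_mask);
  if ((addr_mask & RELOAD_REG_OFFSET) == 0)
    printf ("reg+offset not allowed; reload needs a scratch register\n");
  return 0;
}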
@@ -1928,6 +1929,54 @@ rs6000_debug_vector_unit (enum rs6000_vector v)
return ret;
}
+/* Inner function printing just the address mask for a particular reload
+ register class. */
+DEBUG_FUNCTION char *
+rs6000_debug_addr_mask (addr_mask_type mask, bool keep_spaces)
+{
+ static char ret[8];
+ char *p = ret;
+
+ if ((mask & RELOAD_REG_VALID) != 0)
+ *p++ = 'v';
+ else if (keep_spaces)
+ *p++ = ' ';
+
+ if ((mask & RELOAD_REG_MULTIPLE) != 0)
+ *p++ = 'm';
+ else if (keep_spaces)
+ *p++ = ' ';
+
+ if ((mask & RELOAD_REG_INDEXED) != 0)
+ *p++ = 'i';
+ else if (keep_spaces)
+ *p++ = ' ';
+
+ if ((mask & RELOAD_REG_OFFSET) != 0)
+ *p++ = 'o';
+ else if (keep_spaces)
+ *p++ = ' ';
+
+ if ((mask & RELOAD_REG_PRE_INCDEC) != 0)
+ *p++ = '+';
+ else if (keep_spaces)
+ *p++ = ' ';
+
+ if ((mask & RELOAD_REG_PRE_MODIFY) != 0)
+ *p++ = '+';
+ else if (keep_spaces)
+ *p++ = ' ';
+
+ if ((mask & RELOAD_REG_AND_M16) != 0)
+ *p++ = '&';
+ else if (keep_spaces)
+ *p++ = ' ';
+
+ *p = '\0';
+
+ return ret;
+}
+
/* Print the address masks in a human readable fashion. */
DEBUG_FUNCTION void
rs6000_debug_print_mode (ssize_t m)
@@ -1936,18 +1985,8 @@ rs6000_debug_print_mode (ssize_t m)
fprintf (stderr, "Mode: %-5s", GET_MODE_NAME (m));
for (rc = 0; rc < N_RELOAD_REG; rc++)
- {
- addr_mask_type mask = reg_addr[m].addr_mask[rc];
- fprintf (stderr,
- " %s: %c%c%c%c%c%c",
- reload_reg_map[rc].name,
- (mask & RELOAD_REG_VALID) != 0 ? 'v' : ' ',
- (mask & RELOAD_REG_MULTIPLE) != 0 ? 'm' : ' ',
- (mask & RELOAD_REG_INDEXED) != 0 ? 'i' : ' ',
- (mask & RELOAD_REG_OFFSET) != 0 ? 'o' : ' ',
- (mask & RELOAD_REG_PRE_INCDEC) != 0 ? '+' : ' ',
- (mask & RELOAD_REG_PRE_MODIFY) != 0 ? '+' : ' ');
- }
+ fprintf (stderr, " %s: %s", reload_reg_map[rc].name,
+ rs6000_debug_addr_mask (reg_addr[m].addr_mask[rc], true));
if (rs6000_vector_unit[m] != VECTOR_NONE
|| rs6000_vector_mem[m] != VECTOR_NONE
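A self-contained demo of the letter encoding rs6000_debug_addr_mask produces ('v' valid, 'm' multiple, 'i' indexed, 'o' offset, '+' pre-increment/pre-modify, '&' AND -16). The 0x01/0x02/0x04 values for the first three bits are assumptions, since those #defines lie outside the quoted hunks:

#include <stdio.h>

/* Flag bits in the order the GCC function tests them.  The first three
   values are assumed; 0x08-0x40 match the #defines in this patch.  */
static const struct { unsigned char bit; char ch; } flags[] = {
  { 0x01, 'v' },  /* RELOAD_REG_VALID (assumed value) */
  { 0x02, 'm' },  /* RELOAD_REG_MULTIPLE (assumed value) */
  { 0x04, 'i' },  /* RELOAD_REG_INDEXED (assumed value) */
  { 0x08, 'o' },  /* RELOAD_REG_OFFSET */
  { 0x10, '+' },  /* RELOAD_REG_PRE_INCDEC */
  { 0x20, '+' },  /* RELOAD_REG_PRE_MODIFY */
  { 0x40, '&' },  /* RELOAD_REG_AND_M16 */
};

static char *
addr_mask_str (unsigned char mask, int keep_spaces)
{
  static char ret[8];   /* 7 flag characters plus the terminating NUL */
  char *p = ret;
  for (int i = 0; i < 7; i++)
    if ((mask & flags[i].bit) != 0)
      *p++ = flags[i].ch;
    else if (keep_spaces)
      *p++ = ' ';
  *p = '\0';
  return ret;
}

int
main (void)
{
  /* valid + indexed + offset + AND -16 (0x4d): "v io  &" padded.  */
  printf ("padded:  '%s'\n", addr_mask_str (0x4d, 1));
  printf ("compact: '%s'\n", addr_mask_str (0x4d, 0));
  return 0;
}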
@@ -2423,9 +2462,7 @@ rs6000_setup_reg_addr_masks (void)
/* Figure out if we can do PRE_INC, PRE_DEC, or PRE_MODIFY
addressing. Restrict addressing on SPE for 64-bit types
because of the SUBREG hackery used to address 64-bit floats in
- '32-bit' GPRs. To simplify secondary reload, don't allow
- update forms on scalar floating point types that can go in the
- upper registers. */
+ '32-bit' GPRs. */
if (TARGET_UPDATE
&& (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR)
@@ -2433,8 +2470,7 @@ rs6000_setup_reg_addr_masks (void)
&& !VECTOR_MODE_P (m2)
&& !COMPLEX_MODE_P (m2)
&& !indexed_only_p
- && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8)
- && !reg_addr[m2].scalar_in_vmx_p)
+ && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8))
{
addr_mask |= RELOAD_REG_PRE_INCDEC;
@@ -2467,6 +2503,11 @@ rs6000_setup_reg_addr_masks (void)
&& (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR))
addr_mask |= RELOAD_REG_OFFSET;
+ /* VMX registers can do (REG & -16) and ((REG+REG) & -16)
+ addressing on 128-bit types. */
+ if (rc == RELOAD_REG_VMX && GET_MODE_SIZE (m2) == 16)
+ addr_mask |= RELOAD_REG_AND_M16;
+
reg_addr[m].addr_mask[rc] = addr_mask;
any_addr_mask |= addr_mask;
}
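What RELOAD_REG_AND_M16 models: the Altivec lvx/stvx instructions ignore the low-order bits of the effective address, which GCC represents as an explicit (and (reg) (const_int -16)). A quick illustration of that address arithmetic:

/* Plain C sketch, no target code needed: a misaligned pointer still
   loads the enclosing 16-byte-aligned quadword.  */
#include <stdint.h>
#include <stdio.h>

int
main (void)
{
  uintptr_t ea = 0x1003;                       /* misaligned address */
  uintptr_t effective = ea & (uintptr_t) -16;  /* what lvx actually uses */

  printf ("ea = 0x%lx, effective = 0x%lx\n",
          (unsigned long) ea, (unsigned long) effective);
  return 0;
}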
@@ -2633,13 +2674,19 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
rs6000_vector_align[V1TImode] = 128;
}
- /* DFmode, see if we want to use the VSX unit. */
+ /* DFmode, see if we want to use the VSX unit. Memory is handled
+ differently, so don't set rs6000_vector_mem. */
if (TARGET_VSX && TARGET_VSX_SCALAR_DOUBLE)
{
rs6000_vector_unit[DFmode] = VECTOR_VSX;
- rs6000_vector_mem[DFmode]
- = (TARGET_UPPER_REGS_DF ? VECTOR_VSX : VECTOR_NONE);
- rs6000_vector_align[DFmode] = align64;
+ rs6000_vector_align[DFmode] = 64;
+ }
+
+ /* SFmode, see if we want to use the VSX unit. */
+ if (TARGET_P8_VECTOR && TARGET_VSX_SCALAR_FLOAT)
+ {
+ rs6000_vector_unit[SFmode] = VECTOR_VSX;
+ rs6000_vector_align[SFmode] = 32;
}
/* Allow TImode in VSX register and set the VSX memory macros. */
@@ -2774,58 +2821,42 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
reg_addr[V4SFmode].reload_load = CODE_FOR_reload_v4sf_di_load;
reg_addr[V2DFmode].reload_store = CODE_FOR_reload_v2df_di_store;
reg_addr[V2DFmode].reload_load = CODE_FOR_reload_v2df_di_load;
- if (TARGET_VSX && TARGET_UPPER_REGS_DF)
- {
- reg_addr[DFmode].reload_store = CODE_FOR_reload_df_di_store;
- reg_addr[DFmode].reload_load = CODE_FOR_reload_df_di_load;
- reg_addr[DFmode].scalar_in_vmx_p = true;
- reg_addr[DDmode].reload_store = CODE_FOR_reload_dd_di_store;
- reg_addr[DDmode].reload_load = CODE_FOR_reload_dd_di_load;
- }
- if (TARGET_P8_VECTOR)
- {
- reg_addr[SFmode].reload_store = CODE_FOR_reload_sf_di_store;
- reg_addr[SFmode].reload_load = CODE_FOR_reload_sf_di_load;
- reg_addr[SDmode].reload_store = CODE_FOR_reload_sd_di_store;
- reg_addr[SDmode].reload_load = CODE_FOR_reload_sd_di_load;
- if (TARGET_UPPER_REGS_SF)
- reg_addr[SFmode].scalar_in_vmx_p = true;
- }
+ reg_addr[DFmode].reload_store = CODE_FOR_reload_df_di_store;
+ reg_addr[DFmode].reload_load = CODE_FOR_reload_df_di_load;
+ reg_addr[DDmode].reload_store = CODE_FOR_reload_dd_di_store;
+ reg_addr[DDmode].reload_load = CODE_FOR_reload_dd_di_load;
+ reg_addr[SFmode].reload_store = CODE_FOR_reload_sf_di_store;
+ reg_addr[SFmode].reload_load = CODE_FOR_reload_sf_di_load;
+ reg_addr[SDmode].reload_store = CODE_FOR_reload_sd_di_store;
+ reg_addr[SDmode].reload_load = CODE_FOR_reload_sd_di_load;
+
if (TARGET_VSX_TIMODE)
{
reg_addr[TImode].reload_store = CODE_FOR_reload_ti_di_store;
reg_addr[TImode].reload_load = CODE_FOR_reload_ti_di_load;
}
+
if (TARGET_DIRECT_MOVE)
{
- if (TARGET_POWERPC64)
- {
- reg_addr[TImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxti;
- reg_addr[V1TImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv1ti;
- reg_addr[V2DFmode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv2df;
- reg_addr[V2DImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv2di;
- reg_addr[V4SFmode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv4sf;
- reg_addr[V4SImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv4si;
- reg_addr[V8HImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv8hi;
- reg_addr[V16QImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv16qi;
- reg_addr[SFmode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxsf;
-
- reg_addr[TImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprti;
- reg_addr[V1TImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv1ti;
- reg_addr[V2DFmode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv2df;
- reg_addr[V2DImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv2di;
- reg_addr[V4SFmode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv4sf;
- reg_addr[V4SImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv4si;
- reg_addr[V8HImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv8hi;
- reg_addr[V16QImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv16qi;
- reg_addr[SFmode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprsf;
- }
- else
- {
- reg_addr[DImode].reload_fpr_gpr = CODE_FOR_reload_fpr_from_gprdi;
- reg_addr[DDmode].reload_fpr_gpr = CODE_FOR_reload_fpr_from_gprdd;
- reg_addr[DFmode].reload_fpr_gpr = CODE_FOR_reload_fpr_from_gprdf;
- }
+ reg_addr[TImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxti;
+ reg_addr[V1TImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv1ti;
+ reg_addr[V2DFmode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv2df;
+ reg_addr[V2DImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv2di;
+ reg_addr[V4SFmode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv4sf;
+ reg_addr[V4SImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv4si;
+ reg_addr[V8HImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv8hi;
+ reg_addr[V16QImode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxv16qi;
+ reg_addr[SFmode].reload_gpr_vsx = CODE_FOR_reload_gpr_from_vsxsf;
+
+ reg_addr[TImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprti;
+ reg_addr[V1TImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv1ti;
+ reg_addr[V2DFmode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv2df;
+ reg_addr[V2DImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv2di;
+ reg_addr[V4SFmode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv4sf;
+ reg_addr[V4SImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv4si;
+ reg_addr[V8HImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv8hi;
+ reg_addr[V16QImode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprv16qi;
+ reg_addr[SFmode].reload_vsx_gpr = CODE_FOR_reload_vsx_from_gprsf;
}
}
else
@@ -2844,29 +2875,34 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
reg_addr[V4SFmode].reload_load = CODE_FOR_reload_v4sf_si_load;
reg_addr[V2DFmode].reload_store = CODE_FOR_reload_v2df_si_store;
reg_addr[V2DFmode].reload_load = CODE_FOR_reload_v2df_si_load;
- if (TARGET_VSX && TARGET_UPPER_REGS_DF)
- {
- reg_addr[DFmode].reload_store = CODE_FOR_reload_df_si_store;
- reg_addr[DFmode].reload_load = CODE_FOR_reload_df_si_load;
- reg_addr[DFmode].scalar_in_vmx_p = true;
- reg_addr[DDmode].reload_store = CODE_FOR_reload_dd_si_store;
- reg_addr[DDmode].reload_load = CODE_FOR_reload_dd_si_load;
- }
- if (TARGET_P8_VECTOR)
- {
- reg_addr[SFmode].reload_store = CODE_FOR_reload_sf_si_store;
- reg_addr[SFmode].reload_load = CODE_FOR_reload_sf_si_load;
- reg_addr[SDmode].reload_store = CODE_FOR_reload_sd_si_store;
- reg_addr[SDmode].reload_load = CODE_FOR_reload_sd_si_load;
- if (TARGET_UPPER_REGS_SF)
- reg_addr[SFmode].scalar_in_vmx_p = true;
- }
+ reg_addr[DFmode].reload_store = CODE_FOR_reload_df_si_store;
+ reg_addr[DFmode].reload_load = CODE_FOR_reload_df_si_load;
+ reg_addr[DDmode].reload_store = CODE_FOR_reload_dd_si_store;
+ reg_addr[DDmode].reload_load = CODE_FOR_reload_dd_si_load;
+ reg_addr[SFmode].reload_store = CODE_FOR_reload_sf_si_store;
+ reg_addr[SFmode].reload_load = CODE_FOR_reload_sf_si_load;
+ reg_addr[SDmode].reload_store = CODE_FOR_reload_sd_si_store;
+ reg_addr[SDmode].reload_load = CODE_FOR_reload_sd_si_load;
+
if (TARGET_VSX_TIMODE)
{
reg_addr[TImode].reload_store = CODE_FOR_reload_ti_si_store;
reg_addr[TImode].reload_load = CODE_FOR_reload_ti_si_load;
}
+
+ if (TARGET_DIRECT_MOVE)
+ {
+ reg_addr[DImode].reload_fpr_gpr = CODE_FOR_reload_fpr_from_gprdi;
+ reg_addr[DDmode].reload_fpr_gpr = CODE_FOR_reload_fpr_from_gprdd;
+ reg_addr[DFmode].reload_fpr_gpr = CODE_FOR_reload_fpr_from_gprdf;
+ }
}
+
+ if (TARGET_UPPER_REGS_DF)
+ reg_addr[DFmode].scalar_in_vmx_p = true;
+
+ if (TARGET_UPPER_REGS_SF)
+ reg_addr[SFmode].scalar_in_vmx_p = true;
}
/* Precalculate HARD_REGNO_NREGS. */
@@ -3470,6 +3506,54 @@ rs6000_option_override_internal (bool global_init_p)
rs6000_isa_flags &= ~OPTION_MASK_DFP;
}
+ /* Allow an explicit -mupper-regs to set both -mupper-regs-df and
+ -mupper-regs-sf, depending on the cpu, unless the user explicitly also set
+ the individual option. */
+ if (TARGET_UPPER_REGS > 0)
+ {
+ if (TARGET_VSX
+ && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF))
+ {
+ rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DF;
+ rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
+ }
+ if (TARGET_P8_VECTOR
+ && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
+ {
+ rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_SF;
+ rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_SF;
+ }
+ }
+ else if (TARGET_UPPER_REGS == 0)
+ {
+ if (TARGET_VSX
+ && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF))
+ {
+ rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
+ rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
+ }
+ if (TARGET_P8_VECTOR
+ && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
+ {
+ rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_SF;
+ rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_SF;
+ }
+ }
+
+ if (TARGET_UPPER_REGS_DF && !TARGET_VSX)
+ {
+ if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF)
+ error ("-mupper-regs-df requires -mvsx");
+ rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
+ }
+
+ if (TARGET_UPPER_REGS_SF && !TARGET_P8_VECTOR)
+ {
+ if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF)
+ error ("-mupper-regs-sf requires -mpower8-vector");
+ rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_SF;
+ }
+
/* The quad memory instructions only work in 64-bit mode. In 32-bit mode,
silently turn off quad memory mode. */
if ((TARGET_QUAD_MEMORY || TARGET_QUAD_MEMORY_ATOMIC) && !TARGET_POWERPC64)
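A standalone toy model of the -mupper-regs resolution a few lines above: an explicit -m(no-)upper-regs overrides only the sub-flags the user did not set explicitly, and only where the ISA supports them. All names here are local stand-ins for the rs6000_isa_flags machinery, not GCC APIs:

#include <stdio.h>

#define OPT_UPPER_DF 0x1
#define OPT_UPPER_SF 0x2

int
main (void)
{
  int upper_regs = 1;     /* -mupper-regs: 1, -mno-upper-regs: 0, unset: -1 */
  int have_vsx = 1, have_p8_vector = 0;
  unsigned flags = 0, flags_explicit = 0;

  if (upper_regs > 0)
    {
      if (have_vsx && !(flags_explicit & OPT_UPPER_DF))
        { flags |= OPT_UPPER_DF; flags_explicit |= OPT_UPPER_DF; }
      if (have_p8_vector && !(flags_explicit & OPT_UPPER_SF))
        { flags |= OPT_UPPER_SF; flags_explicit |= OPT_UPPER_SF; }
    }
  else if (upper_regs == 0)
    {
      if (have_vsx && !(flags_explicit & OPT_UPPER_DF))
        { flags &= ~OPT_UPPER_DF; flags_explicit |= OPT_UPPER_DF; }
      if (have_p8_vector && !(flags_explicit & OPT_UPPER_SF))
        { flags &= ~OPT_UPPER_SF; flags_explicit |= OPT_UPPER_SF; }
    }

  printf ("upper-regs-df=%d upper-regs-sf=%d\n",
          (flags & OPT_UPPER_DF) != 0, (flags & OPT_UPPER_SF) != 0);
  return 0;
}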
@@ -16401,6 +16485,278 @@ register_to_reg_type (rtx reg, bool *is_altivec)
return reg_class_to_reg_type[(int)rclass];
}
+/* Helper function to return the cost of adding a TOC entry address. */
+
+static inline int
+rs6000_secondary_reload_toc_costs (addr_mask_type addr_mask)
+{
+ int ret;
+
+ if (TARGET_CMODEL != CMODEL_SMALL)
+ ret = ((addr_mask & RELOAD_REG_OFFSET) == 0) ? 1 : 2;
+
+ else
+ ret = (TARGET_MINIMAL_TOC) ? 6 : 3;
+
+ return ret;
+}
+
+/* Helper function for rs6000_secondary_reload to determine whether the memory
+ address (ADDR) with a given register class (RCLASS) and machine mode (MODE)
+ needs reloading. Return negative if the memory is not handled by the memory
+ helper functions (so a different reload method should be tried), 0 if no
+ additional instructions are needed, and positive to give the extra cost for
+ the memory. */
+
+static int
+rs6000_secondary_reload_memory (rtx addr,
+ enum reg_class rclass,
+ enum machine_mode mode)
+{
+ int extra_cost = 0;
+ rtx reg, and_arg, plus_arg0, plus_arg1;
+ addr_mask_type addr_mask;
+ const char *type = NULL;
+ const char *fail_msg = NULL;
+
+ if (GPR_REG_CLASS_P (rclass))
+ addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_GPR];
+
+ else if (rclass == FLOAT_REGS)
+ addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_FPR];
+
+ else if (rclass == ALTIVEC_REGS)
+ addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_VMX];
+
+ /* For the combined VSX_REGS, turn off Altivec AND -16. */
+ else if (rclass == VSX_REGS)
+ addr_mask = (reg_addr[mode].addr_mask[RELOAD_REG_VMX]
+ & ~RELOAD_REG_AND_M16);
+
+ else
+ {
+ if (TARGET_DEBUG_ADDR)
+ fprintf (stderr,
+ "rs6000_secondary_reload_memory: mode = %s, class = %s, "
+ "class is not GPR, FPR, VMX\n",
+ GET_MODE_NAME (mode), reg_class_names[rclass]);
+
+ return -1;
+ }
+
+ /* If the register isn't valid in this register class, just return now. */
+ if ((addr_mask & RELOAD_REG_VALID) == 0)
+ {
+ if (TARGET_DEBUG_ADDR)
+ fprintf (stderr,
+ "rs6000_secondary_reload_memory: mode = %s, class = %s, "
+ "not valid in class\n",
+ GET_MODE_NAME (mode), reg_class_names[rclass]);
+
+ return -1;
+ }
+
+ switch (GET_CODE (addr))
+ {
+ /* Does the register class support auto update forms for this mode? We
+ don't need a scratch register, since the powerpc only supports
+ PRE_INC, PRE_DEC, and PRE_MODIFY. */
+ case PRE_INC:
+ case PRE_DEC:
+ reg = XEXP (addr, 0);
+ if (!base_reg_operand (addr, GET_MODE (reg)))
+ {
+ fail_msg = "no base register #1";
+ extra_cost = -1;
+ }
+
+ else if ((addr_mask & RELOAD_REG_PRE_INCDEC) == 0)
+ {
+ extra_cost = 1;
+ type = "update";
+ }
+ break;
+
+ case PRE_MODIFY:
+ reg = XEXP (addr, 0);
+ plus_arg1 = XEXP (addr, 1);
+ if (!base_reg_operand (reg, GET_MODE (reg))
+ || GET_CODE (plus_arg1) != PLUS
+ || !rtx_equal_p (reg, XEXP (plus_arg1, 0)))
+ {
+ fail_msg = "bad PRE_MODIFY";
+ extra_cost = -1;
+ }
+
+ else if ((addr_mask & RELOAD_REG_PRE_MODIFY) == 0)
+ {
+ extra_cost = 1;
+ type = "update";
+ }
+ break;
+
+ /* Do we need to simulate AND -16 to clear the bottom address bits used
+ in VMX load/stores? Only allow the AND for vector sizes. */
+ case AND:
+ and_arg = XEXP (addr, 0);
+ if (GET_MODE_SIZE (mode) != 16
+ || GET_CODE (XEXP (addr, 1)) != CONST_INT
+ || INTVAL (XEXP (addr, 1)) != -16)
+ {
+ fail_msg = "bad Altivec AND #1";
+ extra_cost = -1;
+ }
+
+ if (rclass != ALTIVEC_REGS)
+ {
+ if (legitimate_indirect_address_p (and_arg, false))
+ extra_cost = 1;
+
+ else if (legitimate_indexed_address_p (and_arg, false))
+ extra_cost = 2;
+
+ else
+ {
+ fail_msg = "bad Altivec AND #2";
+ extra_cost = -1;
+ }
+
+ type = "and";
+ }
+ break;
+
+ /* If this is an indirect address, make sure it is a base register. */
+ case REG:
+ case SUBREG:
+ if (!legitimate_indirect_address_p (addr, false))
+ {
+ extra_cost = 1;
+ type = "move";
+ }
+ break;
+
+ /* If this is an indexed address, make sure the register class can handle
+ indexed addresses for this mode. */
+ case PLUS:
+ plus_arg0 = XEXP (addr, 0);
+ plus_arg1 = XEXP (addr, 1);
+
+ /* (plus (plus (reg) (constant)) (constant)) is generated during
+ push_reload processing, so handle it now. */
+ if (GET_CODE (plus_arg0) == PLUS && CONST_INT_P (plus_arg1))
+ {
+ if ((addr_mask & RELOAD_REG_OFFSET) == 0)
+ {
+ extra_cost = 1;
+ type = "offset";
+ }
+ }
+
+ else if (!base_reg_operand (plus_arg0, GET_MODE (plus_arg0)))
+ {
+ fail_msg = "no base register #2";
+ extra_cost = -1;
+ }
+
+ else if (int_reg_operand (plus_arg1, GET_MODE (plus_arg1)))
+ {
+ if ((addr_mask & RELOAD_REG_INDEXED) == 0
+ || !legitimate_indexed_address_p (addr, false))
+ {
+ extra_cost = 1;
+ type = "indexed";
+ }
+ }
+
+ /* Make sure the register class can handle offset addresses. */
+ else if (rs6000_legitimate_offset_address_p (mode, addr, false, true))
+ {
+ if ((addr_mask & RELOAD_REG_OFFSET) == 0)
+ {
+ extra_cost = 1;
+ type = "offset";
+ }
+ }
+
+ else
+ {
+ fail_msg = "bad PLUS";
+ extra_cost = -1;
+ }
+
+ break;
+
+ case LO_SUM:
+ if (!legitimate_lo_sum_address_p (mode, addr, false))
+ {
+ fail_msg = "bad LO_SUM";
+ extra_cost = -1;
+ }
+
+ if ((addr_mask & RELOAD_REG_OFFSET) == 0)
+ {
+ extra_cost = 1;
+ type = "lo_sum";
+ }
+ break;
+
+ /* Static addresses need to create a TOC entry. */
+ case CONST:
+ case SYMBOL_REF:
+ case LABEL_REF:
+ type = "address";
+ extra_cost = rs6000_secondary_reload_toc_costs (addr_mask);
+ break;
+
+ /* TOC references look like offsettable memory. */
+ case UNSPEC:
+ if (TARGET_CMODEL == CMODEL_SMALL || XINT (addr, 1) != UNSPEC_TOCREL)
+ {
+ fail_msg = "bad UNSPEC";
+ extra_cost = -1;
+ }
+
+ else if ((addr_mask & RELOAD_REG_OFFSET) == 0)
+ {
+ extra_cost = 1;
+ type = "toc reference";
+ }
+ break;
+
+ default:
+ {
+ fail_msg = "bad address";
+ extra_cost = -1;
+ }
+ }
+
+ if (TARGET_DEBUG_ADDR /* && extra_cost != 0 */)
+ {
+ if (extra_cost < 0)
+ fprintf (stderr,
+ "rs6000_secondary_reload_memory error: mode = %s, "
+ "class = %s, addr_mask = '%s', %s\n",
+ GET_MODE_NAME (mode),
+ reg_class_names[rclass],
+ rs6000_debug_addr_mask (addr_mask, false),
+ (fail_msg != NULL) ? fail_msg : "<bad address>");
+
+ else
+ fprintf (stderr,
+ "rs6000_secondary_reload_memory: mode = %s, class = %s, "
+ "addr_mask = '%s', extra cost = %d, %s\n",
+ GET_MODE_NAME (mode),
+ reg_class_names[rclass],
+ rs6000_debug_addr_mask (addr_mask, false),
+ extra_cost,
+ (type) ? type : "<none>");
+
+ debug_rtx (addr);
+ }
+
+ return extra_cost;
+}
+
/* Helper function for rs6000_secondary_reload to return true if a move to a
different register class is really a simple move. */
@@ -16607,8 +16963,15 @@ rs6000_secondary_reload (bool in_p,
reg_class_t ret = ALL_REGS;
enum insn_code icode;
bool default_p = false;
+ bool done_p = false;
+
+ /* Allow subreg of memory before/during reload. */
+ bool memory_p = (MEM_P (x)
+ || (!reload_completed && GET_CODE (x) == SUBREG
+ && MEM_P (SUBREG_REG (x))));
sri->icode = CODE_FOR_nothing;
+ sri->extra_cost = 0;
icode = ((in_p)
? reg_addr[mode].reload_load
: reg_addr[mode].reload_store);
@@ -16632,121 +16995,54 @@ rs6000_secondary_reload (bool in_p,
{
icode = (enum insn_code)sri->icode;
default_p = false;
+ done_p = true;
ret = NO_REGS;
}
}
- /* Handle vector moves with reload helper functions. */
- if (ret == ALL_REGS && icode != CODE_FOR_nothing)
+ /* Make sure 0.0 is not reloaded or forced into memory. */
+ if (x == CONST0_RTX (mode) && VSX_REG_CLASS_P (rclass))
{
ret = NO_REGS;
- sri->icode = CODE_FOR_nothing;
- sri->extra_cost = 0;
+ default_p = false;
+ done_p = true;
+ }
- if (GET_CODE (x) == MEM)
- {
- rtx addr = XEXP (x, 0);
+ /* If this is a scalar floating point value and we want to load it into the
+ traditional Altivec registers, do it via a move through a traditional
+ floating point register. Also make sure that non-zero constants use a FPR. */
+ if (!done_p && reg_addr[mode].scalar_in_vmx_p
+ && (rclass == VSX_REGS || rclass == ALTIVEC_REGS)
+ && (memory_p || (GET_CODE (x) == CONST_DOUBLE)))
+ {
+ ret = FLOAT_REGS;
+ default_p = false;
+ done_p = true;
+ }
- /* Loads to and stores from gprs can do reg+offset, and wouldn't need
- an extra register in that case, but it would need an extra
- register if the addressing is reg+reg or (reg+reg)&(-16). Special
- case load/store quad. */
- if (rclass == GENERAL_REGS || rclass == BASE_REGS)
- {
- if (TARGET_POWERPC64 && TARGET_QUAD_MEMORY
- && GET_MODE_SIZE (mode) == 16
- && quad_memory_operand (x, mode))
- {
- sri->icode = icode;
- sri->extra_cost = 2;
- }
+ /* Handle reload of load/stores if we have reload helper functions. */
+ if (!done_p && icode != CODE_FOR_nothing && memory_p)
+ {
+ int extra_cost = rs6000_secondary_reload_memory (XEXP (x, 0), rclass,
+ mode);
- else if (!legitimate_indirect_address_p (addr, false)
- && !rs6000_legitimate_offset_address_p (PTImode, addr,
- false, true))
- {
- sri->icode = icode;
- /* account for splitting the loads, and converting the
- address from reg+reg to reg. */
- sri->extra_cost = (((TARGET_64BIT) ? 3 : 5)
- + ((GET_CODE (addr) == AND) ? 1 : 0));
- }
- }
- /* Allow scalar loads to/from the traditional floating point
- registers, even if VSX memory is set. */
- else if ((rclass == FLOAT_REGS || rclass == NO_REGS)
- && (GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)
- && (legitimate_indirect_address_p (addr, false)
- || legitimate_indirect_address_p (addr, false)
- || rs6000_legitimate_offset_address_p (mode, addr,
- false, true)))
-
- ;
- /* Loads to and stores from vector registers can only do reg+reg
- addressing. Altivec registers can also do (reg+reg)&(-16). Allow
- scalar modes loading up the traditional floating point registers
- to use offset addresses. */
- else if (rclass == VSX_REGS || rclass == ALTIVEC_REGS
- || rclass == FLOAT_REGS || rclass == NO_REGS)
- {
- if (!VECTOR_MEM_ALTIVEC_P (mode)
- && GET_CODE (addr) == AND
- && GET_CODE (XEXP (addr, 1)) == CONST_INT
- && INTVAL (XEXP (addr, 1)) == -16
- && (legitimate_indirect_address_p (XEXP (addr, 0), false)
- || legitimate_indexed_address_p (XEXP (addr, 0), false)))
- {
- sri->icode = icode;
- sri->extra_cost = ((GET_CODE (XEXP (addr, 0)) == PLUS)
- ? 2 : 1);
- }
- else if (!legitimate_indirect_address_p (addr, false)
- && (rclass == NO_REGS
- || !legitimate_indexed_address_p (addr, false)))
- {
- sri->icode = icode;
- sri->extra_cost = 1;
- }
- else
- icode = CODE_FOR_nothing;
- }
- /* Any other loads, including to pseudo registers which haven't been
- assigned to a register yet, default to require a scratch
- register. */
- else
- {
- sri->icode = icode;
- sri->extra_cost = 2;
- }
- }
- else if (REG_P (x))
+ if (extra_cost >= 0)
{
- int regno = true_regnum (x);
-
- icode = CODE_FOR_nothing;
- if (regno < 0 || regno >= FIRST_PSEUDO_REGISTER)
- default_p = true;
- else
+ done_p = true;
+ ret = NO_REGS;
+ if (extra_cost > 0)
{
- enum reg_class xclass = REGNO_REG_CLASS (regno);
- enum rs6000_reg_type rtype1 = reg_class_to_reg_type[(int)rclass];
- enum rs6000_reg_type rtype2 = reg_class_to_reg_type[(int)xclass];
-
- /* If memory is needed, use default_secondary_reload to create the
- stack slot. */
- if (rtype1 != rtype2 || !IS_STD_REG_TYPE (rtype1))
- default_p = true;
- else
- ret = NO_REGS;
+ sri->extra_cost = extra_cost;
+ sri->icode = icode;
}
}
- else
- default_p = true;
}
- else if (TARGET_POWERPC64
- && reg_class_to_reg_type[(int)rclass] == GPR_REG_TYPE
- && MEM_P (x)
- && GET_MODE_SIZE (GET_MODE (x)) >= UNITS_PER_WORD)
+
+ /* Handle unaligned loads and stores of integer registers. */
+ if (!done_p && TARGET_POWERPC64
+ && reg_class_to_reg_type[(int)rclass] == GPR_REG_TYPE
+ && memory_p
+ && GET_MODE_SIZE (GET_MODE (x)) >= UNITS_PER_WORD)
{
rtx addr = XEXP (x, 0);
rtx off = address_offset (addr);
@@ -16775,6 +17071,7 @@ rs6000_secondary_reload (bool in_p,
sri->icode = CODE_FOR_reload_di_store;
sri->extra_cost = 2;
ret = NO_REGS;
+ done_p = true;
}
else
default_p = true;
@@ -16782,10 +17079,11 @@ rs6000_secondary_reload (bool in_p,
else
default_p = true;
}
- else if (!TARGET_POWERPC64
- && reg_class_to_reg_type[(int)rclass] == GPR_REG_TYPE
- && MEM_P (x)
- && GET_MODE_SIZE (GET_MODE (x)) > UNITS_PER_WORD)
+
+ if (!done_p && !TARGET_POWERPC64
+ && reg_class_to_reg_type[(int)rclass] == GPR_REG_TYPE
+ && memory_p
+ && GET_MODE_SIZE (GET_MODE (x)) > UNITS_PER_WORD)
{
rtx addr = XEXP (x, 0);
rtx off = address_offset (addr);
@@ -16821,6 +17119,7 @@ rs6000_secondary_reload (bool in_p,
sri->icode = CODE_FOR_reload_si_store;
sri->extra_cost = 2;
ret = NO_REGS;
+ done_p = true;
}
else
default_p = true;
@@ -16828,7 +17127,8 @@ rs6000_secondary_reload (bool in_p,
else
default_p = true;
}
- else
+
+ if (!done_p)
default_p = true;
if (default_p)
@@ -16846,15 +17146,20 @@ rs6000_secondary_reload (bool in_p,
reg_class_names[rclass],
GET_MODE_NAME (mode));
+ if (reload_completed)
+ fputs (", after reload", stderr);
+
+ if (!done_p)
+ fputs (", done_p not set", stderr);
+
if (default_p)
- fprintf (stderr, ", default secondary reload");
+ fputs (", default secondary reload", stderr);
if (sri->icode != CODE_FOR_nothing)
- fprintf (stderr, ", reload func = %s, extra cost = %d\n",
+ fprintf (stderr, ", reload func = %s, extra cost = %d",
insn_data[sri->icode].name, sri->extra_cost);
- else
- fprintf (stderr, "\n");
+ fputs ("\n", stderr);
debug_rtx (x);
}
@@ -16883,6 +17188,9 @@ rs6000_secondary_reload_trace (int line, rtx reg, rtx mem, rtx scratch,
debug_rtx (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set, clobber)));
}
+static void rs6000_secondary_reload_fail (int, rtx, rtx, rtx, bool)
+ ATTRIBUTE_NORETURN;
+
static void
rs6000_secondary_reload_fail (int line, rtx reg, rtx mem, rtx scratch,
bool store_p)
@@ -16891,209 +17199,148 @@ rs6000_secondary_reload_fail (int line, rtx reg, rtx mem, rtx scratch,
gcc_unreachable ();
}
-/* Fixup reload addresses for Altivec or VSX loads/stores to change SP+offset
- to SP+reg addressing. */
+/* Fixup reload addresses for values in GPR, FPR, and VMX registers that have
+ reload helper functions. These were identified in
+ rs6000_secondary_reload_memory, and if reload decided to use the secondary
+ reload, it calls the insns:
+ reload_<RELOAD:mode>_<P:mptrsize>_store
+ reload_<RELOAD:mode>_<P:mptrsize>_load
+
+ which in turn calls this function, to do whatever is necessary to create
+ valid addresses. */
void
rs6000_secondary_reload_inner (rtx reg, rtx mem, rtx scratch, bool store_p)
{
int regno = true_regnum (reg);
machine_mode mode = GET_MODE (reg);
- enum reg_class rclass;
+ addr_mask_type addr_mask;
rtx addr;
- rtx and_op2 = NULL_RTX;
- rtx addr_op1;
- rtx addr_op2;
- rtx scratch_or_premodify = scratch;
- rtx and_rtx;
+ rtx new_addr;
+ rtx op_reg, op0, op1;
+ rtx and_op;
rtx cc_clobber;
+ rtvec rv;
- if (TARGET_DEBUG_ADDR)
- rs6000_secondary_reload_trace (__LINE__, reg, mem, scratch, store_p);
+ if (regno < 0 || regno >= FIRST_PSEUDO_REGISTER || !MEM_P (mem)
+ || !base_reg_operand (scratch, GET_MODE (scratch)))
+ rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
+
+ if (IN_RANGE (regno, FIRST_GPR_REGNO, LAST_GPR_REGNO))
+ addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_GPR];
+
+ else if (IN_RANGE (regno, FIRST_FPR_REGNO, LAST_FPR_REGNO))
+ addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_FPR];
- if (regno < 0 || regno >= FIRST_PSEUDO_REGISTER)
+ else if (IN_RANGE (regno, FIRST_ALTIVEC_REGNO, LAST_ALTIVEC_REGNO))
+ addr_mask = reg_addr[mode].addr_mask[RELOAD_REG_VMX];
+
+ else
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
- if (GET_CODE (mem) != MEM)
+ /* Make sure the mode is valid in this register class. */
+ if ((addr_mask & RELOAD_REG_VALID) == 0)
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
- rclass = REGNO_REG_CLASS (regno);
- addr = find_replacement (&XEXP (mem, 0));
+ if (TARGET_DEBUG_ADDR)
+ rs6000_secondary_reload_trace (__LINE__, reg, mem, scratch, store_p);
- switch (rclass)
+ new_addr = addr = XEXP (mem, 0);
+ switch (GET_CODE (addr))
{
- /* GPRs can handle reg + small constant, all other addresses need to use
- the scratch register. */
- case GENERAL_REGS:
- case BASE_REGS:
- if (GET_CODE (addr) == AND)
+ /* Does the register class support auto update forms for this mode? If
+ not, do the update now. We don't need a scratch register, since the
+ powerpc only supports PRE_INC, PRE_DEC, and PRE_MODIFY. */
+ case PRE_INC:
+ case PRE_DEC:
+ op_reg = XEXP (addr, 0);
+ if (!base_reg_operand (op_reg, Pmode))
+ rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
+
+ if ((addr_mask & RELOAD_REG_PRE_INCDEC) == 0)
{
- and_op2 = XEXP (addr, 1);
- addr = find_replacement (&XEXP (addr, 0));
+ emit_insn (gen_add2_insn (op_reg, GEN_INT (GET_MODE_SIZE (mode))));
+ new_addr = op_reg;
}
+ break;
- if (GET_CODE (addr) == PRE_MODIFY)
- {
- scratch_or_premodify = find_replacement (&XEXP (addr, 0));
- if (!REG_P (scratch_or_premodify))
- rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
+ case PRE_MODIFY:
+ op0 = XEXP (addr, 0);
+ op1 = XEXP (addr, 1);
+ if (!base_reg_operand (op0, Pmode)
+ || GET_CODE (op1) != PLUS
+ || !rtx_equal_p (op0, XEXP (op1, 0)))
+ rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
- addr = find_replacement (&XEXP (addr, 1));
- if (GET_CODE (addr) != PLUS)
- rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
+ if ((addr_mask & RELOAD_REG_PRE_MODIFY) == 0)
+ {
+ emit_insn (gen_rtx_SET (VOIDmode, op0, op1));
+ new_addr = reg;
}
+ break;
- if (GET_CODE (addr) == PLUS
- && (and_op2 != NULL_RTX
- || !rs6000_legitimate_offset_address_p (PTImode, addr,
- false, true)))
+ /* Do we need to simulate AND -16 to clear the bottom address bits used
+ in VMX load/stores? */
+ case AND:
+ op0 = XEXP (addr, 0);
+ op1 = XEXP (addr, 1);
+ if ((addr_mask & RELOAD_REG_AND_M16) == 0)
{
- /* find_replacement already recurses into both operands of
- PLUS so we don't need to call it here. */
- addr_op1 = XEXP (addr, 0);
- addr_op2 = XEXP (addr, 1);
- if (!legitimate_indirect_address_p (addr_op1, false))
- rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
+ if (REG_P (op0) || GET_CODE (op0) == SUBREG)
+ op_reg = op0;
- if (!REG_P (addr_op2)
- && (GET_CODE (addr_op2) != CONST_INT
- || !satisfies_constraint_I (addr_op2)))
+ else if (GET_CODE (op1) == PLUS)
{
- if (TARGET_DEBUG_ADDR)
- {
- fprintf (stderr,
- "\nMove plus addr to register %s, mode = %s: ",
- rs6000_reg_names[REGNO (scratch)],
- GET_MODE_NAME (mode));
- debug_rtx (addr_op2);
- }
- rs6000_emit_move (scratch, addr_op2, Pmode);
- addr_op2 = scratch;
+ emit_insn (gen_rtx_SET (VOIDmode, scratch, op1));
+ op_reg = scratch;
}
- emit_insn (gen_rtx_SET (VOIDmode,
- scratch_or_premodify,
- gen_rtx_PLUS (Pmode,
- addr_op1,
- addr_op2)));
+ else
+ rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
- addr = scratch_or_premodify;
- scratch_or_premodify = scratch;
- }
- else if (!legitimate_indirect_address_p (addr, false)
- && !rs6000_legitimate_offset_address_p (PTImode, addr,
- false, true))
- {
- if (TARGET_DEBUG_ADDR)
- {
- fprintf (stderr, "\nMove addr to register %s, mode = %s: ",
- rs6000_reg_names[REGNO (scratch_or_premodify)],
- GET_MODE_NAME (mode));
- debug_rtx (addr);
- }
- rs6000_emit_move (scratch_or_premodify, addr, Pmode);
- addr = scratch_or_premodify;
- scratch_or_premodify = scratch;
+ and_op = gen_rtx_AND (GET_MODE (scratch), op_reg, op1);
+ cc_clobber = gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (CCmode));
+ rv = gen_rtvec (2, gen_rtx_SET (VOIDmode, scratch, and_op), cc_clobber);
+ emit_insn (gen_rtx_PARALLEL (VOIDmode, rv));
+ new_addr = scratch;
}
break;
- /* Float registers can do offset+reg addressing for scalar types. */
- case FLOAT_REGS:
- if (legitimate_indirect_address_p (addr, false) /* reg */
- || legitimate_indexed_address_p (addr, false) /* reg+reg */
- || ((GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)
- && and_op2 == NULL_RTX
- && scratch_or_premodify == scratch
- && rs6000_legitimate_offset_address_p (mode, addr, false, false)))
- break;
-
- /* If this isn't a legacy floating point load/store, fall through to the
- VSX defaults. */
-
- /* VSX/Altivec registers can only handle reg+reg addressing. Move other
- addresses into a scratch register. */
- case VSX_REGS:
- case ALTIVEC_REGS:
-
- /* With float regs, we need to handle the AND ourselves, since we can't
- use the Altivec instruction with an implicit AND -16. Allow scalar
- loads to float registers to use reg+offset even if VSX. */
- if (GET_CODE (addr) == AND
- && (rclass != ALTIVEC_REGS || GET_MODE_SIZE (mode) != 16
- || GET_CODE (XEXP (addr, 1)) != CONST_INT
- || INTVAL (XEXP (addr, 1)) != -16
- || !VECTOR_MEM_ALTIVEC_P (mode)))
- {
- and_op2 = XEXP (addr, 1);
- addr = find_replacement (&XEXP (addr, 0));
- }
-
- /* If we aren't using a VSX load, save the PRE_MODIFY register and use it
- as the address later. */
- if (GET_CODE (addr) == PRE_MODIFY
- && ((ALTIVEC_OR_VSX_VECTOR_MODE (mode)
- && (rclass != FLOAT_REGS
- || (GET_MODE_SIZE (mode) != 4 && GET_MODE_SIZE (mode) != 8)))
- || and_op2 != NULL_RTX
- || !legitimate_indexed_address_p (XEXP (addr, 1), false)))
- {
- scratch_or_premodify = find_replacement (&XEXP (addr, 0));
- if (!legitimate_indirect_address_p (scratch_or_premodify, false))
- rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
- addr = find_replacement (&XEXP (addr, 1));
- if (GET_CODE (addr) != PLUS)
- rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
+ /* If this is an indirect address, make sure it is a base register. */
+ case REG:
+ case SUBREG:
+ if (!base_reg_operand (addr, GET_MODE (addr)))
+ {
+ emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
+ new_addr = scratch;
}
+ break;
- if (legitimate_indirect_address_p (addr, false) /* reg */
- || legitimate_indexed_address_p (addr, false) /* reg+reg */
- || (GET_CODE (addr) == AND /* Altivec memory */
- && rclass == ALTIVEC_REGS
- && GET_CODE (XEXP (addr, 1)) == CONST_INT
- && INTVAL (XEXP (addr, 1)) == -16
- && (legitimate_indirect_address_p (XEXP (addr, 0), false)
- || legitimate_indexed_address_p (XEXP (addr, 0), false))))
- ;
+ /* If this is an indexed address, make sure the register class can handle
+ indexed addresses for this mode. */
+ case PLUS:
+ op0 = XEXP (addr, 0);
+ op1 = XEXP (addr, 1);
+ if (!base_reg_operand (op0, Pmode))
+ rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
- else if (GET_CODE (addr) == PLUS)
+ else if (int_reg_operand (op1, Pmode))
{
- addr_op1 = XEXP (addr, 0);
- addr_op2 = XEXP (addr, 1);
- if (!REG_P (addr_op1))
- rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
- if (TARGET_DEBUG_ADDR)
+ if ((addr_mask & RELOAD_REG_INDEXED) == 0)
{
- fprintf (stderr, "\nMove plus addr to register %s, mode = %s: ",
- rs6000_reg_names[REGNO (scratch)], GET_MODE_NAME (mode));
- debug_rtx (addr_op2);
+ emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
+ new_addr = scratch;
}
- rs6000_emit_move (scratch, addr_op2, Pmode);
- emit_insn (gen_rtx_SET (VOIDmode,
- scratch_or_premodify,
- gen_rtx_PLUS (Pmode,
- addr_op1,
- scratch)));
- addr = scratch_or_premodify;
- scratch_or_premodify = scratch;
}
- else if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST
- || GET_CODE (addr) == CONST_INT || GET_CODE (addr) == LO_SUM
- || REG_P (addr))
+ /* Make sure the register class can handle offset addresses. */
+ else if (rs6000_legitimate_offset_address_p (mode, addr, false, true))
{
- if (TARGET_DEBUG_ADDR)
+ if ((addr_mask & RELOAD_REG_OFFSET) == 0)
{
- fprintf (stderr, "\nMove addr to register %s, mode = %s: ",
- rs6000_reg_names[REGNO (scratch_or_premodify)],
- GET_MODE_NAME (mode));
- debug_rtx (addr);
+ emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
+ new_addr = scratch;
}
-
- rs6000_emit_move (scratch_or_premodify, addr, Pmode);
- addr = scratch_or_premodify;
- scratch_or_premodify = scratch;
}
else
@@ -17101,55 +17348,56 @@ rs6000_secondary_reload_inner (rtx reg, rtx mem, rtx scratch, bool store_p)
break;
- default:
- rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
- }
-
- /* If the original address involved a pre-modify that we couldn't use the VSX
- memory instruction with update, and we haven't taken care of already,
- store the address in the pre-modify register and use that as the
- address. */
- if (scratch_or_premodify != scratch && scratch_or_premodify != addr)
- {
- emit_insn (gen_rtx_SET (VOIDmode, scratch_or_premodify, addr));
- addr = scratch_or_premodify;
- }
+ case LO_SUM:
+ op0 = XEXP (addr, 0);
+ op1 = XEXP (addr, 1);
+ if (!base_reg_operand (op0, Pmode))
+ rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
- /* If the original address involved an AND -16 and we couldn't use an ALTIVEC
- memory instruction, recreate the AND now, including the clobber which is
- generated by the general ANDSI3/ANDDI3 patterns for the
- andi. instruction. */
- if (and_op2 != NULL_RTX)
- {
- if (! legitimate_indirect_address_p (addr, false))
+ else if (int_reg_operand (op1, Pmode))
{
- emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
- addr = scratch;
+ if ((addr_mask & RELOAD_REG_INDEXED) == 0)
+ {
+ emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
+ new_addr = scratch;
+ }
}
- if (TARGET_DEBUG_ADDR)
+ /* Make sure the register class can handle offset addresses. */
+ else if (legitimate_lo_sum_address_p (mode, addr, false))
{
- fprintf (stderr, "\nAnd addr to register %s, mode = %s: ",
- rs6000_reg_names[REGNO (scratch)], GET_MODE_NAME (mode));
- debug_rtx (and_op2);
+ if ((addr_mask & RELOAD_REG_OFFSET) == 0)
+ {
+ emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
+ new_addr = scratch;
+ }
}
- and_rtx = gen_rtx_SET (VOIDmode,
- scratch,
- gen_rtx_AND (Pmode,
- addr,
- and_op2));
+ else
+ rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
+
+ break;
- cc_clobber = gen_rtx_CLOBBER (CCmode, gen_rtx_SCRATCH (CCmode));
- emit_insn (gen_rtx_PARALLEL (VOIDmode,
- gen_rtvec (2, and_rtx, cc_clobber)));
- addr = scratch;
+ case SYMBOL_REF:
+ case CONST:
+ case LABEL_REF:
+ if (TARGET_TOC)
+ emit_insn (gen_rtx_SET (VOIDmode, scratch,
+ create_TOC_reference (addr, scratch)));
+ else
+ rs6000_emit_move (scratch, addr, Pmode);
+
+ new_addr = scratch;
+ break;
+
+ default:
+ rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
}
/* Adjust the address if it changed. */
- if (addr != XEXP (mem, 0))
+ if (addr != new_addr)
{
- mem = replace_equiv_address_nv (mem, addr);
+ mem = replace_equiv_address_nv (mem, new_addr);
if (TARGET_DEBUG_ADDR)
fprintf (stderr, "\nrs6000_secondary_reload_inner, mem adjusted.\n");
}
@@ -17294,43 +17542,35 @@ static enum reg_class
rs6000_preferred_reload_class (rtx x, enum reg_class rclass)
{
machine_mode mode = GET_MODE (x);
+ bool is_constant = CONSTANT_P (x);
- if (TARGET_VSX && x == CONST0_RTX (mode) && VSX_REG_CLASS_P (rclass))
- return rclass;
-
- if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
- && (rclass == ALTIVEC_REGS || rclass == VSX_REGS)
- && easy_vector_constant (x, mode))
- return ALTIVEC_REGS;
-
- if ((CONSTANT_P (x) || GET_CODE (x) == PLUS))
+ /* Do VSX tests before handling traditional floating point registers. */
+ if (TARGET_VSX && VSX_REG_CLASS_P (rclass))
{
- if (reg_class_subset_p (GENERAL_REGS, rclass))
- return GENERAL_REGS;
- if (reg_class_subset_p (BASE_REGS, rclass))
- return BASE_REGS;
- return NO_REGS;
- }
+ if (is_constant)
+ {
+ /* Zero is always allowed in all VSX registers. */
+ if (x == CONST0_RTX (mode))
+ return rclass;
- if (GET_MODE_CLASS (mode) == MODE_INT && rclass == NON_SPECIAL_REGS)
- return GENERAL_REGS;
+ /* If this is a vector constant that can be formed with a few Altivec
+ instructions, we want altivec registers. */
+ if (GET_CODE (x) == CONST_VECTOR && easy_vector_constant (x, mode))
+ return ALTIVEC_REGS;
- /* For VSX, prefer the traditional registers for 64-bit values because we can
- use the non-VSX loads. Prefer the Altivec registers if Altivec is
- handling the vector operations (i.e. V16QI, V8HI, and V4SI), or if we
- prefer Altivec loads.. */
- if (rclass == VSX_REGS)
- {
- if (MEM_P (x) && reg_addr[mode].scalar_in_vmx_p)
- {
- rtx addr = XEXP (x, 0);
- if (rs6000_legitimate_offset_address_p (mode, addr, false, true)
- || legitimate_lo_sum_address_p (mode, addr, false))
- return FLOAT_REGS;
+ /* Force constant to memory. */
+ return NO_REGS;
}
- else if (GET_MODE_SIZE (mode) <= 8 && !reg_addr[mode].scalar_in_vmx_p)
+
+ /* If this is a scalar floating point value, prefer the traditional
+ floating point registers so that we can use D-form (register+offset)
+ addressing. */
+ if (GET_MODE_SIZE (mode) < 16)
return FLOAT_REGS;
+ /* Prefer the Altivec registers if Altivec is handling the vector
+ operations (i.e. V16QI, V8HI, and V4SI), or if we prefer Altivec
+ loads. */
if (VECTOR_UNIT_ALTIVEC_P (mode) || VECTOR_MEM_ALTIVEC_P (mode)
|| mode == V1TImode)
return ALTIVEC_REGS;
@@ -17338,6 +17578,18 @@ rs6000_preferred_reload_class (rtx x, enum reg_class rclass)
return rclass;
}
+ if (is_constant || GET_CODE (x) == PLUS)
+ {
+ if (reg_class_subset_p (GENERAL_REGS, rclass))
+ return GENERAL_REGS;
+ if (reg_class_subset_p (BASE_REGS, rclass))
+ return BASE_REGS;
+ return NO_REGS;
+ }
+
+ if (GET_MODE_CLASS (mode) == MODE_INT && rclass == NON_SPECIAL_REGS)
+ return GENERAL_REGS;
+
return rclass;
}
@@ -17457,30 +17709,34 @@ rs6000_secondary_reload_class (enum reg_class rclass, machine_mode mode,
else
regno = -1;
+ /* If we have VSX register moves, prefer moving scalar values between
+ Altivec registers and GPR by going via an FPR (and then via memory)
+ instead of reloading the secondary memory address for Altivec moves. */
+ if (TARGET_VSX
+ && GET_MODE_SIZE (mode) < 16
+ && (((rclass == GENERAL_REGS || rclass == BASE_REGS)
+ && (regno >= 0 && ALTIVEC_REGNO_P (regno)))
+ || ((rclass == VSX_REGS || rclass == ALTIVEC_REGS)
+ && (regno >= 0 && INT_REGNO_P (regno)))))
+ return FLOAT_REGS;
+
/* We can place anything into GENERAL_REGS and can put GENERAL_REGS
into anything. */
if (rclass == GENERAL_REGS || rclass == BASE_REGS
|| (regno >= 0 && INT_REGNO_P (regno)))
return NO_REGS;
+ /* Constants, memory, and VSX registers can go into VSX registers (both the
+ traditional floating point and the altivec registers). */
+ if (rclass == VSX_REGS
+ && (regno == -1 || VSX_REGNO_P (regno)))
+ return NO_REGS;
+
/* Constants, memory, and FP registers can go into FP registers. */
if ((regno == -1 || FP_REGNO_P (regno))
&& (rclass == FLOAT_REGS || rclass == NON_SPECIAL_REGS))
return (mode != SDmode || lra_in_progress) ? NO_REGS : GENERAL_REGS;
- /* Memory, and FP/altivec registers can go into fp/altivec registers under
- VSX. However, for scalar variables, use the traditional floating point
- registers so that we can use offset+register addressing. */
- if (TARGET_VSX
- && (regno == -1 || VSX_REGNO_P (regno))
- && VSX_REG_CLASS_P (rclass))
- {
- if (GET_MODE_SIZE (mode) < 16)
- return FLOAT_REGS;
-
- return NO_REGS;
- }
-
/* Memory, and AltiVec registers can go into AltiVec registers. */
if ((regno == -1 || ALTIVEC_REGNO_P (regno))
&& rclass == ALTIVEC_REGS)
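The early FLOAT_REGS return added to rs6000_secondary_reload_class can be read as a small predicate. A hedged C sketch, with stand-in booleans for the class and regno tests:

#include <stdbool.h>

/* Under VSX, a scalar (smaller than 16 bytes) move between a GPR class
   and an Altivec register -- in either direction -- is reloaded through
   an FPR instead of through secondary memory.  */
bool
reload_via_fpr_p (bool target_vsx, unsigned mode_size, bool gpr_class,
                  bool vsx_or_altivec_class, bool regno_is_altivec,
                  bool regno_is_gpr)
{
  if (!target_vsx || mode_size >= 16)
    return false;
  return (gpr_class && regno_is_altivec)            /* Altivec -> GPR */
         || (vsx_or_altivec_class && regno_is_gpr); /* GPR -> Altivec */
}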
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 4d58707..fe73acf 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7850,7 +7850,7 @@
(define_insn "mov<mode>_hardfloat"
[(set (match_operand:FMOVE32 0 "nonimmediate_operand" "=!r,!r,m,f,<f32_vsx>,<f32_vsx>,<f32_lr>,<f32_sm>,<f32_av>,Z,?<f32_dm>,?r,*c*l,!r,*h,!r,!r")
- (match_operand:FMOVE32 1 "input_operand" "r,m,r,f,<f32_vsx>,j,<f32_lm>,<f32_sr>,Z,<f32_av>,r,<f32_dm>,r, h, 0, G,Fn"))]
+ (match_operand:FMOVE32 1 "input_operand" "r,m,r,f,<f32_vsx>,j,<f32_lm>,<f32_sr>,Z,<f32_av>,r,<f32_dm>,r,h,0,G,Fn"))]
"(gpc_reg_operand (operands[0], <MODE>mode)
|| gpc_reg_operand (operands[1], <MODE>mode))
&& (TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT)"
@@ -8137,6 +8137,21 @@
{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
[(set_attr "length" "20,20,16")])
+;; If we are using -ffast-math, easy_fp_constant assumes all constants are
+;; 'easy' in order to allow for reciprocal estimation. Make sure the constant
+;; is in the constant pool before reload occurs. This simplifies accessing
+;; scalars in the traditional Altivec registers.
+
+(define_split
+ [(set (match_operand:SFDF 0 "register_operand" "")
+ (match_operand:SFDF 1 "memory_fp_constant" ""))]
+ "TARGET_<MODE>_FPR && flag_unsafe_math_optimizations
+ && !reload_in_progress && !reload_completed && !lra_in_progress"
+ [(set (match_dup 0) (match_dup 2))]
+{
+ operands[2] = validize_mem (force_const_mem (<MODE>mode, operands[1]));
+})
+
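For illustration only (this C snippet is not part of the patch): under -ffast-math a scalar constant like the one below is treated as 'easy' and would otherwise stay out of the pool until after the reciprocal passes; the define_split above forces it into the constant pool before reload so it can be loaded directly into an upper register.

/* Hypothetical example.  Compile with -O2 -ffast-math -mupper-regs-df.  */
double
scale (double x)
{
  return x * 1.8125;   /* lives in the constant pool after the split */
}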
(define_expand "extenddftf2"
[(set (match_operand:TF 0 "nonimmediate_operand" "")
(float_extend:TF (match_operand:DF 1 "input_operand" "")))]
@@ -9816,12 +9831,15 @@
;; sequences, using get_attr_length here will smash the operands
;; array. Neither is there an early_cobbler_p predicate.
;; Disallow subregs for E500 so we don't munge frob_di_df_2.
+;; Also, this optimization interferes with scalars going into
+;; Altivec registers (such reloads go through the FPRs).
(define_peephole2
[(set (match_operand:DF 0 "gpc_reg_operand" "")
(match_operand:DF 1 "any_operand" ""))
(set (match_operand:DF 2 "gpc_reg_operand" "")
(match_dup 0))]
"!(TARGET_E500_DOUBLE && GET_CODE (operands[2]) == SUBREG)
+ && !TARGET_UPPER_REGS_DF
&& peep2_reg_dead_p (2, operands[0])"
[(set (match_dup 2) (match_dup 1))])
@@ -9830,7 +9848,8 @@
(match_operand:SF 1 "any_operand" ""))
(set (match_operand:SF 2 "gpc_reg_operand" "")
(match_dup 0))]
- "peep2_reg_dead_p (2, operands[0])"
+ "!TARGET_UPPER_REGS_SF
+ && peep2_reg_dead_p (2, operands[0])"
[(set (match_dup 2) (match_dup 1))])
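A hedged restatement of what both peepholes do when the intermediate register dies, and why they now stay off under -mupper-regs-{sf,df}:

/* The peephole collapses a copy chain when reg0 is dead afterwards:

     reg0 = src;                 reg2 = src;
     reg2 = reg0;       =>       (reg0 eliminated)

   With scalars in Altivec registers, reloads through the FPRs depend on
   that intermediate copy, so the transformation is disabled.  */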
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 4d0d5e7..eb3e323 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -582,12 +582,16 @@ Target Report Var(rs6000_compat_align_parm) Init(0) Save
Generate aggregate parameter passing code with at most 64-bit alignment.
mupper-regs-df
-Target Undocumented Mask(UPPER_REGS_DF) Var(rs6000_isa_flags)
+Target Report Mask(UPPER_REGS_DF) Var(rs6000_isa_flags)
Allow double variables in upper registers with -mcpu=power7 or -mvsx
mupper-regs-sf
-Target Undocumented Mask(UPPER_REGS_SF) Var(rs6000_isa_flags)
-Allow float variables in upper registers with -mcpu=power8 or -mp8-vector
+Target Report Mask(UPPER_REGS_SF) Var(rs6000_isa_flags)
+Allow float variables in upper registers with -mcpu=power8 or -mpower8-vector
+
+mupper-regs
+Target Report Var(TARGET_UPPER_REGS) Init(-1) Save
+Allow float/double variables in upper registers if the cpu allows it
moptimize-swaps
Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
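The Init(-1) on the new master switch gives it three states: unset, explicitly off, explicitly on. A hedged C sketch (not the actual rs6000_option_override_internal code; the cpu_has_* flags are stand-ins) of resolving such a tri-state:

/* -1 = option not given, 0 = -mno-upper-regs, 1 = -mupper-regs.  */
void
resolve_upper_regs (int upper_regs, int cpu_has_vsx, int cpu_has_p8vector,
                    int *upper_regs_df, int *upper_regs_sf)
{
  if (upper_regs == 0)               /* -mno-upper-regs: both off.  */
    *upper_regs_df = *upper_regs_sf = 0;
  else if (upper_regs == 1)          /* -mupper-regs: what the cpu allows.  */
    {
      *upper_regs_df = cpu_has_vsx;        /* power7 and up, -mvsx.  */
      *upper_regs_sf = cpu_has_p8vector;   /* power8, -mpower8-vector.  */
    }
  /* upper_regs == -1: leave the per-cpu defaults untouched.  */
}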
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 93943cb..9846a73 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -940,7 +940,9 @@ See RS/6000 and PowerPC Options.
-mcrypto -mno-crypto -mdirect-move -mno-direct-move @gol
-mquad-memory -mno-quad-memory @gol
-mquad-memory-atomic -mno-quad-memory-atomic @gol
--mcompat-align-parm -mno-compat-align-parm}
+-mcompat-align-parm -mno-compat-align-parm @gol
+-mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol
+-mupper-regs -mno-upper-regs}
@emph{RX Options}
@gccoptlist{-m64bit-doubles -m32bit-doubles -fpu -nofpu@gol
@@ -19729,6 +19731,39 @@ Generate code that uses (does not use) the atomic quad word memory
instructions. The @option{-mquad-memory-atomic} option requires use of
64-bit mode.
+@item -mupper-regs-df
+@itemx -mno-upper-regs-df
+@opindex mupper-regs-df
+@opindex mno-upper-regs-df
+Generate code that uses (does not use) the scalar double precision
+instructions that target all 64 registers in the vector/scalar
+floating point register set that were added in version 2.06 of the
+PowerPC ISA. The @option{-mupper-regs-df} option is turned on by
+default if you use any of the @option{-mcpu=power7},
+@option{-mcpu=power8}, or @option{-mvsx} options.
+
+@item -mupper-regs-sf
+@itemx -mno-upper-regs-sf
+@opindex mupper-regs-sf
+@opindex mno-upper-regs-sf
+Generate code that uses (does not use) the scalar single precision
+instructions that target all 64 registers in the vector/scalar
+floating point register set that were added in version 2.07 of the
+PowerPC ISA. The @option{-mupper-regs-sf} option is turned on by
+default if you use either the @option{-mcpu=power8} or
+@option{-mpower8-vector} option.
+
+@item -mupper-regs
+@itemx -mno-upper-regs
+@opindex mupper-regs
+@opindex mno-upper-regs
+Generate code that uses (does not use) the scalar single and double
+precision instructions that target all 64 registers in the vector/scalar
+floating point register set, depending on the model of the machine.
+
+The @option{-mno-upper-regs} option turns off both the
+@option{-mupper-regs-sf} and @option{-mupper-regs-df} options.
+
@item -mfloat-gprs=@var{yes/single/double/no}
@itemx -mfloat-gprs
@opindex mfloat-gprs
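A hedged usage illustration of the new options (the file name and routine below are made up, and a function this small may never need the upper registers; the options matter under register pressure):

/* upper.c -- hypothetical.  Build with, e.g.:
     gcc -O2 -mcpu=power7 -mupper-regs-df -S upper.c
   Under pressure the allocator may place doubles in vs32..vs63
   instead of spilling them to the stack.  */
double
sum3 (double a, double b, double c)
{
  return a + b + c;
}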
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index dd13e34..e6f11b4 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,15 @@
+2014-11-17 Michael Meissner <meissner@linux.vnet.ibm.com>
+
+ * gcc.target/powerpc/p8vector-ldst.c: Rewrite to use 40 live
+ floating point variables instead of using asm to test allocating
+ values to the Altivec registers.
+
+ * gcc.target/powerpc/upper-regs-sf.c: New -mupper-regs-sf and
+ -mupper-regs-df tests.
+ * gcc.target/powerpc/upper-regs-df.c: Likewise.
+
+ * config/rs6000/predicates.md (memory_fp_constant): New predicate.
+
2014-11-17 Tom de Vries <tom@codesourcery.com>
* gcc.dg/pr43864-2.c: Add -ftree-tail-merge to dg-options.
diff --git a/gcc/testsuite/gcc.target/powerpc/p8vector-ldst.c b/gcc/testsuite/gcc.target/powerpc/p8vector-ldst.c
index fa25509..5da7388 100644
--- a/gcc/testsuite/gcc.target/powerpc/p8vector-ldst.c
+++ b/gcc/testsuite/gcc.target/powerpc/p8vector-ldst.c
@@ -1,43 +1,624 @@
-/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
/* { dg-require-effective-target powerpc_p8vector_ok } */
/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
/* { dg-options "-mcpu=power8 -O2 -mupper-regs-df -mupper-regs-sf" } */
-float load_sf (float *p)
+float
+load_store_sf (unsigned long num,
+ const float *from_ptr,
+ float *to_ptr,
+ const unsigned long *in_mask_ptr,
+ const unsigned long *out_mask_ptr)
{
- float f = *p;
- __asm__ ("# reg %x0" : "+v" (f));
- return f;
-}
+ float value00 = 0.0f;
+ float value01 = 0.0f;
+ float value02 = 0.0f;
+ float value03 = 0.0f;
+ float value04 = 0.0f;
+ float value05 = 0.0f;
+ float value06 = 0.0f;
+ float value07 = 0.0f;
+ float value08 = 0.0f;
+ float value09 = 0.0f;
+ float value10 = 0.0f;
+ float value11 = 0.0f;
+ float value12 = 0.0f;
+ float value13 = 0.0f;
+ float value14 = 0.0f;
+ float value15 = 0.0f;
+ float value16 = 0.0f;
+ float value17 = 0.0f;
+ float value18 = 0.0f;
+ float value19 = 0.0f;
+ float value20 = 0.0f;
+ float value21 = 0.0f;
+ float value22 = 0.0f;
+ float value23 = 0.0f;
+ float value24 = 0.0f;
+ float value25 = 0.0f;
+ float value26 = 0.0f;
+ float value27 = 0.0f;
+ float value28 = 0.0f;
+ float value29 = 0.0f;
+ float value30 = 0.0f;
+ float value31 = 0.0f;
+ float value32 = 0.0f;
+ float value33 = 0.0f;
+ float value34 = 0.0f;
+ float value35 = 0.0f;
+ float value36 = 0.0f;
+ float value37 = 0.0f;
+ float value38 = 0.0f;
+ float value39 = 0.0f;
+ unsigned long in_mask;
+ unsigned long out_mask;
+ unsigned long i;
-double load_df (double *p)
-{
- double d = *p;
- __asm__ ("# reg %x0" : "+v" (d));
- return d;
-}
+ for (i = 0; i < num; i++)
+ {
+ in_mask = *in_mask_ptr++;
+ if ((in_mask & (1L << 0)) != 0L)
+ value00 = *from_ptr++;
-double load_dfsf (float *p)
-{
- double d = (double) *p;
- __asm__ ("# reg %x0" : "+v" (d));
- return d;
-}
+ if ((in_mask & (1L << 1)) != 0L)
+ value01 = *from_ptr++;
-void store_sf (float *p, float f)
-{
- __asm__ ("# reg %x0" : "+v" (f));
- *p = f;
+ if ((in_mask & (1L << 2)) != 0L)
+ value02 = *from_ptr++;
+
+ if ((in_mask & (1L << 3)) != 0L)
+ value03 = *from_ptr++;
+
+ if ((in_mask & (1L << 4)) != 0L)
+ value04 = *from_ptr++;
+
+ if ((in_mask & (1L << 5)) != 0L)
+ value05 = *from_ptr++;
+
+ if ((in_mask & (1L << 6)) != 0L)
+ value06 = *from_ptr++;
+
+ if ((in_mask & (1L << 7)) != 0L)
+ value07 = *from_ptr++;
+
+ if ((in_mask & (1L << 8)) != 0L)
+ value08 = *from_ptr++;
+
+ if ((in_mask & (1L << 9)) != 0L)
+ value09 = *from_ptr++;
+
+ if ((in_mask & (1L << 10)) != 0L)
+ value10 = *from_ptr++;
+
+ if ((in_mask & (1L << 11)) != 0L)
+ value11 = *from_ptr++;
+
+ if ((in_mask & (1L << 12)) != 0L)
+ value12 = *from_ptr++;
+
+ if ((in_mask & (1L << 13)) != 0L)
+ value13 = *from_ptr++;
+
+ if ((in_mask & (1L << 14)) != 0L)
+ value14 = *from_ptr++;
+
+ if ((in_mask & (1L << 15)) != 0L)
+ value15 = *from_ptr++;
+
+ if ((in_mask & (1L << 16)) != 0L)
+ value16 = *from_ptr++;
+
+ if ((in_mask & (1L << 17)) != 0L)
+ value17 = *from_ptr++;
+
+ if ((in_mask & (1L << 18)) != 0L)
+ value18 = *from_ptr++;
+
+ if ((in_mask & (1L << 19)) != 0L)
+ value19 = *from_ptr++;
+
+ if ((in_mask & (1L << 20)) != 0L)
+ value20 = *from_ptr++;
+
+ if ((in_mask & (1L << 21)) != 0L)
+ value21 = *from_ptr++;
+
+ if ((in_mask & (1L << 22)) != 0L)
+ value22 = *from_ptr++;
+
+ if ((in_mask & (1L << 23)) != 0L)
+ value23 = *from_ptr++;
+
+ if ((in_mask & (1L << 24)) != 0L)
+ value24 = *from_ptr++;
+
+ if ((in_mask & (1L << 25)) != 0L)
+ value25 = *from_ptr++;
+
+ if ((in_mask & (1L << 26)) != 0L)
+ value26 = *from_ptr++;
+
+ if ((in_mask & (1L << 27)) != 0L)
+ value27 = *from_ptr++;
+
+ if ((in_mask & (1L << 28)) != 0L)
+ value28 = *from_ptr++;
+
+ if ((in_mask & (1L << 29)) != 0L)
+ value29 = *from_ptr++;
+
+ if ((in_mask & (1L << 30)) != 0L)
+ value30 = *from_ptr++;
+
+ if ((in_mask & (1L << 31)) != 0L)
+ value31 = *from_ptr++;
+
+ if ((in_mask & (1L << 32)) != 0L)
+ value32 = *from_ptr++;
+
+ if ((in_mask & (1L << 33)) != 0L)
+ value33 = *from_ptr++;
+
+ if ((in_mask & (1L << 34)) != 0L)
+ value34 = *from_ptr++;
+
+ if ((in_mask & (1L << 35)) != 0L)
+ value35 = *from_ptr++;
+
+ if ((in_mask & (1L << 36)) != 0L)
+ value36 = *from_ptr++;
+
+ if ((in_mask & (1L << 37)) != 0L)
+ value37 = *from_ptr++;
+
+ if ((in_mask & (1L << 38)) != 0L)
+ value38 = *from_ptr++;
+
+ if ((in_mask & (1L << 39)) != 0L)
+ value39 = *from_ptr++;
+
+ out_mask = *out_mask_ptr++;
+ if ((out_mask & (1L << 0)) != 0L)
+ *to_ptr++ = value00;
+
+ if ((out_mask & (1L << 1)) != 0L)
+ *to_ptr++ = value01;
+
+ if ((out_mask & (1L << 2)) != 0L)
+ *to_ptr++ = value02;
+
+ if ((out_mask & (1L << 3)) != 0L)
+ *to_ptr++ = value03;
+
+ if ((out_mask & (1L << 4)) != 0L)
+ *to_ptr++ = value04;
+
+ if ((out_mask & (1L << 5)) != 0L)
+ *to_ptr++ = value05;
+
+ if ((out_mask & (1L << 6)) != 0L)
+ *to_ptr++ = value06;
+
+ if ((out_mask & (1L << 7)) != 0L)
+ *to_ptr++ = value07;
+
+ if ((out_mask & (1L << 8)) != 0L)
+ *to_ptr++ = value08;
+
+ if ((out_mask & (1L << 9)) != 0L)
+ *to_ptr++ = value09;
+
+ if ((out_mask & (1L << 10)) != 0L)
+ *to_ptr++ = value10;
+
+ if ((out_mask & (1L << 11)) != 0L)
+ *to_ptr++ = value11;
+
+ if ((out_mask & (1L << 12)) != 0L)
+ *to_ptr++ = value12;
+
+ if ((out_mask & (1L << 13)) != 0L)
+ *to_ptr++ = value13;
+
+ if ((out_mask & (1L << 14)) != 0L)
+ *to_ptr++ = value14;
+
+ if ((out_mask & (1L << 15)) != 0L)
+ *to_ptr++ = value15;
+
+ if ((out_mask & (1L << 16)) != 0L)
+ *to_ptr++ = value16;
+
+ if ((out_mask & (1L << 17)) != 0L)
+ *to_ptr++ = value17;
+
+ if ((out_mask & (1L << 18)) != 0L)
+ *to_ptr++ = value18;
+
+ if ((out_mask & (1L << 19)) != 0L)
+ *to_ptr++ = value19;
+
+ if ((out_mask & (1L << 20)) != 0L)
+ *to_ptr++ = value20;
+
+ if ((out_mask & (1L << 21)) != 0L)
+ *to_ptr++ = value21;
+
+ if ((out_mask & (1L << 22)) != 0L)
+ *to_ptr++ = value22;
+
+ if ((out_mask & (1L << 23)) != 0L)
+ *to_ptr++ = value23;
+
+ if ((out_mask & (1L << 24)) != 0L)
+ *to_ptr++ = value24;
+
+ if ((out_mask & (1L << 25)) != 0L)
+ *to_ptr++ = value25;
+
+ if ((out_mask & (1L << 26)) != 0L)
+ *to_ptr++ = value26;
+
+ if ((out_mask & (1L << 27)) != 0L)
+ *to_ptr++ = value27;
+
+ if ((out_mask & (1L << 28)) != 0L)
+ *to_ptr++ = value28;
+
+ if ((out_mask & (1L << 29)) != 0L)
+ *to_ptr++ = value29;
+
+ if ((out_mask & (1L << 30)) != 0L)
+ *to_ptr++ = value30;
+
+ if ((out_mask & (1L << 31)) != 0L)
+ *to_ptr++ = value31;
+
+ if ((out_mask & (1L << 32)) != 0L)
+ *to_ptr++ = value32;
+
+ if ((out_mask & (1L << 33)) != 0L)
+ *to_ptr++ = value33;
+
+ if ((out_mask & (1L << 34)) != 0L)
+ *to_ptr++ = value34;
+
+ if ((out_mask & (1L << 35)) != 0L)
+ *to_ptr++ = value35;
+
+ if ((out_mask & (1L << 36)) != 0L)
+ *to_ptr++ = value36;
+
+ if ((out_mask & (1L << 37)) != 0L)
+ *to_ptr++ = value37;
+
+ if ((out_mask & (1L << 38)) != 0L)
+ *to_ptr++ = value38;
+
+ if ((out_mask & (1L << 39)) != 0L)
+ *to_ptr++ = value39;
+ }
+
+ return ( value00 + value01 + value02 + value03 + value04
+ + value05 + value06 + value07 + value08 + value09
+ + value10 + value11 + value12 + value13 + value14
+ + value15 + value16 + value17 + value18 + value19
+ + value20 + value21 + value22 + value23 + value24
+ + value25 + value26 + value27 + value28 + value29
+ + value30 + value31 + value32 + value33 + value34
+ + value35 + value36 + value37 + value38 + value39);
}
-void store_df (double *p, double d)
+double
+load_store_df (unsigned long num,
+ const double *from_ptr,
+ double *to_ptr,
+ const unsigned long *in_mask_ptr,
+ const unsigned long *out_mask_ptr)
{
- __asm__ ("# reg %x0" : "+v" (d));
- *p = d;
+ double value00 = 0.0;
+ double value01 = 0.0;
+ double value02 = 0.0;
+ double value03 = 0.0;
+ double value04 = 0.0;
+ double value05 = 0.0;
+ double value06 = 0.0;
+ double value07 = 0.0;
+ double value08 = 0.0;
+ double value09 = 0.0;
+ double value10 = 0.0;
+ double value11 = 0.0;
+ double value12 = 0.0;
+ double value13 = 0.0;
+ double value14 = 0.0;
+ double value15 = 0.0;
+ double value16 = 0.0;
+ double value17 = 0.0;
+ double value18 = 0.0;
+ double value19 = 0.0;
+ double value20 = 0.0;
+ double value21 = 0.0;
+ double value22 = 0.0;
+ double value23 = 0.0;
+ double value24 = 0.0;
+ double value25 = 0.0;
+ double value26 = 0.0;
+ double value27 = 0.0;
+ double value28 = 0.0;
+ double value29 = 0.0;
+ double value30 = 0.0;
+ double value31 = 0.0;
+ double value32 = 0.0;
+ double value33 = 0.0;
+ double value34 = 0.0;
+ double value35 = 0.0;
+ double value36 = 0.0;
+ double value37 = 0.0;
+ double value38 = 0.0;
+ double value39 = 0.0;
+ unsigned long in_mask;
+ unsigned long out_mask;
+ unsigned long i;
+
+ for (i = 0; i < num; i++)
+ {
+ in_mask = *in_mask_ptr++;
+ if ((in_mask & (1L << 0)) != 0L)
+ value00 = *from_ptr++;
+
+ if ((in_mask & (1L << 1)) != 0L)
+ value01 = *from_ptr++;
+
+ if ((in_mask & (1L << 2)) != 0L)
+ value02 = *from_ptr++;
+
+ if ((in_mask & (1L << 3)) != 0L)
+ value03 = *from_ptr++;
+
+ if ((in_mask & (1L << 4)) != 0L)
+ value04 = *from_ptr++;
+
+ if ((in_mask & (1L << 5)) != 0L)
+ value05 = *from_ptr++;
+
+ if ((in_mask & (1L << 6)) != 0L)
+ value06 = *from_ptr++;
+
+ if ((in_mask & (1L << 7)) != 0L)
+ value07 = *from_ptr++;
+
+ if ((in_mask & (1L << 8)) != 0L)
+ value08 = *from_ptr++;
+
+ if ((in_mask & (1L << 9)) != 0L)
+ value09 = *from_ptr++;
+
+ if ((in_mask & (1L << 10)) != 0L)
+ value10 = *from_ptr++;
+
+ if ((in_mask & (1L << 11)) != 0L)
+ value11 = *from_ptr++;
+
+ if ((in_mask & (1L << 12)) != 0L)
+ value12 = *from_ptr++;
+
+ if ((in_mask & (1L << 13)) != 0L)
+ value13 = *from_ptr++;
+
+ if ((in_mask & (1L << 14)) != 0L)
+ value14 = *from_ptr++;
+
+ if ((in_mask & (1L << 15)) != 0L)
+ value15 = *from_ptr++;
+
+ if ((in_mask & (1L << 16)) != 0L)
+ value16 = *from_ptr++;
+
+ if ((in_mask & (1L << 17)) != 0L)
+ value17 = *from_ptr++;
+
+ if ((in_mask & (1L << 18)) != 0L)
+ value18 = *from_ptr++;
+
+ if ((in_mask & (1L << 19)) != 0L)
+ value19 = *from_ptr++;
+
+ if ((in_mask & (1L << 20)) != 0L)
+ value20 = *from_ptr++;
+
+ if ((in_mask & (1L << 21)) != 0L)
+ value21 = *from_ptr++;
+
+ if ((in_mask & (1L << 22)) != 0L)
+ value22 = *from_ptr++;
+
+ if ((in_mask & (1L << 23)) != 0L)
+ value23 = *from_ptr++;
+
+ if ((in_mask & (1L << 24)) != 0L)
+ value24 = *from_ptr++;
+
+ if ((in_mask & (1L << 25)) != 0L)
+ value25 = *from_ptr++;
+
+ if ((in_mask & (1L << 26)) != 0L)
+ value26 = *from_ptr++;
+
+ if ((in_mask & (1L << 27)) != 0L)
+ value27 = *from_ptr++;
+
+ if ((in_mask & (1L << 28)) != 0L)
+ value28 = *from_ptr++;
+
+ if ((in_mask & (1L << 29)) != 0L)
+ value29 = *from_ptr++;
+
+ if ((in_mask & (1L << 30)) != 0L)
+ value30 = *from_ptr++;
+
+ if ((in_mask & (1L << 31)) != 0L)
+ value31 = *from_ptr++;
+
+ if ((in_mask & (1L << 32)) != 0L)
+ value32 = *from_ptr++;
+
+ if ((in_mask & (1L << 33)) != 0L)
+ value33 = *from_ptr++;
+
+ if ((in_mask & (1L << 34)) != 0L)
+ value34 = *from_ptr++;
+
+ if ((in_mask & (1L << 35)) != 0L)
+ value35 = *from_ptr++;
+
+ if ((in_mask & (1L << 36)) != 0L)
+ value36 = *from_ptr++;
+
+ if ((in_mask & (1L << 37)) != 0L)
+ value37 = *from_ptr++;
+
+ if ((in_mask & (1L << 38)) != 0L)
+ value38 = *from_ptr++;
+
+ if ((in_mask & (1L << 39)) != 0L)
+ value39 = *from_ptr++;
+
+ out_mask = *out_mask_ptr++;
+ if ((out_mask & (1L << 0)) != 0L)
+ *to_ptr++ = value00;
+
+ if ((out_mask & (1L << 1)) != 0L)
+ *to_ptr++ = value01;
+
+ if ((out_mask & (1L << 2)) != 0L)
+ *to_ptr++ = value02;
+
+ if ((out_mask & (1L << 3)) != 0L)
+ *to_ptr++ = value03;
+
+ if ((out_mask & (1L << 4)) != 0L)
+ *to_ptr++ = value04;
+
+ if ((out_mask & (1L << 5)) != 0L)
+ *to_ptr++ = value05;
+
+ if ((out_mask & (1L << 6)) != 0L)
+ *to_ptr++ = value06;
+
+ if ((out_mask & (1L << 7)) != 0L)
+ *to_ptr++ = value07;
+
+ if ((out_mask & (1L << 8)) != 0L)
+ *to_ptr++ = value08;
+
+ if ((out_mask & (1L << 9)) != 0L)
+ *to_ptr++ = value09;
+
+ if ((out_mask & (1L << 10)) != 0L)
+ *to_ptr++ = value10;
+
+ if ((out_mask & (1L << 11)) != 0L)
+ *to_ptr++ = value11;
+
+ if ((out_mask & (1L << 12)) != 0L)
+ *to_ptr++ = value12;
+
+ if ((out_mask & (1L << 13)) != 0L)
+ *to_ptr++ = value13;
+
+ if ((out_mask & (1L << 14)) != 0L)
+ *to_ptr++ = value14;
+
+ if ((out_mask & (1L << 15)) != 0L)
+ *to_ptr++ = value15;
+
+ if ((out_mask & (1L << 16)) != 0L)
+ *to_ptr++ = value16;
+
+ if ((out_mask & (1L << 17)) != 0L)
+ *to_ptr++ = value17;
+
+ if ((out_mask & (1L << 18)) != 0L)
+ *to_ptr++ = value18;
+
+ if ((out_mask & (1L << 19)) != 0L)
+ *to_ptr++ = value19;
+
+ if ((out_mask & (1L << 20)) != 0L)
+ *to_ptr++ = value20;
+
+ if ((out_mask & (1L << 21)) != 0L)
+ *to_ptr++ = value21;
+
+ if ((out_mask & (1L << 22)) != 0L)
+ *to_ptr++ = value22;
+
+ if ((out_mask & (1L << 23)) != 0L)
+ *to_ptr++ = value23;
+
+ if ((out_mask & (1L << 24)) != 0L)
+ *to_ptr++ = value24;
+
+ if ((out_mask & (1L << 25)) != 0L)
+ *to_ptr++ = value25;
+
+ if ((out_mask & (1L << 26)) != 0L)
+ *to_ptr++ = value26;
+
+ if ((out_mask & (1L << 27)) != 0L)
+ *to_ptr++ = value27;
+
+ if ((out_mask & (1L << 28)) != 0L)
+ *to_ptr++ = value28;
+
+ if ((out_mask & (1L << 29)) != 0L)
+ *to_ptr++ = value29;
+
+ if ((out_mask & (1L << 30)) != 0L)
+ *to_ptr++ = value30;
+
+ if ((out_mask & (1L << 31)) != 0L)
+ *to_ptr++ = value31;
+
+ if ((out_mask & (1L << 32)) != 0L)
+ *to_ptr++ = value32;
+
+ if ((out_mask & (1L << 33)) != 0L)
+ *to_ptr++ = value33;
+
+ if ((out_mask & (1L << 34)) != 0L)
+ *to_ptr++ = value34;
+
+ if ((out_mask & (1L << 35)) != 0L)
+ *to_ptr++ = value35;
+
+ if ((out_mask & (1L << 36)) != 0L)
+ *to_ptr++ = value36;
+
+ if ((out_mask & (1L << 37)) != 0L)
+ *to_ptr++ = value37;
+
+ if ((out_mask & (1L << 38)) != 0L)
+ *to_ptr++ = value38;
+
+ if ((out_mask & (1L << 39)) != 0L)
+ *to_ptr++ = value39;
+ }
+
+ return ( value00 + value01 + value02 + value03 + value04
+ + value05 + value06 + value07 + value08 + value09
+ + value10 + value11 + value12 + value13 + value14
+ + value15 + value16 + value17 + value18 + value19
+ + value20 + value21 + value22 + value23 + value24
+ + value25 + value26 + value27 + value28 + value29
+ + value30 + value31 + value32 + value33 + value34
+ + value35 + value36 + value37 + value38 + value39);
}
/* { dg-final { scan-assembler "lxsspx" } } */
/* { dg-final { scan-assembler "lxsdx" } } */
/* { dg-final { scan-assembler "stxsspx" } } */
/* { dg-final { scan-assembler "stxsdx" } } */
+/* { dg-final { scan-assembler "xsaddsp" } } */
+/* { dg-final { scan-assembler "xsadddp" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/upper-regs-df.c b/gcc/testsuite/gcc.target/powerpc/upper-regs-df.c
new file mode 100644
index 0000000..e3a284c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/upper-regs-df.c
@@ -0,0 +1,726 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power7 -O2 -mupper-regs-df" } */
+
+/* Test for the -mupper-regs-df option to make sure double values are allocated
+ to the Altivec registers as well as the traditional FPR registers. */
+
+#ifndef TYPE
+#define TYPE double
+#endif
+
+#ifndef MASK_TYPE
+#define MASK_TYPE unsigned long long
+#endif
+
+#define MASK_ONE ((MASK_TYPE)1)
+#define ZERO ((TYPE) 0.0)
+
+TYPE
+test_add (const MASK_TYPE *add_mask, const TYPE *add_values,
+ const MASK_TYPE *sub_mask, const TYPE *sub_values,
+ const MASK_TYPE *mul_mask, const TYPE *mul_values,
+ const MASK_TYPE *div_mask, const TYPE *div_values,
+ const MASK_TYPE *eq0_mask, int *eq0_ptr)
+{
+ TYPE value;
+ TYPE value00 = ZERO;
+ TYPE value01 = ZERO;
+ TYPE value02 = ZERO;
+ TYPE value03 = ZERO;
+ TYPE value04 = ZERO;
+ TYPE value05 = ZERO;
+ TYPE value06 = ZERO;
+ TYPE value07 = ZERO;
+ TYPE value08 = ZERO;
+ TYPE value09 = ZERO;
+ TYPE value10 = ZERO;
+ TYPE value11 = ZERO;
+ TYPE value12 = ZERO;
+ TYPE value13 = ZERO;
+ TYPE value14 = ZERO;
+ TYPE value15 = ZERO;
+ TYPE value16 = ZERO;
+ TYPE value17 = ZERO;
+ TYPE value18 = ZERO;
+ TYPE value19 = ZERO;
+ TYPE value20 = ZERO;
+ TYPE value21 = ZERO;
+ TYPE value22 = ZERO;
+ TYPE value23 = ZERO;
+ TYPE value24 = ZERO;
+ TYPE value25 = ZERO;
+ TYPE value26 = ZERO;
+ TYPE value27 = ZERO;
+ TYPE value28 = ZERO;
+ TYPE value29 = ZERO;
+ TYPE value30 = ZERO;
+ TYPE value31 = ZERO;
+ TYPE value32 = ZERO;
+ TYPE value33 = ZERO;
+ TYPE value34 = ZERO;
+ TYPE value35 = ZERO;
+ TYPE value36 = ZERO;
+ TYPE value37 = ZERO;
+ TYPE value38 = ZERO;
+ TYPE value39 = ZERO;
+ MASK_TYPE mask;
+ int eq0;
+
+ while ((mask = *add_mask++) != 0)
+ {
+ value = *add_values++;
+
+ __asm__ (" #reg %0" : "+d" (value));
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ value00 += value;
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ value01 += value;
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ value02 += value;
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ value03 += value;
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ value04 += value;
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ value05 += value;
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ value06 += value;
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ value07 += value;
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ value08 += value;
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ value09 += value;
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ value10 += value;
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ value11 += value;
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ value12 += value;
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ value13 += value;
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ value14 += value;
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ value15 += value;
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ value16 += value;
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ value17 += value;
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ value18 += value;
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ value19 += value;
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ value20 += value;
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ value21 += value;
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ value22 += value;
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ value23 += value;
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ value24 += value;
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ value25 += value;
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ value26 += value;
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ value27 += value;
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ value28 += value;
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ value29 += value;
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ value30 += value;
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ value31 += value;
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ value32 += value;
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ value33 += value;
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ value34 += value;
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ value35 += value;
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ value36 += value;
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ value37 += value;
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ value38 += value;
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ value39 += value;
+ }
+
+ while ((mask = *sub_mask++) != 0)
+ {
+ value = *sub_values++;
+
+ __asm__ (" #reg %0" : "+d" (value));
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ value00 -= value;
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ value01 -= value;
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ value02 -= value;
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ value03 -= value;
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ value04 -= value;
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ value05 -= value;
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ value06 -= value;
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ value07 -= value;
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ value08 -= value;
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ value09 -= value;
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ value10 -= value;
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ value11 -= value;
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ value12 -= value;
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ value13 -= value;
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ value14 -= value;
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ value15 -= value;
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ value16 -= value;
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ value17 -= value;
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ value18 -= value;
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ value19 -= value;
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ value20 -= value;
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ value21 -= value;
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ value22 -= value;
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ value23 -= value;
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ value24 -= value;
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ value25 -= value;
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ value26 -= value;
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ value27 -= value;
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ value28 -= value;
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ value29 -= value;
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ value30 -= value;
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ value31 -= value;
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ value32 -= value;
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ value33 -= value;
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ value34 -= value;
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ value35 -= value;
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ value36 -= value;
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ value37 -= value;
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ value38 -= value;
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ value39 -= value;
+ }
+
+ while ((mask = *mul_mask++) != 0)
+ {
+ value = *mul_values++;
+
+ __asm__ (" #reg %0" : "+d" (value));
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ value00 *= value;
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ value01 *= value;
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ value02 *= value;
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ value03 *= value;
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ value04 *= value;
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ value05 *= value;
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ value06 *= value;
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ value07 *= value;
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ value08 *= value;
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ value09 *= value;
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ value10 *= value;
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ value11 *= value;
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ value12 *= value;
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ value13 *= value;
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ value14 *= value;
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ value15 *= value;
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ value16 *= value;
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ value17 *= value;
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ value18 *= value;
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ value19 *= value;
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ value20 *= value;
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ value21 *= value;
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ value22 *= value;
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ value23 *= value;
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ value24 *= value;
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ value25 *= value;
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ value26 *= value;
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ value27 *= value;
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ value28 *= value;
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ value29 *= value;
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ value30 *= value;
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ value31 *= value;
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ value32 *= value;
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ value33 *= value;
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ value34 *= value;
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ value35 *= value;
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ value36 *= value;
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ value37 *= value;
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ value38 *= value;
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ value39 *= value;
+ }
+
+ while ((mask = *div_mask++) != 0)
+ {
+ value = *div_values++;
+
+ __asm__ (" #reg %0" : "+d" (value));
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ value00 /= value;
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ value01 /= value;
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ value02 /= value;
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ value03 /= value;
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ value04 /= value;
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ value05 /= value;
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ value06 /= value;
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ value07 /= value;
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ value08 /= value;
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ value09 /= value;
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ value10 /= value;
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ value11 /= value;
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ value12 /= value;
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ value13 /= value;
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ value14 /= value;
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ value15 /= value;
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ value16 /= value;
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ value17 /= value;
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ value18 /= value;
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ value19 /= value;
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ value20 /= value;
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ value21 /= value;
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ value22 /= value;
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ value23 /= value;
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ value24 /= value;
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ value25 /= value;
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ value26 /= value;
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ value27 /= value;
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ value28 /= value;
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ value29 /= value;
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ value30 /= value;
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ value31 /= value;
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ value32 /= value;
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ value33 /= value;
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ value34 /= value;
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ value35 /= value;
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ value36 /= value;
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ value37 /= value;
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ value38 /= value;
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ value39 /= value;
+ }
+
+ while ((mask = *eq0_mask++) != 0)
+ {
+ eq0 = 0;
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ eq0 |= (value00 == ZERO);
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ eq0 |= (value01 == ZERO);
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ eq0 |= (value02 == ZERO);
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ eq0 |= (value03 == ZERO);
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ eq0 |= (value04 == ZERO);
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ eq0 |= (value05 == ZERO);
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ eq0 |= (value06 == ZERO);
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ eq0 |= (value07 == ZERO);
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ eq0 |= (value08 == ZERO);
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ eq0 |= (value09 == ZERO);
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ eq0 |= (value10 == ZERO);
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ eq0 |= (value11 == ZERO);
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ eq0 |= (value12 == ZERO);
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ eq0 |= (value13 == ZERO);
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ eq0 |= (value14 == ZERO);
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ eq0 |= (value15 == ZERO);
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ eq0 |= (value16 == ZERO);
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ eq0 |= (value17 == ZERO);
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ eq0 |= (value18 == ZERO);
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ eq0 |= (value19 == ZERO);
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ eq0 |= (value20 == ZERO);
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ eq0 |= (value21 == ZERO);
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ eq0 |= (value22 == ZERO);
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ eq0 |= (value23 == ZERO);
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ eq0 |= (value24 == ZERO);
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ eq0 |= (value25 == ZERO);
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ eq0 |= (value26 == ZERO);
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ eq0 |= (value27 == ZERO);
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ eq0 |= (value28 == ZERO);
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ eq0 |= (value29 == ZERO);
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ eq0 |= (value30 == ZERO);
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ eq0 |= (value31 == ZERO);
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ eq0 |= (value32 == ZERO);
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ eq0 |= (value33 == ZERO);
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ eq0 |= (value34 == ZERO);
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ eq0 |= (value35 == ZERO);
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ eq0 |= (value36 == ZERO);
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ eq0 |= (value37 == ZERO);
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ eq0 |= (value38 == ZERO);
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ eq0 |= (value39 == ZERO);
+
+ *eq0_ptr++ = eq0;
+ }
+
+ return ( value00 + value01 + value02 + value03 + value04
+ + value05 + value06 + value07 + value08 + value09
+ + value10 + value11 + value12 + value13 + value14
+ + value15 + value16 + value17 + value18 + value19
+ + value20 + value21 + value22 + value23 + value24
+ + value25 + value26 + value27 + value28 + value29
+ + value30 + value31 + value32 + value33 + value34
+ + value35 + value36 + value37 + value38 + value39);
+}
+
+/* { dg-final { scan-assembler "fadd" } } */
+/* { dg-final { scan-assembler "fsub" } } */
+/* { dg-final { scan-assembler "fmul" } } */
+/* { dg-final { scan-assembler "fdiv" } } */
+/* { dg-final { scan-assembler "fcmpu" } } */
+/* { dg-final { scan-assembler "xsadddp" } } */
+/* { dg-final { scan-assembler "xssubdp" } } */
+/* { dg-final { scan-assembler "xsmuldp" } } */
+/* { dg-final { scan-assembler "xsdivdp" } } */
+/* { dg-final { scan-assembler "xscmpudp" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/upper-regs-sf.c b/gcc/testsuite/gcc.target/powerpc/upper-regs-sf.c
new file mode 100644
index 0000000..401b5c1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/upper-regs-sf.c
@@ -0,0 +1,726 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O2 -mupper-regs-df -mupper-regs-sf" } */
+
+/* Test for the -mupper-regs-sf option to make sure float values are allocated
+ to the Altivec registers as well as the traditional FPR registers. */
+
+#ifndef TYPE
+#define TYPE float
+#endif
+
+#ifndef MASK_TYPE
+#define MASK_TYPE unsigned long long
+#endif
+
+#define MASK_ONE ((MASK_TYPE)1)
+#define ZERO ((TYPE) 0.0)
+
+TYPE
+test_add (const MASK_TYPE *add_mask, const TYPE *add_values,
+ const MASK_TYPE *sub_mask, const TYPE *sub_values,
+ const MASK_TYPE *mul_mask, const TYPE *mul_values,
+ const MASK_TYPE *div_mask, const TYPE *div_values,
+ const MASK_TYPE *eq0_mask, int *eq0_ptr)
+{
+ TYPE value;
+ TYPE value00 = ZERO;
+ TYPE value01 = ZERO;
+ TYPE value02 = ZERO;
+ TYPE value03 = ZERO;
+ TYPE value04 = ZERO;
+ TYPE value05 = ZERO;
+ TYPE value06 = ZERO;
+ TYPE value07 = ZERO;
+ TYPE value08 = ZERO;
+ TYPE value09 = ZERO;
+ TYPE value10 = ZERO;
+ TYPE value11 = ZERO;
+ TYPE value12 = ZERO;
+ TYPE value13 = ZERO;
+ TYPE value14 = ZERO;
+ TYPE value15 = ZERO;
+ TYPE value16 = ZERO;
+ TYPE value17 = ZERO;
+ TYPE value18 = ZERO;
+ TYPE value19 = ZERO;
+ TYPE value20 = ZERO;
+ TYPE value21 = ZERO;
+ TYPE value22 = ZERO;
+ TYPE value23 = ZERO;
+ TYPE value24 = ZERO;
+ TYPE value25 = ZERO;
+ TYPE value26 = ZERO;
+ TYPE value27 = ZERO;
+ TYPE value28 = ZERO;
+ TYPE value29 = ZERO;
+ TYPE value30 = ZERO;
+ TYPE value31 = ZERO;
+ TYPE value32 = ZERO;
+ TYPE value33 = ZERO;
+ TYPE value34 = ZERO;
+ TYPE value35 = ZERO;
+ TYPE value36 = ZERO;
+ TYPE value37 = ZERO;
+ TYPE value38 = ZERO;
+ TYPE value39 = ZERO;
+ MASK_TYPE mask;
+ int eq0;
+
+ while ((mask = *add_mask++) != 0)
+ {
+ value = *add_values++;
+
+ __asm__ (" #reg %0" : "+d" (value));
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ value00 += value;
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ value01 += value;
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ value02 += value;
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ value03 += value;
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ value04 += value;
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ value05 += value;
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ value06 += value;
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ value07 += value;
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ value08 += value;
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ value09 += value;
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ value10 += value;
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ value11 += value;
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ value12 += value;
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ value13 += value;
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ value14 += value;
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ value15 += value;
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ value16 += value;
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ value17 += value;
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ value18 += value;
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ value19 += value;
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ value20 += value;
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ value21 += value;
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ value22 += value;
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ value23 += value;
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ value24 += value;
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ value25 += value;
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ value26 += value;
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ value27 += value;
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ value28 += value;
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ value29 += value;
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ value30 += value;
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ value31 += value;
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ value32 += value;
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ value33 += value;
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ value34 += value;
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ value35 += value;
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ value36 += value;
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ value37 += value;
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ value38 += value;
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ value39 += value;
+ }
+
+ while ((mask = *sub_mask++) != 0)
+ {
+ value = *sub_values++;
+
+ __asm__ (" #reg %0" : "+d" (value));
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ value00 -= value;
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ value01 -= value;
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ value02 -= value;
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ value03 -= value;
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ value04 -= value;
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ value05 -= value;
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ value06 -= value;
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ value07 -= value;
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ value08 -= value;
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ value09 -= value;
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ value10 -= value;
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ value11 -= value;
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ value12 -= value;
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ value13 -= value;
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ value14 -= value;
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ value15 -= value;
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ value16 -= value;
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ value17 -= value;
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ value18 -= value;
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ value19 -= value;
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ value20 -= value;
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ value21 -= value;
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ value22 -= value;
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ value23 -= value;
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ value24 -= value;
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ value25 -= value;
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ value26 -= value;
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ value27 -= value;
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ value28 -= value;
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ value29 -= value;
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ value30 -= value;
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ value31 -= value;
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ value32 -= value;
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ value33 -= value;
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ value34 -= value;
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ value35 -= value;
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ value36 -= value;
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ value37 -= value;
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ value38 -= value;
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ value39 -= value;
+ }
+
+ while ((mask = *mul_mask++) != 0)
+ {
+ value = *mul_values++;
+
+ __asm__ (" #reg %0" : "+d" (value));
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ value00 *= value;
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ value01 *= value;
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ value02 *= value;
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ value03 *= value;
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ value04 *= value;
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ value05 *= value;
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ value06 *= value;
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ value07 *= value;
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ value08 *= value;
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ value09 *= value;
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ value10 *= value;
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ value11 *= value;
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ value12 *= value;
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ value13 *= value;
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ value14 *= value;
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ value15 *= value;
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ value16 *= value;
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ value17 *= value;
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ value18 *= value;
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ value19 *= value;
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ value20 *= value;
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ value21 *= value;
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ value22 *= value;
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ value23 *= value;
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ value24 *= value;
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ value25 *= value;
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ value26 *= value;
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ value27 *= value;
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ value28 *= value;
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ value29 *= value;
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ value30 *= value;
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ value31 *= value;
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ value32 *= value;
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ value33 *= value;
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ value34 *= value;
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ value35 *= value;
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ value36 *= value;
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ value37 *= value;
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ value38 *= value;
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ value39 *= value;
+ }
+
+ while ((mask = *div_mask++) != 0)
+ {
+ value = *div_values++;
+
+ __asm__ (" #reg %0" : "+d" (value));
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ value00 /= value;
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ value01 /= value;
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ value02 /= value;
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ value03 /= value;
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ value04 /= value;
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ value05 /= value;
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ value06 /= value;
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ value07 /= value;
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ value08 /= value;
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ value09 /= value;
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ value10 /= value;
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ value11 /= value;
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ value12 /= value;
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ value13 /= value;
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ value14 /= value;
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ value15 /= value;
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ value16 /= value;
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ value17 /= value;
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ value18 /= value;
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ value19 /= value;
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ value20 /= value;
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ value21 /= value;
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ value22 /= value;
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ value23 /= value;
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ value24 /= value;
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ value25 /= value;
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ value26 /= value;
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ value27 /= value;
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ value28 /= value;
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ value29 /= value;
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ value30 /= value;
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ value31 /= value;
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ value32 /= value;
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ value33 /= value;
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ value34 /= value;
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ value35 /= value;
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ value36 /= value;
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ value37 /= value;
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ value38 /= value;
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ value39 /= value;
+ }
+
+ while ((mask = *eq0_mask++) != 0)
+ {
+ eq0 = 0;
+
+ if ((mask & (MASK_ONE << 0)) != 0)
+ eq0 |= (value00 == ZERO);
+
+ if ((mask & (MASK_ONE << 1)) != 0)
+ eq0 |= (value01 == ZERO);
+
+ if ((mask & (MASK_ONE << 2)) != 0)
+ eq0 |= (value02 == ZERO);
+
+ if ((mask & (MASK_ONE << 3)) != 0)
+ eq0 |= (value03 == ZERO);
+
+ if ((mask & (MASK_ONE << 4)) != 0)
+ eq0 |= (value04 == ZERO);
+
+ if ((mask & (MASK_ONE << 5)) != 0)
+ eq0 |= (value05 == ZERO);
+
+ if ((mask & (MASK_ONE << 6)) != 0)
+ eq0 |= (value06 == ZERO);
+
+ if ((mask & (MASK_ONE << 7)) != 0)
+ eq0 |= (value07 == ZERO);
+
+ if ((mask & (MASK_ONE << 8)) != 0)
+ eq0 |= (value08 == ZERO);
+
+ if ((mask & (MASK_ONE << 9)) != 0)
+ eq0 |= (value09 == ZERO);
+
+ if ((mask & (MASK_ONE << 10)) != 0)
+ eq0 |= (value10 == ZERO);
+
+ if ((mask & (MASK_ONE << 11)) != 0)
+ eq0 |= (value11 == ZERO);
+
+ if ((mask & (MASK_ONE << 12)) != 0)
+ eq0 |= (value12 == ZERO);
+
+ if ((mask & (MASK_ONE << 13)) != 0)
+ eq0 |= (value13 == ZERO);
+
+ if ((mask & (MASK_ONE << 14)) != 0)
+ eq0 |= (value14 == ZERO);
+
+ if ((mask & (MASK_ONE << 15)) != 0)
+ eq0 |= (value15 == ZERO);
+
+ if ((mask & (MASK_ONE << 16)) != 0)
+ eq0 |= (value16 == ZERO);
+
+ if ((mask & (MASK_ONE << 17)) != 0)
+ eq0 |= (value17 == ZERO);
+
+ if ((mask & (MASK_ONE << 18)) != 0)
+ eq0 |= (value18 == ZERO);
+
+ if ((mask & (MASK_ONE << 19)) != 0)
+ eq0 |= (value19 == ZERO);
+
+ if ((mask & (MASK_ONE << 20)) != 0)
+ eq0 |= (value20 == ZERO);
+
+ if ((mask & (MASK_ONE << 21)) != 0)
+ eq0 |= (value21 == ZERO);
+
+ if ((mask & (MASK_ONE << 22)) != 0)
+ eq0 |= (value22 == ZERO);
+
+ if ((mask & (MASK_ONE << 23)) != 0)
+ eq0 |= (value23 == ZERO);
+
+ if ((mask & (MASK_ONE << 24)) != 0)
+ eq0 |= (value24 == ZERO);
+
+ if ((mask & (MASK_ONE << 25)) != 0)
+ eq0 |= (value25 == ZERO);
+
+ if ((mask & (MASK_ONE << 26)) != 0)
+ eq0 |= (value26 == ZERO);
+
+ if ((mask & (MASK_ONE << 27)) != 0)
+ eq0 |= (value27 == ZERO);
+
+ if ((mask & (MASK_ONE << 28)) != 0)
+ eq0 |= (value28 == ZERO);
+
+ if ((mask & (MASK_ONE << 29)) != 0)
+ eq0 |= (value29 == ZERO);
+
+ if ((mask & (MASK_ONE << 30)) != 0)
+ eq0 |= (value30 == ZERO);
+
+ if ((mask & (MASK_ONE << 31)) != 0)
+ eq0 |= (value31 == ZERO);
+
+ if ((mask & (MASK_ONE << 32)) != 0)
+ eq0 |= (value32 == ZERO);
+
+ if ((mask & (MASK_ONE << 33)) != 0)
+ eq0 |= (value33 == ZERO);
+
+ if ((mask & (MASK_ONE << 34)) != 0)
+ eq0 |= (value34 == ZERO);
+
+ if ((mask & (MASK_ONE << 35)) != 0)
+ eq0 |= (value35 == ZERO);
+
+ if ((mask & (MASK_ONE << 36)) != 0)
+ eq0 |= (value36 == ZERO);
+
+ if ((mask & (MASK_ONE << 37)) != 0)
+ eq0 |= (value37 == ZERO);
+
+ if ((mask & (MASK_ONE << 38)) != 0)
+ eq0 |= (value38 == ZERO);
+
+ if ((mask & (MASK_ONE << 39)) != 0)
+ eq0 |= (value39 == ZERO);
+
+ *eq0_ptr++ = eq0;
+ }
+
+ return ( value00 + value01 + value02 + value03 + value04
+ + value05 + value06 + value07 + value08 + value09
+ + value10 + value11 + value12 + value13 + value14
+ + value15 + value16 + value17 + value18 + value19
+ + value20 + value21 + value22 + value23 + value24
+ + value25 + value26 + value27 + value28 + value29
+ + value30 + value31 + value32 + value33 + value34
+ + value35 + value36 + value37 + value38 + value39);
+}
+
+/* { dg-final { scan-assembler "fadds" } } */
+/* { dg-final { scan-assembler "fsubs" } } */
+/* { dg-final { scan-assembler "fmuls" } } */
+/* { dg-final { scan-assembler "fdivs" } } */
+/* { dg-final { scan-assembler "fcmpu" } } */
+/* { dg-final { scan-assembler "xsaddsp" } } */
+/* { dg-final { scan-assembler "xssubsp" } } */
+/* { dg-final { scan-assembler "xsmulsp" } } */
+/* { dg-final { scan-assembler "xsdivsp" } } */
+/* { dg-final { scan-assembler "xscmpudp" } } */