author     Andrey Belevantsev <abel@ispras.ru>        2008-10-14 16:52:19 +0400
committer  Andrey Belevantsev <abel@gcc.gnu.org>      2008-10-14 16:52:19 +0400
commit     388092d5b419ad5573c6dcd1c6478691e8417087 (patch)
tree       a119d58d157037691fbfb1607a5ac9c87bc3ca7d
parent     847e63465e85f06389850301ba7632d4f0af6870 (diff)

target.h (struct gcc_target): Update prototypes of needs_block_p and gen_spec_check.
2008-10-14  Andrey Belevantsev  <abel@ispras.ru>
            Dmitry Melnik  <dm@ispras.ru>
            Dmitry Zhurikhin  <zhur@ispras.ru>
            Alexander Monakov  <amonakov@ispras.ru>
            Maxim Kuvyrkov  <maxim@codesourcery.com>

        * target.h (struct gcc_target): Update prototypes of needs_block_p
        and gen_spec_check.
        * haifa-sched.c (create_check_block_twin): Update calls to the above.
        * sel-sched.c (create_speculation_check): Likewise.
        * doc/tm.texi: Provide documentation for new target hooks.

        * config/ia64/ia64.c: Include sel-sched.h. Rewrite speculation hooks.
        (ia64_gen_spec_insn): Removed.
        (get_spec_check_gen_function, insn_can_be_in_speculative_p,
        ia64_gen_spec_check): New static functions.
        (ia64_alloc_sched_context, ia64_init_sched_context,
        ia64_set_sched_context, ia64_clear_sched_context,
        ia64_free_sched_context, ia64_get_insn_spec_ds,
        ia64_get_insn_checked_ds, ia64_skip_rtx_p): Declare functions.
        (ia64_needs_block_p): Change prototype.
        (ia64_gen_check): Rename to ia64_gen_spec_check.
        (ia64_adjust_cost): Rename to ia64_adjust_cost_2. Add new parameter
        into declaration, add special memory dependencies handling.
        (TARGET_SCHED_ALLOC_SCHED_CONTEXT, TARGET_SCHED_INIT_SCHED_CONTEXT,
        TARGET_SCHED_SET_SCHED_CONTEXT, TARGET_SCHED_CLEAR_SCHED_CONTEXT,
        TARGET_SCHED_FREE_SCHED_CONTEXT, TARGET_SCHED_GET_INSN_SPEC_DS,
        TARGET_SCHED_GET_INSN_CHECKED_DS, TARGET_SCHED_SKIP_RTX_P):
        Define new target hooks.
        (TARGET_SCHED_GEN_CHECK): Rename to TARGET_SCHED_GEN_SPEC_CHECK.
        (ia64_optimization_options): Turn on selective scheduling with -O3,
        disable -fauto-inc-dec. Set mflag_sched_control_spec to true by default
        with selective scheduling.
        (ia64_override_options): Initialize align_loops and align_functions
        to 32 and 64, respectively. Set global selective scheduling flags
        according to target-dependent flags.
        (rtx_needs_barrier): Support UNSPEC_LDS_A.
        (group_barrier_needed): Use new mstop-bit-before-check flag.
        Add heuristic.
        (dfa_state_size): Make global.
        (spec_check_no, max_uid): Remove.
        (mem_ops_in_group, current_cycle): New variables.
        (ia64_sched_init): Disable checks for !SCHED_GROUP_P after reload.
        Initialize new variables.
        (is_load_p, record_memory_reference): New functions.
        (ia64_dfa_sched_reorder): Lower priority of loads when limit is
        reached.
        (ia64_variable_issue): Change use of current_sched_info to
        sched_deps_info. Update comment. Note if a load or a store is issued.
        (ia64_first_cycle_multipass_dfa_lookahead_guard_spec): Require a cycle
        advance if maximal number of loads or stores was issued on current
        cycle.
        (scheduled_good_insn): New static helper function.
        (ia64_dfa_new_cycle): Assert that last_scheduled_insn is set when
        a group barrier is needed. Fix vertical spacing. Guard the code
        doing state transition with last_scheduled_insn check.
        Mark that a stop bit should be before current insn if there was a
        cycle advance. Update current_cycle and mem_ops_in_group.
        (ia64_h_i_d_extended): Change use of current_sched_info to
        sched_deps_info. Reallocate stops_p by larger chunks.
        (struct _ia64_sched_context): New structure.
        (ia64_sched_context_t): New typedef.
        (ia64_alloc_sched_context, ia64_init_sched_context,
        ia64_set_sched_context, ia64_clear_sched_context,
        ia64_free_sched_context): New static functions.
        (gen_func_t): New typedef.
        (get_spec_load_gen_function): New function.
        (SPEC_GEN_EXTEND_OFFSET): Declare.
        (ia64_set_sched_flags): Check common_sched_info instead of *flags.
        (get_mode_no_for_insn): Change the condition that prevents use of
        special hardware registers so it can now handle pseudos.
        (get_spec_unspec_code): New function.
        (ia64_skip_rtx_p, get_insn_spec_code, ia64_get_insn_spec_ds,
        ia64_get_insn_checked_ds, ia64_gen_spec_load): New static functions.
        (ia64_speculate_insn, ia64_needs_block_p): Support branchy checks
        during selective scheduling.
        (ia64_speculate_insn): Use ds_get_speculation_types when
        determining whether we need to change the pattern.
        (SPEC_GEN_LD_MAP, SPEC_GEN_CHECK_OFFSET): Declare.
        (ia64_spec_check_src_p): Support new speculation/check codes.
        (struct bundle_state): New field.
        (issue_nops_and_insn): Initialize it.
        (insert_bundle_state): Minimize mid-bundle stop bits.
        (important_for_bundling_p): New function.
        (get_next_important_insn): Use important_for_bundling_p.
        (bundling): When shifting TImode from unimportant insns, ignore
        also group barriers. Assert that best state is found before
        the backward bundling pass. Print number of mid-bundle stop bits.
        Minimize mid-bundle stop bits. Check correct calculation of
        mid-bundle stop bits.
        (ia64_sched_finish, final_emit_insn_group_barriers): Fix formatting.
        (final_emit_insn_group_barriers): Emit stop bits before insns starting
        a new cycle.
        (sel2_run): New variable.
        (ia64_reorg): When flag_selective_scheduling2 is set, run the selective
        scheduling pass instead of schedule_ebbs.

        * config/ia64/ia64.md (speculable1, speculable2): New attributes.
        (UNSPEC_LDS_A): New UNSPEC.
        (movqi_internal, movhi_internal, movsi_internal, movdi_internal,
        movti_internal, movsf_internal, movdf_internal,
        movxf_internal): Make visible. Add speculable* attributes.
        (output_c_nc): New mode attribute.
        (mov<mode>_speculative_a, zero_extend<mode>di2_speculative_a,
        mov<mode>_nc, zero_extend<mode>di2_nc,
        advanced_load_check_nc_<mode>): New insns.
        (zero_extend*): Add speculable* attributes.

        * config/ia64/ia64.opt (msched_fp_mem_deps_zero_cost): New option.
        (msched-stop-bits-after-every-cycle): Likewise.
        (msched-max-memory-insns,
        msched-max-memory-insns-hard-limit): Likewise.
        (msched-spec-verbose): Remove.
        (msched-prefer-non-data-spec-insns,
        msched-prefer-non-control-spec-insns,
        msched-count-spec-in-critical-path,
        msel-sched-dont-check-control-spec): Use Target Report Var
        instead of Common Report Var.

        * config/ia64/itanium2.md: Remove incorrect bypass.

        * config/ia64/t-ia64 (ia64.o): Add dependency on sel-sched.h.

Co-Authored-By: Alexander Monakov <amonakov@ispras.ru>
Co-Authored-By: Dmitry Melnik <dm@ispras.ru>
Co-Authored-By: Dmitry Zhurikhin <zhur@ispras.ru>
Co-Authored-By: Maxim Kuvyrkov <maxim@codesourcery.com>

From-SVN: r141108
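The ia64.c changes above cap the number of memory operations issued per cycle with a four-slot rotating counter (mem_ops_in_group, record_memory_reference): loads are charged to the current cycle, while stores are charged three cycles ahead to model Itanium 2's delayed cache access, and ia64_dfa_sched_reorder demotes further loads once the current slot hits the limit. A minimal standalone C sketch of that bookkeeping; the helper names and the main() driver are illustrative only and not part of the patch:

/* Rotating window of memory-op counts for the current cycle and the
   next three cycles, mirroring the patch's mem_ops_in_group[4].  */
#include <stdio.h>

static char mem_ops_in_group[4];
static int current_cycle;

/* Charge a load to the current cycle; charge a store three cycles
   ahead, following the delayed cache access the patch describes.  */
static void
record_mem_op (int is_store)
{
  mem_ops_in_group[(current_cycle + (is_store ? 3 : 0)) % 4]++;
}

/* Nonzero when the current cycle already holds LIMIT memory ops,
   i.e. further loads should be postponed (cf. ia64_dfa_sched_reorder).  */
static int
mem_limit_reached (int limit)
{
  return mem_ops_in_group[current_cycle % 4] >= limit;
}

int
main (void)
{
  record_mem_op (0);                /* load, counted on cycle 0 */
  record_mem_op (1);                /* store, counted on cycle 3 */
  printf ("cycle 0 full: %d\n", mem_limit_reached (1));
  current_cycle = 3;                /* scheduler advances to cycle 3 */
  printf ("cycle 3 full: %d\n", mem_limit_reached (1));
  return 0;
}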
-rw-r--r--  gcc/ChangeLog                  122
-rw-r--r--  gcc/config/ia64/ia64.c        1299
-rw-r--r--  gcc/config/ia64/ia64.md        150
-rw-r--r--  gcc/config/ia64/ia64.opt        40
-rw-r--r--  gcc/config/ia64/itanium2.md      5
-rw-r--r--  gcc/config/ia64/t-ia64           2
-rw-r--r--  gcc/doc/tm.texi                 23
-rw-r--r--  gcc/haifa-sched.c                4
-rw-r--r--  gcc/sel-sched.c                  2
-rw-r--r--  gcc/target.h                     4
10 files changed, 1214 insertions, 437 deletions
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e246d8d..1fa7640 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,125 @@
+2008-10-14 Andrey Belevantsev <abel@ispras.ru>
+ Dmitry Melnik <dm@ispras.ru>
+ Dmitry Zhurikhin <zhur@ispras.ru>
+ Alexander Monakov <amonakov@ispras.ru>
+ Maxim Kuvyrkov <maxim@codesourcery.com>
+
+ * target.h (struct gcc_target): Update prototypes of needs_block_p
+ and gen_spec_check.
+ * haifa-sched.c (create_check_block_twin): Update calls to the above.
+ * sel-sched.c (create_speculation_check): Likewise.
+ * doc/tm.texi: Provide documentation for new target hooks.
+
+ * config/ia64/ia64.c: Include sel-sched.h. Rewrite speculation hooks.
+ (ia64_gen_spec_insn): Removed.
+ (get_spec_check_gen_function, insn_can_be_in_speculative_p,
+ ia64_gen_spec_check): New static functions.
+ (ia64_alloc_sched_context, ia64_init_sched_context,
+ ia64_set_sched_context, ia64_clear_sched_context,
+ ia64_free_sched_context, ia64_get_insn_spec_ds,
+ ia64_get_insn_checked_ds, ia64_skip_rtx_p): Declare functions.
+ (ia64_needs_block_p): Change prototype.
+ (ia64_gen_check): Rename to ia64_gen_spec_check.
+ (ia64_adjust_cost): Rename to ia64_adjust_cost_2. Add new parameter
+ into declaration, add special memory dependencies handling.
+ (TARGET_SCHED_ALLOC_SCHED_CONTEXT, TARGET_SCHED_INIT_SCHED_CONTEXT,
+ TARGET_SCHED_SET_SCHED_CONTEXT, TARGET_SCHED_CLEAR_SCHED_CONTEXT,
+ TARGET_SCHED_FREE_SCHED_CONTEXT, TARGET_SCHED_GET_INSN_SPEC_DS,
+ TARGET_SCHED_GET_INSN_CHECKED_DS, TARGET_SCHED_SKIP_RTX_P):
+ Define new target hooks.
+ (TARGET_SCHED_GEN_CHECK): Rename to TARGET_SCHED_GEN_SPEC_CHECK.
+ (ia64_optimization_options): Turn on selective scheduling with -O3,
+ disable -fauto-inc-dec. Set mflag_sched_control_spec to true by default
+ with selective scheduling.
+ (ia64_override_options): Initialize align_loops and align_functions
+ to 32 and 64, respectively. Set global selective scheduling flags
+ according to target-dependent flags.
+ (rtx_needs_barrier): Support UNSPEC_LDS_A.
+ (group_barrier_needed): Use new mstop-bit-before-check flag.
+ Add heuristic.
+ (dfa_state_size): Make global.
+ (spec_check_no, max_uid): Remove.
+ (mem_ops_in_group, current_cycle): New variables.
+ (ia64_sched_init): Disable checks for !SCHED_GROUP_P after reload.
+ Initialize new variables.
+ (is_load_p, record_memory_reference): New functions.
+ (ia64_dfa_sched_reorder): Lower priority of loads when limit is
+ reached.
+ (ia64_variable_issue): Change use of current_sched_info to
+ sched_deps_info. Update comment. Note if a load or a store is issued.
+ (ia64_first_cycle_multipass_dfa_lookahead_guard_spec): Require a cycle
+ advance if maximal number of loads or stores was issued on current
+ cycle.
+ (scheduled_good_insn): New static helper function.
+ (ia64_dfa_new_cycle): Assert that last_scheduled_insn is set when
+ a group barrier is needed. Fix vertical spacing. Guard the code
+ doing state transition with last_scheduled_insn check.
+ Mark that a stop bit should be before current insn if there was a
+ cycle advance. Update current_cycle and mem_ops_in_group.
+ (ia64_h_i_d_extended): Change use of current_sched_info to
+ sched_deps_info. Reallocate stops_p by larger chunks.
+ (struct _ia64_sched_context): New structure.
+ (ia64_sched_context_t): New typedef.
+ (ia64_alloc_sched_context, ia64_init_sched_context,
+ ia64_set_sched_context, ia64_clear_sched_context,
+ ia64_free_sched_context): New static functions.
+ (gen_func_t): New typedef.
+ (get_spec_load_gen_function): New function.
+ (SPEC_GEN_EXTEND_OFFSET): Declare.
+ (ia64_set_sched_flags): Check common_sched_info instead of *flags.
+ (get_mode_no_for_insn): Change the condition that prevents use of
+ special hardware registers so it can now handle pseudos.
+ (get_spec_unspec_code): New function.
+ (ia64_skip_rtx_p, get_insn_spec_code, ia64_get_insn_spec_ds,
+ ia64_get_insn_checked_ds, ia64_gen_spec_load): New static functions.
+ (ia64_speculate_insn, ia64_needs_block_p): Support branchy checks
+ during selective scheduling.
+ (ia64_speculate_insn): Use ds_get_speculation_types when
+ determining whether we need to change the pattern.
+ (SPEC_GEN_LD_MAP, SPEC_GEN_CHECK_OFFSET): Declare.
+ (ia64_spec_check_src_p): Support new speculation/check codes.
+ (struct bundle_state): New field.
+ (issue_nops_and_insn): Initialize it.
+ (insert_bundle_state): Minimize mid-bundle stop bits.
+ (important_for_bundling_p): New function.
+ (get_next_important_insn): Use important_for_bundling_p.
+ (bundling): When shifting TImode from unimportant insns, ignore
+ also group barriers. Assert that best state is found before
+ the backward bundling pass. Print number of mid-bundle stop bits.
+ Minimize mid-bundle stop bits. Check correct calculation of
+ mid-bundle stop bits.
+ (ia64_sched_finish, final_emit_insn_group_barriers): Fix formatting.
+ (final_emit_insn_group_barriers): Emit stop bits before insns starting
+ a new cycle.
+ (sel2_run): New variable.
+ (ia64_reorg): When flag_selective_scheduling2 is set, run the selective
+ scheduling pass instead of schedule_ebbs.
+
+ * config/ia64/ia64.md (speculable1, speculable2): New attributes.
+ (UNSPEC_LDS_A): New UNSPEC.
+ (movqi_internal, movhi_internal, movsi_internal, movdi_internal,
+ movti_internal, movsf_internal, movdf_internal,
+ movxf_internal): Make visible. Add speculable* attributes.
+ (output_c_nc): New mode attribute.
+ (mov<mode>_speculative_a, zero_extend<mode>di2_speculative_a,
+ mov<mode>_nc, zero_extend<mode>di2_nc,
+ advanced_load_check_nc_<mode>): New insns.
+ (zero_extend*): Add speculable* attributes.
+
+ * config/ia64/ia64.opt (msched_fp_mem_deps_zero_cost): New option.
+ (msched-stop-bits-after-every-cycle): Likewise.
+ (msched-max-memory-insns,
+ msched-max-memory-insns-hard-limit): Likewise.
+ (msched-spec-verbose): Remove.
+ (msched-prefer-non-data-spec-insns,
+ msched-prefer-non-control-spec-insns, msched-count-spec-in-critical-path,
+ msel-sched-dont-check-control-spec): Use Target
+ Report Var instead of Common Report Var.
+
+ * config/ia64/itanium2.md: Remove incorrect bypass.
+
+ * config/ia64/t-ia64 (ia64.o): Add dependency on sel-sched.h.
+
2008-10-14 Jakub Jelinek <jakub@redhat.com>
PR middle-end/37805
diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 81767bb..34a37d2 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not see
#include "params.h"
#include "dbgcnt.h"
#include "tm-constrs.h"
+#include "sel-sched.h"
/* This is used for communication between ASM_OUTPUT_LABEL and
ASM_OUTPUT_LABELREF. */
@@ -172,12 +173,19 @@ static int ia64_first_cycle_multipass_dfa_lookahead_guard (rtx);
static bool ia64_first_cycle_multipass_dfa_lookahead_guard_spec (const_rtx);
static int ia64_dfa_new_cycle (FILE *, int, rtx, int, int, int *);
static void ia64_h_i_d_extended (void);
+static void * ia64_alloc_sched_context (void);
+static void ia64_init_sched_context (void *, bool);
+static void ia64_set_sched_context (void *);
+static void ia64_clear_sched_context (void *);
+static void ia64_free_sched_context (void *);
static int ia64_mode_to_int (enum machine_mode);
static void ia64_set_sched_flags (spec_info_t);
+static ds_t ia64_get_insn_spec_ds (rtx);
+static ds_t ia64_get_insn_checked_ds (rtx);
+static bool ia64_skip_rtx_p (const_rtx);
static int ia64_speculate_insn (rtx, ds_t, rtx *);
-static rtx ia64_gen_spec_insn (rtx, ds_t, int, bool, bool);
-static bool ia64_needs_block_p (const_rtx);
-static rtx ia64_gen_check (rtx, rtx, bool);
+static bool ia64_needs_block_p (int);
+static rtx ia64_gen_spec_check (rtx, rtx, ds_t);
static int ia64_spec_check_p (rtx);
static int ia64_spec_check_src_p (rtx);
static rtx gen_tls_get_addr (void);
@@ -222,7 +230,7 @@ static void ia64_output_function_epilogue (FILE *, HOST_WIDE_INT);
static void ia64_output_function_end_prologue (FILE *);
static int ia64_issue_rate (void);
-static int ia64_adjust_cost (rtx, rtx, rtx, int);
+static int ia64_adjust_cost_2 (rtx, int, rtx, int, dw_t);
static void ia64_sched_init (FILE *, int, int);
static void ia64_sched_init_global (FILE *, int, int);
static void ia64_sched_finish_global (FILE *, int);
@@ -248,6 +256,7 @@ static int get_max_pos (state_t);
static int get_template (state_t, int);
static rtx get_next_important_insn (rtx, rtx);
+static bool important_for_bundling_p (rtx);
static void bundling (FILE *, int, rtx, rtx);
static void ia64_output_mi_thunk (FILE *, tree, HOST_WIDE_INT,
@@ -333,8 +342,8 @@ static const struct attribute_spec ia64_attribute_table[] =
#undef TARGET_IN_SMALL_DATA_P
#define TARGET_IN_SMALL_DATA_P ia64_in_small_data_p
-#undef TARGET_SCHED_ADJUST_COST
-#define TARGET_SCHED_ADJUST_COST ia64_adjust_cost
+#undef TARGET_SCHED_ADJUST_COST_2
+#define TARGET_SCHED_ADJUST_COST_2 ia64_adjust_cost_2
#undef TARGET_SCHED_ISSUE_RATE
#define TARGET_SCHED_ISSUE_RATE ia64_issue_rate
#undef TARGET_SCHED_VARIABLE_ISSUE
@@ -373,9 +382,30 @@ static const struct attribute_spec ia64_attribute_table[] =
#undef TARGET_SCHED_H_I_D_EXTENDED
#define TARGET_SCHED_H_I_D_EXTENDED ia64_h_i_d_extended
+#undef TARGET_SCHED_ALLOC_SCHED_CONTEXT
+#define TARGET_SCHED_ALLOC_SCHED_CONTEXT ia64_alloc_sched_context
+
+#undef TARGET_SCHED_INIT_SCHED_CONTEXT
+#define TARGET_SCHED_INIT_SCHED_CONTEXT ia64_init_sched_context
+
+#undef TARGET_SCHED_SET_SCHED_CONTEXT
+#define TARGET_SCHED_SET_SCHED_CONTEXT ia64_set_sched_context
+
+#undef TARGET_SCHED_CLEAR_SCHED_CONTEXT
+#define TARGET_SCHED_CLEAR_SCHED_CONTEXT ia64_clear_sched_context
+
+#undef TARGET_SCHED_FREE_SCHED_CONTEXT
+#define TARGET_SCHED_FREE_SCHED_CONTEXT ia64_free_sched_context
+
#undef TARGET_SCHED_SET_SCHED_FLAGS
#define TARGET_SCHED_SET_SCHED_FLAGS ia64_set_sched_flags
+#undef TARGET_SCHED_GET_INSN_SPEC_DS
+#define TARGET_SCHED_GET_INSN_SPEC_DS ia64_get_insn_spec_ds
+
+#undef TARGET_SCHED_GET_INSN_CHECKED_DS
+#define TARGET_SCHED_GET_INSN_CHECKED_DS ia64_get_insn_checked_ds
+
#undef TARGET_SCHED_SPECULATE_INSN
#define TARGET_SCHED_SPECULATE_INSN ia64_speculate_insn
@@ -383,12 +413,15 @@ static const struct attribute_spec ia64_attribute_table[] =
#define TARGET_SCHED_NEEDS_BLOCK_P ia64_needs_block_p
#undef TARGET_SCHED_GEN_SPEC_CHECK
-#define TARGET_SCHED_GEN_SPEC_CHECK ia64_gen_check
+#define TARGET_SCHED_GEN_SPEC_CHECK ia64_gen_spec_check
#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD_GUARD_SPEC
#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD_GUARD_SPEC\
ia64_first_cycle_multipass_dfa_lookahead_guard_spec
+#undef TARGET_SCHED_SKIP_RTX_P
+#define TARGET_SCHED_SKIP_RTX_P ia64_skip_rtx_p
+
#undef TARGET_FUNCTION_OK_FOR_SIBCALL
#define TARGET_FUNCTION_OK_FOR_SIBCALL ia64_function_ok_for_sibcall
#undef TARGET_ARG_PARTIAL_BYTES
@@ -5236,9 +5269,33 @@ ia64_override_options (void)
ia64_flag_schedule_insns2 = flag_schedule_insns_after_reload;
flag_schedule_insns_after_reload = 0;
+ if (optimize >= 3
+ && ! sel_sched_switch_set)
+ {
+ flag_selective_scheduling2 = 1;
+ flag_sel_sched_pipelining = 1;
+ }
+ if (mflag_sched_control_spec == 2)
+ {
+ /* Control speculation is on by default for the selective scheduler,
+ but not for the Haifa scheduler. */
+ mflag_sched_control_spec = flag_selective_scheduling2 ? 1 : 0;
+ }
+ if (flag_sel_sched_pipelining && flag_auto_inc_dec)
+ {
+ /* FIXME: remove this when we'd implement breaking autoinsns as
+ a transformation. */
+ flag_auto_inc_dec = 0;
+ }
+
ia64_section_threshold = g_switch_set ? g_switch_value : IA64_DEFAULT_GVALUE;
init_machine_status = ia64_init_machine_status;
+
+ if (align_functions <= 0)
+ align_functions = 64;
+ if (align_loops <= 0)
+ align_loops = 32;
}
/* Initialize the record of emitted frame related registers. */
@@ -5893,6 +5950,7 @@ rtx_needs_barrier (rtx x, struct reg_flags flags, int pred)
case UNSPEC_FR_SQRT_RECIP_APPROX_RES:
case UNSPEC_LDA:
case UNSPEC_LDS:
+ case UNSPEC_LDS_A:
case UNSPEC_LDSA:
case UNSPEC_CHKACLR:
case UNSPEC_CHKS:
@@ -6098,6 +6156,7 @@ group_barrier_needed (rtx insn)
asm. */
if (! need_barrier)
need_barrier = rws_access_regno (REG_VOLATILE, flags, 0);
+
break;
default:
@@ -6296,14 +6355,6 @@ static state_t prev_cycle_state = NULL;
static char *stops_p = NULL;
-/* The following array element values are ZERO for non-speculative
- instructions and hold corresponding speculation check number for
- speculative instructions. */
-static int *spec_check_no = NULL;
-
-/* Size of spec_check_no array. */
-static int max_uid = 0;
-
/* The following variable is used to set up the mentioned above array. */
static int stop_before_p = 0;
@@ -6327,6 +6378,12 @@ static int *add_cycles;
/* The following variable value is number of data speculations in progress. */
static int pending_data_specs = 0;
+/* Number of memory references on current and three future processor cycles. */
+static char mem_ops_in_group[4];
+
+/* Number of current processor cycle (from scheduler's point of view). */
+static int current_cycle;
+
static rtx ia64_single_set (rtx);
static void ia64_emit_insn_before (rtx, rtx);
@@ -6376,20 +6433,44 @@ ia64_single_set (rtx insn)
return ret;
}
-/* Adjust the cost of a scheduling dependency. Return the new cost of
- a dependency LINK or INSN on DEP_INSN. COST is the current cost. */
-
+/* Adjust the cost of a scheduling dependency.
+ Return the new cost of a dependency of type DEP_TYPE or INSN on DEP_INSN.
+ COST is the current cost, DW is dependency weakness. */
static int
-ia64_adjust_cost (rtx insn, rtx link, rtx dep_insn, int cost)
+ia64_adjust_cost_2 (rtx insn, int dep_type1, rtx dep_insn, int cost, dw_t dw)
{
+ enum reg_note dep_type = (enum reg_note) dep_type1;
enum attr_itanium_class dep_class;
enum attr_itanium_class insn_class;
- if (REG_NOTE_KIND (link) != REG_DEP_OUTPUT)
- return cost;
-
insn_class = ia64_safe_itanium_class (insn);
dep_class = ia64_safe_itanium_class (dep_insn);
+
+ /* Treat true memory dependencies separately. Ignore apparent true
+ dependence between store and call (call has a MEM inside a SYMBOL_REF). */
+ if (dep_type == REG_DEP_TRUE
+ && (dep_class == ITANIUM_CLASS_ST || dep_class == ITANIUM_CLASS_STF)
+ && (insn_class == ITANIUM_CLASS_BR || insn_class == ITANIUM_CLASS_SCALL))
+ return 0;
+
+ if (dw == MIN_DEP_WEAK)
+ /* Store and load are likely to alias, use higher cost to avoid stall. */
+ return PARAM_VALUE (PARAM_SCHED_MEM_TRUE_DEP_COST);
+ else if (dw > MIN_DEP_WEAK)
+ {
+ /* Store and load are less likely to alias. */
+ if (mflag_sched_fp_mem_deps_zero_cost && dep_class == ITANIUM_CLASS_STF)
+ /* Assume there will be no cache conflict for floating-point data.
+ For integer data, L1 conflict penalty is huge (17 cycles), so we
+ never assume it will not cause a conflict. */
+ return 0;
+ else
+ return cost;
+ }
+
+ if (dep_type != REG_DEP_OUTPUT)
+ return cost;
+
if (dep_class == ITANIUM_CLASS_ST || dep_class == ITANIUM_CLASS_STF
|| insn_class == ITANIUM_CLASS_ST || insn_class == ITANIUM_CLASS_STF)
return 0;
@@ -6474,7 +6555,7 @@ ia64_sched_init (FILE *dump ATTRIBUTE_UNUSED,
#ifdef ENABLE_CHECKING
rtx insn;
- if (reload_completed)
+ if (!sel_sched_p () && reload_completed)
for (insn = NEXT_INSN (current_sched_info->prev_head);
insn != current_sched_info->next_tail;
insn = NEXT_INSN (insn))
@@ -6482,6 +6563,9 @@ ia64_sched_init (FILE *dump ATTRIBUTE_UNUSED,
#endif
last_scheduled_insn = NULL_RTX;
init_insn_group_barriers ();
+
+ current_cycle = 0;
+ memset (mem_ops_in_group, 0, sizeof (mem_ops_in_group));
}
/* We're beginning a scheduling pass. Check assertion. */
@@ -6491,7 +6575,7 @@ ia64_sched_init_global (FILE *dump ATTRIBUTE_UNUSED,
int sched_verbose ATTRIBUTE_UNUSED,
int max_ready ATTRIBUTE_UNUSED)
{
- gcc_assert (!pending_data_specs);
+ gcc_assert (pending_data_specs == 0);
}
/* Scheduling pass is now finished. Free/reset static variable. */
@@ -6499,9 +6583,41 @@ static void
ia64_sched_finish_global (FILE *dump ATTRIBUTE_UNUSED,
int sched_verbose ATTRIBUTE_UNUSED)
{
- free (spec_check_no);
- spec_check_no = 0;
- max_uid = 0;
+ gcc_assert (pending_data_specs == 0);
+}
+
+/* Return TRUE if INSN is a load (either normal or speculative, but not a
+ speculation check), FALSE otherwise. */
+static bool
+is_load_p (rtx insn)
+{
+ enum attr_itanium_class insn_class = ia64_safe_itanium_class (insn);
+
+ return
+ ((insn_class == ITANIUM_CLASS_LD || insn_class == ITANIUM_CLASS_FLD)
+ && get_attr_check_load (insn) == CHECK_LOAD_NO);
+}
+
+/* If INSN is a memory reference, memoize it in MEM_OPS_IN_GROUP global array
+ (taking account for 3-cycle cache reference postponing for stores: Intel
+ Itanium 2 Reference Manual for Software Development and Optimization,
+ 6.7.3.1). */
+static void
+record_memory_reference (rtx insn)
+{
+ enum attr_itanium_class insn_class = ia64_safe_itanium_class (insn);
+
+ switch (insn_class) {
+ case ITANIUM_CLASS_FLD:
+ case ITANIUM_CLASS_LD:
+ mem_ops_in_group[current_cycle % 4]++;
+ break;
+ case ITANIUM_CLASS_STF:
+ case ITANIUM_CLASS_ST:
+ mem_ops_in_group[(current_cycle + 3) % 4]++;
+ break;
+ default:;
+ }
}
/* We are about to being issuing insns for this clock cycle.
@@ -6509,7 +6625,7 @@ ia64_sched_finish_global (FILE *dump ATTRIBUTE_UNUSED,
static int
ia64_dfa_sched_reorder (FILE *dump, int sched_verbose, rtx *ready,
- int *pn_ready, int clock_var ATTRIBUTE_UNUSED,
+ int *pn_ready, int clock_var,
int reorder_type)
{
int n_asms;
@@ -6589,6 +6705,27 @@ ia64_dfa_sched_reorder (FILE *dump, int sched_verbose, rtx *ready,
ready += deleted;
}
+ current_cycle = clock_var;
+ if (reload_completed && mem_ops_in_group[clock_var % 4] >= ia64_max_memory_insns)
+ {
+ int moved = 0;
+
+ insnp = e_ready;
+ /* Move down loads/stores, preserving relative order. */
+ while (insnp-- > ready + moved)
+ while (insnp >= ready + moved)
+ {
+ rtx insn = *insnp;
+ if (! is_load_p (insn))
+ break;
+ memmove (ready + 1, ready, (insnp - ready) * sizeof (rtx));
+ *ready = insn;
+ moved++;
+ }
+ n_ready -= moved;
+ ready += moved;
+ }
+
return 1;
}
@@ -6626,9 +6763,9 @@ ia64_variable_issue (FILE *dump ATTRIBUTE_UNUSED,
rtx insn ATTRIBUTE_UNUSED,
int can_issue_more ATTRIBUTE_UNUSED)
{
- if (current_sched_info->flags & DO_SPECULATION)
+ if (sched_deps_info->generate_spec_deps && !sel_sched_p ())
/* Modulo scheduling does not extend h_i_d when emitting
- new instructions. Deal with it. */
+ new instructions. Don't use h_i_d, if we don't have to. */
{
if (DONE_SPEC (insn) & BEGIN_DATA)
pending_data_specs++;
@@ -6647,6 +6784,8 @@ ia64_variable_issue (FILE *dump ATTRIBUTE_UNUSED,
init_insn_group_barriers ();
stops_p [INSN_UID (insn)] = stop_before_p;
stop_before_p = 0;
+
+ record_memory_reference (insn);
}
return 1;
}
@@ -6657,10 +6796,13 @@ ia64_variable_issue (FILE *dump ATTRIBUTE_UNUSED,
static int
ia64_first_cycle_multipass_dfa_lookahead_guard (rtx insn)
{
- gcc_assert (insn && INSN_P (insn));
+ gcc_assert (insn && INSN_P (insn));
return ((!reload_completed
|| !safe_group_barrier_needed (insn))
- && ia64_first_cycle_multipass_dfa_lookahead_guard_spec (insn));
+ && ia64_first_cycle_multipass_dfa_lookahead_guard_spec (insn)
+ && (!mflag_sched_mem_insns_hard_limit
+ || !is_load_p (insn)
+ || mem_ops_in_group[current_cycle % 4] < ia64_max_memory_insns));
}
/* We are choosing insn from the ready queue. Return nonzero if INSN
@@ -6682,6 +6824,26 @@ ia64_first_cycle_multipass_dfa_lookahead_guard_spec (const_rtx insn)
static rtx dfa_pre_cycle_insn;
+/* Returns 1 when a meaningful insn was scheduled between the last group
+ barrier and LAST. */
+static int
+scheduled_good_insn (rtx last)
+{
+ if (last && recog_memoized (last) >= 0)
+ return 1;
+
+ for ( ;
+ last != NULL && !NOTE_INSN_BASIC_BLOCK_P (last)
+ && !stops_p[INSN_UID (last)];
+ last = PREV_INSN (last))
+ /* We could hit a NOTE_INSN_DELETED here which is actually outside
+ the ebb we're scheduling. */
+ if (INSN_P (last) && recog_memoized (last) >= 0)
+ return 1;
+
+ return 0;
+}
+
/* We are about to being issuing INSN. Return nonzero if we cannot
issue it on given cycle CLOCK and return zero if we should not sort
the ready queue on the next clock start. */
@@ -6693,17 +6855,32 @@ ia64_dfa_new_cycle (FILE *dump, int verbose, rtx insn, int last_clock,
int setup_clocks_p = FALSE;
gcc_assert (insn && INSN_P (insn));
- if ((reload_completed && safe_group_barrier_needed (insn))
+ /* When a group barrier is needed for insn, last_scheduled_insn
+ should be set. */
+ gcc_assert (!(reload_completed && safe_group_barrier_needed (insn))
+ || last_scheduled_insn);
+
+ if ((reload_completed
+ && (safe_group_barrier_needed (insn)
+ || (mflag_sched_stop_bits_after_every_cycle
+ && last_clock != clock
+ && last_scheduled_insn
+ && scheduled_good_insn (last_scheduled_insn))))
|| (last_scheduled_insn
&& (GET_CODE (last_scheduled_insn) == CALL_INSN
|| GET_CODE (PATTERN (last_scheduled_insn)) == ASM_INPUT
|| asm_noperands (PATTERN (last_scheduled_insn)) >= 0)))
{
init_insn_group_barriers ();
+
if (verbose && dump)
fprintf (dump, "// Stop should be before %d%s\n", INSN_UID (insn),
last_clock == clock ? " + cycle advance" : "");
+
stop_before_p = 1;
+ current_cycle = clock;
+ mem_ops_in_group[current_cycle % 4] = 0;
+
if (last_clock == clock)
{
state_transition (curr_state, dfa_stop_insn);
@@ -6716,19 +6893,24 @@ ia64_dfa_new_cycle (FILE *dump, int verbose, rtx insn, int last_clock,
}
else if (reload_completed)
setup_clocks_p = TRUE;
- if (GET_CODE (PATTERN (last_scheduled_insn)) == ASM_INPUT
- || asm_noperands (PATTERN (last_scheduled_insn)) >= 0)
- state_reset (curr_state);
- else
+
+ if (last_scheduled_insn)
{
- memcpy (curr_state, prev_cycle_state, dfa_state_size);
- state_transition (curr_state, dfa_stop_insn);
- state_transition (curr_state, dfa_pre_cycle_insn);
- state_transition (curr_state, NULL);
+ if (GET_CODE (PATTERN (last_scheduled_insn)) == ASM_INPUT
+ || asm_noperands (PATTERN (last_scheduled_insn)) >= 0)
+ state_reset (curr_state);
+ else
+ {
+ memcpy (curr_state, prev_cycle_state, dfa_state_size);
+ state_transition (curr_state, dfa_stop_insn);
+ state_transition (curr_state, dfa_pre_cycle_insn);
+ state_transition (curr_state, NULL);
+ }
}
}
else if (reload_completed)
setup_clocks_p = TRUE;
+
if (setup_clocks_p && ia64_tune == PROCESSOR_ITANIUM
&& GET_CODE (PATTERN (insn)) != ASM_INPUT
&& asm_noperands (PATTERN (insn)) < 0)
@@ -6759,6 +6941,7 @@ ia64_dfa_new_cycle (FILE *dump, int verbose, rtx insn, int last_clock,
add_cycles [INSN_UID (insn)] = 3 - d;
}
}
+
return 0;
}
@@ -6767,18 +6950,9 @@ ia64_dfa_new_cycle (FILE *dump, int verbose, rtx insn, int last_clock,
static void
ia64_h_i_d_extended (void)
{
- if (current_sched_info->flags & DO_SPECULATION)
- {
- int new_max_uid = get_max_uid () + 1;
-
- spec_check_no = (int *) xrecalloc (spec_check_no, new_max_uid,
- max_uid, sizeof (*spec_check_no));
- max_uid = new_max_uid;
- }
-
if (stops_p != NULL)
{
- int new_clocks_length = get_max_uid () + 1;
+ int new_clocks_length = get_max_uid () * 3 / 2;
stops_p = (char *) xrecalloc (stops_p, new_clocks_length, clocks_length, 1);
@@ -6793,6 +6967,203 @@ ia64_h_i_d_extended (void)
clocks_length = new_clocks_length;
}
}
+
+
+/* This structure describes the data used by the backend to guide scheduling.
+ When the current scheduling point is switched, this data should be saved
+ and restored later, if the scheduler returns to this point. */
+struct _ia64_sched_context
+{
+ state_t prev_cycle_state;
+ rtx last_scheduled_insn;
+ struct reg_write_state rws_sum[NUM_REGS];
+ struct reg_write_state rws_insn[NUM_REGS];
+ int first_instruction;
+ int pending_data_specs;
+ int current_cycle;
+ char mem_ops_in_group[4];
+};
+typedef struct _ia64_sched_context *ia64_sched_context_t;
+
+/* Allocates a scheduling context. */
+static void *
+ia64_alloc_sched_context (void)
+{
+ return xmalloc (sizeof (struct _ia64_sched_context));
+}
+
+/* Initializes the _SC context with clean data, if CLEAN_P, and from
+ the global context otherwise. */
+static void
+ia64_init_sched_context (void *_sc, bool clean_p)
+{
+ ia64_sched_context_t sc = (ia64_sched_context_t) _sc;
+
+ sc->prev_cycle_state = xmalloc (dfa_state_size);
+ if (clean_p)
+ {
+ state_reset (sc->prev_cycle_state);
+ sc->last_scheduled_insn = NULL_RTX;
+ memset (sc->rws_sum, 0, sizeof (rws_sum));
+ memset (sc->rws_insn, 0, sizeof (rws_insn));
+ sc->first_instruction = 1;
+ sc->pending_data_specs = 0;
+ sc->current_cycle = 0;
+ memset (sc->mem_ops_in_group, 0, sizeof (mem_ops_in_group));
+ }
+ else
+ {
+ memcpy (sc->prev_cycle_state, prev_cycle_state, dfa_state_size);
+ sc->last_scheduled_insn = last_scheduled_insn;
+ memcpy (sc->rws_sum, rws_sum, sizeof (rws_sum));
+ memcpy (sc->rws_insn, rws_insn, sizeof (rws_insn));
+ sc->first_instruction = first_instruction;
+ sc->pending_data_specs = pending_data_specs;
+ sc->current_cycle = current_cycle;
+ memcpy (sc->mem_ops_in_group, mem_ops_in_group, sizeof (mem_ops_in_group));
+ }
+}
+
+/* Sets the global scheduling context to the one pointed to by _SC. */
+static void
+ia64_set_sched_context (void *_sc)
+{
+ ia64_sched_context_t sc = (ia64_sched_context_t) _sc;
+
+ gcc_assert (sc != NULL);
+
+ memcpy (prev_cycle_state, sc->prev_cycle_state, dfa_state_size);
+ last_scheduled_insn = sc->last_scheduled_insn;
+ memcpy (rws_sum, sc->rws_sum, sizeof (rws_sum));
+ memcpy (rws_insn, sc->rws_insn, sizeof (rws_insn));
+ first_instruction = sc->first_instruction;
+ pending_data_specs = sc->pending_data_specs;
+ current_cycle = sc->current_cycle;
+ memcpy (mem_ops_in_group, sc->mem_ops_in_group, sizeof (mem_ops_in_group));
+}
+
+/* Clears the data in the _SC scheduling context. */
+static void
+ia64_clear_sched_context (void *_sc)
+{
+ ia64_sched_context_t sc = (ia64_sched_context_t) _sc;
+
+ free (sc->prev_cycle_state);
+ sc->prev_cycle_state = NULL;
+}
+
+/* Frees the _SC scheduling context. */
+static void
+ia64_free_sched_context (void *_sc)
+{
+ gcc_assert (_sc != NULL);
+
+ free (_sc);
+}
+
+typedef rtx (* gen_func_t) (rtx, rtx);
+
+/* Return a function that will generate a load of mode MODE_NO
+ with speculation types TS. */
+static gen_func_t
+get_spec_load_gen_function (ds_t ts, int mode_no)
+{
+ static gen_func_t gen_ld_[] = {
+ gen_movbi,
+ gen_movqi_internal,
+ gen_movhi_internal,
+ gen_movsi_internal,
+ gen_movdi_internal,
+ gen_movsf_internal,
+ gen_movdf_internal,
+ gen_movxf_internal,
+ gen_movti_internal,
+ gen_zero_extendqidi2,
+ gen_zero_extendhidi2,
+ gen_zero_extendsidi2,
+ };
+
+ static gen_func_t gen_ld_a[] = {
+ gen_movbi_advanced,
+ gen_movqi_advanced,
+ gen_movhi_advanced,
+ gen_movsi_advanced,
+ gen_movdi_advanced,
+ gen_movsf_advanced,
+ gen_movdf_advanced,
+ gen_movxf_advanced,
+ gen_movti_advanced,
+ gen_zero_extendqidi2_advanced,
+ gen_zero_extendhidi2_advanced,
+ gen_zero_extendsidi2_advanced,
+ };
+ static gen_func_t gen_ld_s[] = {
+ gen_movbi_speculative,
+ gen_movqi_speculative,
+ gen_movhi_speculative,
+ gen_movsi_speculative,
+ gen_movdi_speculative,
+ gen_movsf_speculative,
+ gen_movdf_speculative,
+ gen_movxf_speculative,
+ gen_movti_speculative,
+ gen_zero_extendqidi2_speculative,
+ gen_zero_extendhidi2_speculative,
+ gen_zero_extendsidi2_speculative,
+ };
+ static gen_func_t gen_ld_sa[] = {
+ gen_movbi_speculative_advanced,
+ gen_movqi_speculative_advanced,
+ gen_movhi_speculative_advanced,
+ gen_movsi_speculative_advanced,
+ gen_movdi_speculative_advanced,
+ gen_movsf_speculative_advanced,
+ gen_movdf_speculative_advanced,
+ gen_movxf_speculative_advanced,
+ gen_movti_speculative_advanced,
+ gen_zero_extendqidi2_speculative_advanced,
+ gen_zero_extendhidi2_speculative_advanced,
+ gen_zero_extendsidi2_speculative_advanced,
+ };
+ static gen_func_t gen_ld_s_a[] = {
+ gen_movbi_speculative_a,
+ gen_movqi_speculative_a,
+ gen_movhi_speculative_a,
+ gen_movsi_speculative_a,
+ gen_movdi_speculative_a,
+ gen_movsf_speculative_a,
+ gen_movdf_speculative_a,
+ gen_movxf_speculative_a,
+ gen_movti_speculative_a,
+ gen_zero_extendqidi2_speculative_a,
+ gen_zero_extendhidi2_speculative_a,
+ gen_zero_extendsidi2_speculative_a,
+ };
+
+ gen_func_t *gen_ld;
+
+ if (ts & BEGIN_DATA)
+ {
+ if (ts & BEGIN_CONTROL)
+ gen_ld = gen_ld_sa;
+ else
+ gen_ld = gen_ld_a;
+ }
+ else if (ts & BEGIN_CONTROL)
+ {
+ if ((spec_info->flags & SEL_SCHED_SPEC_DONT_CHECK_CONTROL)
+ || ia64_needs_block_p (ts))
+ gen_ld = gen_ld_s;
+ else
+ gen_ld = gen_ld_s_a;
+ }
+ else if (ts == 0)
+ gen_ld = gen_ld_;
+ else
+ gcc_unreachable ();
+
+ return gen_ld[mode_no];
+}
/* Constants that help mapping 'enum machine_mode' to int. */
enum SPEC_MODES
@@ -6804,6 +7175,12 @@ enum SPEC_MODES
SPEC_MODE_LAST = 8
};
+enum
+ {
+ /* Offset to reach ZERO_EXTEND patterns. */
+ SPEC_GEN_EXTEND_OFFSET = SPEC_MODE_LAST - SPEC_MODE_FOR_EXTEND_FIRST + 1
+ };
+
/* Return index of the MODE. */
static int
ia64_mode_to_int (enum machine_mode mode)
@@ -6834,25 +7211,29 @@ ia64_set_sched_flags (spec_info_t spec_info)
unsigned int *flags = &(current_sched_info->flags);
if (*flags & SCHED_RGN
- || *flags & SCHED_EBB)
+ || *flags & SCHED_EBB
+ || *flags & SEL_SCHED)
{
int mask = 0;
if ((mflag_sched_br_data_spec && !reload_completed && optimize > 0)
- || (mflag_sched_ar_data_spec && reload_completed))
+ || (mflag_sched_ar_data_spec && reload_completed))
{
mask |= BEGIN_DATA;
-
- if ((mflag_sched_br_in_data_spec && !reload_completed)
- || (mflag_sched_ar_in_data_spec && reload_completed))
+
+ if (!sel_sched_p ()
+ && ((mflag_sched_br_in_data_spec && !reload_completed)
+ || (mflag_sched_ar_in_data_spec && reload_completed)))
mask |= BE_IN_DATA;
}
- if (mflag_sched_control_spec)
+ if (mflag_sched_control_spec
+ && (!sel_sched_p ()
+ || reload_completed))
{
mask |= BEGIN_CONTROL;
- if (mflag_sched_in_control_spec)
+ if (!sel_sched_p () && mflag_sched_in_control_spec)
mask |= BE_IN_CONTROL;
}
@@ -6870,17 +7251,17 @@ ia64_set_sched_flags (spec_info_t spec_info)
if ((mask & DATA_SPEC) && mflag_sched_prefer_non_data_spec_insns)
spec_info->flags |= PREFER_NON_DATA_SPEC;
- if ((mask & CONTROL_SPEC)
- && mflag_sched_prefer_non_control_spec_insns)
- spec_info->flags |= PREFER_NON_CONTROL_SPEC;
-
- if (mflag_sched_spec_verbose)
+ if (mask & CONTROL_SPEC)
{
- if (sched_verbose >= 1)
- spec_info->dump = sched_dump;
- else
- spec_info->dump = stderr;
+ if (mflag_sched_prefer_non_control_spec_insns)
+ spec_info->flags |= PREFER_NON_CONTROL_SPEC;
+
+ if (sel_sched_p () && mflag_sel_sched_dont_check_control_spec)
+ spec_info->flags |= SEL_SCHED_SPEC_DONT_CHECK_CONTROL;
}
+
+ if (sched_verbose >= 1)
+ spec_info->dump = sched_dump;
else
spec_info->dump = 0;
@@ -6890,304 +7271,284 @@ ia64_set_sched_flags (spec_info_t spec_info)
}
}
-/* Implement targetm.sched.speculate_insn hook.
- Check if the INSN can be TS speculative.
- If 'no' - return -1.
- If 'yes' - generate speculative pattern in the NEW_PAT and return 1.
- If current pattern of the INSN already provides TS speculation, return 0. */
+/* If INSN is an appropriate load return its mode.
+ Return -1 otherwise. */
static int
-ia64_speculate_insn (rtx insn, ds_t ts, rtx *new_pat)
-{
- rtx pat, reg, mem, mem_reg;
- int mode_no, gen_p = 1;
+get_mode_no_for_insn (rtx insn)
+{
+ rtx reg, mem, mode_rtx;
+ int mode_no;
bool extend_p;
-
- gcc_assert (!(ts & ~BEGIN_SPEC) && ts);
-
- pat = PATTERN (insn);
- if (GET_CODE (pat) == COND_EXEC)
- pat = COND_EXEC_CODE (pat);
+ extract_insn_cached (insn);
- /* This should be a SET ... */
- if (GET_CODE (pat) != SET)
- return -1;
+ /* We use WHICH_ALTERNATIVE only after reload. This will
+ guarantee that reload won't touch a speculative insn. */
- reg = SET_DEST (pat);
- /* ... to the general/fp register ... */
- if (!REG_P (reg) || !(GR_REGNO_P (REGNO (reg)) || FP_REGNO_P (REGNO (reg))))
+ if (recog_data.n_operands != 2)
return -1;
- /* ... from the mem ... */
- mem = SET_SRC (pat);
+ reg = recog_data.operand[0];
+ mem = recog_data.operand[1];
- /* ... that can, possibly, be a zero_extend ... */
- if (GET_CODE (mem) == ZERO_EXTEND)
+ /* We should use MEM's mode since REG's mode in presence of
+ ZERO_EXTEND will always be DImode. */
+ if (get_attr_speculable1 (insn) == SPECULABLE1_YES)
+ /* Process non-speculative ld. */
+ {
+ if (!reload_completed)
+ {
+ /* Do not speculate into regs like ar.lc. */
+ if (!REG_P (reg) || AR_REGNO_P (REGNO (reg)))
+ return -1;
+
+ if (!MEM_P (mem))
+ return -1;
+
+ {
+ rtx mem_reg = XEXP (mem, 0);
+
+ if (!REG_P (mem_reg))
+ return -1;
+ }
+
+ mode_rtx = mem;
+ }
+ else if (get_attr_speculable2 (insn) == SPECULABLE2_YES)
+ {
+ gcc_assert (REG_P (reg) && MEM_P (mem));
+ mode_rtx = mem;
+ }
+ else
+ return -1;
+ }
+ else if (get_attr_data_speculative (insn) == DATA_SPECULATIVE_YES
+ || get_attr_control_speculative (insn) == CONTROL_SPECULATIVE_YES
+ || get_attr_check_load (insn) == CHECK_LOAD_YES)
+ /* Process speculative ld or ld.c. */
{
- mem = XEXP (mem, 0);
- extend_p = true;
+ gcc_assert (REG_P (reg) && MEM_P (mem));
+ mode_rtx = mem;
}
else
- extend_p = false;
-
- /* ... or a speculative load. */
- if (GET_CODE (mem) == UNSPEC)
{
- int code;
-
- code = XINT (mem, 1);
- if (code != UNSPEC_LDA && code != UNSPEC_LDS && code != UNSPEC_LDSA)
- return -1;
+ enum attr_itanium_class attr_class = get_attr_itanium_class (insn);
- if ((code == UNSPEC_LDA && !(ts & BEGIN_CONTROL))
- || (code == UNSPEC_LDS && !(ts & BEGIN_DATA))
- || code == UNSPEC_LDSA)
- gen_p = 0;
-
- mem = XVECEXP (mem, 0, 0);
- gcc_assert (MEM_P (mem));
+ if (attr_class == ITANIUM_CLASS_CHK_A
+ || attr_class == ITANIUM_CLASS_CHK_S_I
+ || attr_class == ITANIUM_CLASS_CHK_S_F)
+ /* Process chk. */
+ mode_rtx = reg;
+ else
+ return -1;
}
- /* Source should be a mem ... */
- if (!MEM_P (mem))
- return -1;
+ mode_no = ia64_mode_to_int (GET_MODE (mode_rtx));
- /* ... addressed by a register. */
- mem_reg = XEXP (mem, 0);
- if (!REG_P (mem_reg))
- return -1;
-
- /* We should use MEM's mode since REG's mode in presence of ZERO_EXTEND
- will always be DImode. */
- mode_no = ia64_mode_to_int (GET_MODE (mem));
-
- if (mode_no == SPEC_MODE_INVALID
- || (extend_p
- && !(SPEC_MODE_FOR_EXTEND_FIRST <= mode_no
- && mode_no <= SPEC_MODE_FOR_EXTEND_LAST)))
+ if (mode_no == SPEC_MODE_INVALID)
return -1;
- extract_insn_cached (insn);
- gcc_assert (reg == recog_data.operand[0] && mem == recog_data.operand[1]);
+ extend_p = (GET_MODE (reg) != GET_MODE (mode_rtx));
- *new_pat = ia64_gen_spec_insn (insn, ts, mode_no, gen_p != 0, extend_p);
+ if (extend_p)
+ {
+ if (!(SPEC_MODE_FOR_EXTEND_FIRST <= mode_no
+ && mode_no <= SPEC_MODE_FOR_EXTEND_LAST))
+ return -1;
- return gen_p;
+ mode_no += SPEC_GEN_EXTEND_OFFSET;
+ }
+
+ return mode_no;
}
-enum
- {
- /* Offset to reach ZERO_EXTEND patterns. */
- SPEC_GEN_EXTEND_OFFSET = SPEC_MODE_LAST - SPEC_MODE_FOR_EXTEND_FIRST + 1,
- /* Number of patterns for each speculation mode. */
- SPEC_N = (SPEC_MODE_LAST
- + SPEC_MODE_FOR_EXTEND_LAST - SPEC_MODE_FOR_EXTEND_FIRST + 2)
- };
+/* If X is an unspec part of a speculative load, return its code.
+ Return -1 otherwise. */
+static int
+get_spec_unspec_code (const_rtx x)
+{
+ if (GET_CODE (x) != UNSPEC)
+ return -1;
-enum SPEC_GEN_LD_MAP
{
- /* Offset to ld.a patterns. */
- SPEC_GEN_A = 0 * SPEC_N,
- /* Offset to ld.s patterns. */
- SPEC_GEN_S = 1 * SPEC_N,
- /* Offset to ld.sa patterns. */
- SPEC_GEN_SA = 2 * SPEC_N,
- /* Offset to ld.sa patterns. For this patterns corresponding ld.c will
- mutate to chk.s. */
- SPEC_GEN_SA_FOR_S = 3 * SPEC_N
- };
+ int code;
-/* These offsets are used to get (4 * SPEC_N). */
-enum SPEC_GEN_CHECK_OFFSET
- {
- SPEC_GEN_CHKA_FOR_A_OFFSET = 4 * SPEC_N - SPEC_GEN_A,
- SPEC_GEN_CHKA_FOR_SA_OFFSET = 4 * SPEC_N - SPEC_GEN_SA
- };
+ code = XINT (x, 1);
-/* If GEN_P is true, calculate the index of needed speculation check and return
- speculative pattern for INSN with speculative mode TS, machine mode
- MODE_NO and with ZERO_EXTEND (if EXTEND_P is true).
- If GEN_P is false, just calculate the index of needed speculation check. */
-static rtx
-ia64_gen_spec_insn (rtx insn, ds_t ts, int mode_no, bool gen_p, bool extend_p)
+ switch (code)
+ {
+ case UNSPEC_LDA:
+ case UNSPEC_LDS:
+ case UNSPEC_LDS_A:
+ case UNSPEC_LDSA:
+ return code;
+
+ default:
+ return -1;
+ }
+ }
+}
+
+/* Implement skip_rtx_p hook. */
+static bool
+ia64_skip_rtx_p (const_rtx x)
{
- rtx pat, new_pat;
- int load_no;
- int shift = 0;
+ return get_spec_unspec_code (x) != -1;
+}
- static rtx (* const gen_load[]) (rtx, rtx) = {
- gen_movbi_advanced,
- gen_movqi_advanced,
- gen_movhi_advanced,
- gen_movsi_advanced,
- gen_movdi_advanced,
- gen_movsf_advanced,
- gen_movdf_advanced,
- gen_movxf_advanced,
- gen_movti_advanced,
- gen_zero_extendqidi2_advanced,
- gen_zero_extendhidi2_advanced,
- gen_zero_extendsidi2_advanced,
+/* If INSN is a speculative load, return its UNSPEC code.
+ Return -1 otherwise. */
+static int
+get_insn_spec_code (const_rtx insn)
+{
+ rtx pat, reg, mem;
- gen_movbi_speculative,
- gen_movqi_speculative,
- gen_movhi_speculative,
- gen_movsi_speculative,
- gen_movdi_speculative,
- gen_movsf_speculative,
- gen_movdf_speculative,
- gen_movxf_speculative,
- gen_movti_speculative,
- gen_zero_extendqidi2_speculative,
- gen_zero_extendhidi2_speculative,
- gen_zero_extendsidi2_speculative,
+ pat = PATTERN (insn);
- gen_movbi_speculative_advanced,
- gen_movqi_speculative_advanced,
- gen_movhi_speculative_advanced,
- gen_movsi_speculative_advanced,
- gen_movdi_speculative_advanced,
- gen_movsf_speculative_advanced,
- gen_movdf_speculative_advanced,
- gen_movxf_speculative_advanced,
- gen_movti_speculative_advanced,
- gen_zero_extendqidi2_speculative_advanced,
- gen_zero_extendhidi2_speculative_advanced,
- gen_zero_extendsidi2_speculative_advanced,
+ if (GET_CODE (pat) == COND_EXEC)
+ pat = COND_EXEC_CODE (pat);
- gen_movbi_speculative_advanced,
- gen_movqi_speculative_advanced,
- gen_movhi_speculative_advanced,
- gen_movsi_speculative_advanced,
- gen_movdi_speculative_advanced,
- gen_movsf_speculative_advanced,
- gen_movdf_speculative_advanced,
- gen_movxf_speculative_advanced,
- gen_movti_speculative_advanced,
- gen_zero_extendqidi2_speculative_advanced,
- gen_zero_extendhidi2_speculative_advanced,
- gen_zero_extendsidi2_speculative_advanced
- };
+ if (GET_CODE (pat) != SET)
+ return -1;
- load_no = extend_p ? mode_no + SPEC_GEN_EXTEND_OFFSET : mode_no;
+ reg = SET_DEST (pat);
+ if (!REG_P (reg))
+ return -1;
- if (ts & BEGIN_DATA)
+ mem = SET_SRC (pat);
+ if (GET_CODE (mem) == ZERO_EXTEND)
+ mem = XEXP (mem, 0);
+
+ return get_spec_unspec_code (mem);
+}
+
+/* If INSN is a speculative load, return a ds with the speculation types.
+ Otherwise [if INSN is a normal instruction] return 0. */
+static ds_t
+ia64_get_insn_spec_ds (rtx insn)
+{
+ int code = get_insn_spec_code (insn);
+
+ switch (code)
{
- /* We don't need recovery because even if this is ld.sa
- ALAT entry will be allocated only if NAT bit is set to zero.
- So it is enough to use ld.c here. */
+ case UNSPEC_LDA:
+ return BEGIN_DATA;
- if (ts & BEGIN_CONTROL)
- {
- load_no += SPEC_GEN_SA;
+ case UNSPEC_LDS:
+ case UNSPEC_LDS_A:
+ return BEGIN_CONTROL;
- if (!mflag_sched_ldc)
- shift = SPEC_GEN_CHKA_FOR_SA_OFFSET;
- }
- else
- {
- load_no += SPEC_GEN_A;
+ case UNSPEC_LDSA:
+ return BEGIN_DATA | BEGIN_CONTROL;
- if (!mflag_sched_ldc)
- shift = SPEC_GEN_CHKA_FOR_A_OFFSET;
- }
+ default:
+ return 0;
}
- else if (ts & BEGIN_CONTROL)
+}
+
+/* If INSN is a speculative load return a ds with the speculation types that
+ will be checked.
+ Otherwise [if INSN is a normal instruction] return 0. */
+static ds_t
+ia64_get_insn_checked_ds (rtx insn)
+{
+ int code = get_insn_spec_code (insn);
+
+ switch (code)
{
- /* ld.sa can be used instead of ld.s to avoid basic block splitting. */
- if (!mflag_control_ldc)
- load_no += SPEC_GEN_S;
- else
- {
- gcc_assert (mflag_sched_ldc);
- load_no += SPEC_GEN_SA_FOR_S;
- }
+ case UNSPEC_LDA:
+ return BEGIN_DATA | BEGIN_CONTROL;
+
+ case UNSPEC_LDS:
+ return BEGIN_CONTROL;
+
+ case UNSPEC_LDS_A:
+ case UNSPEC_LDSA:
+ return BEGIN_DATA | BEGIN_CONTROL;
+
+ default:
+ return 0;
}
- else
- gcc_unreachable ();
+}
- /* Set the desired check index. We add '1', because zero element in this
- array means, that instruction with such uid is non-speculative. */
- spec_check_no[INSN_UID (insn)] = load_no + shift + 1;
+/* If GEN_P is true, calculate the index of needed speculation check and return
+ speculative pattern for INSN with speculative mode TS, machine mode
+ MODE_NO and with ZERO_EXTEND (if EXTEND_P is true).
+ If GEN_P is false, just calculate the index of needed speculation check. */
+static rtx
+ia64_gen_spec_load (rtx insn, ds_t ts, int mode_no)
+{
+ rtx pat, new_pat;
+ gen_func_t gen_load;
- if (!gen_p)
- return 0;
+ gen_load = get_spec_load_gen_function (ts, mode_no);
- new_pat = gen_load[load_no] (copy_rtx (recog_data.operand[0]),
- copy_rtx (recog_data.operand[1]));
+ new_pat = gen_load (copy_rtx (recog_data.operand[0]),
+ copy_rtx (recog_data.operand[1]));
pat = PATTERN (insn);
if (GET_CODE (pat) == COND_EXEC)
- new_pat = gen_rtx_COND_EXEC (VOIDmode, copy_rtx
- (COND_EXEC_TEST (pat)), new_pat);
+ new_pat = gen_rtx_COND_EXEC (VOIDmode, copy_rtx (COND_EXEC_TEST (pat)),
+ new_pat);
return new_pat;
}
-/* Offset to branchy checks. */
-enum { SPEC_GEN_CHECK_MUTATION_OFFSET = 5 * SPEC_N };
-
-/* Return nonzero, if INSN needs branchy recovery check. */
static bool
-ia64_needs_block_p (const_rtx insn)
+insn_can_be_in_speculative_p (rtx insn ATTRIBUTE_UNUSED,
+ ds_t ds ATTRIBUTE_UNUSED)
{
- int check_no;
+ return false;
+}
- check_no = spec_check_no[INSN_UID(insn)] - 1;
- gcc_assert (0 <= check_no && check_no < SPEC_GEN_CHECK_MUTATION_OFFSET);
+/* Implement targetm.sched.speculate_insn hook.
+ Check if the INSN can be TS speculative.
+ If 'no' - return -1.
+ If 'yes' - generate speculative pattern in the NEW_PAT and return 1.
+ If current pattern of the INSN already provides TS speculation,
+ return 0. */
+static int
+ia64_speculate_insn (rtx insn, ds_t ts, rtx *new_pat)
+{
+ int mode_no;
+ int res;
+
+ gcc_assert (!(ts & ~SPECULATIVE));
- return ((SPEC_GEN_S <= check_no && check_no < SPEC_GEN_S + SPEC_N)
- || (4 * SPEC_N <= check_no && check_no < 4 * SPEC_N + SPEC_N));
-}
+ if (ia64_spec_check_p (insn))
+ return -1;
-/* Generate (or regenerate, if (MUTATE_P)) recovery check for INSN.
- If (LABEL != 0 || MUTATE_P), generate branchy recovery check.
- Otherwise, generate a simple check. */
-static rtx
-ia64_gen_check (rtx insn, rtx label, bool mutate_p)
-{
- rtx op1, pat, check_pat;
+ if ((ts & BE_IN_SPEC)
+ && !insn_can_be_in_speculative_p (insn, ts))
+ return -1;
- static rtx (* const gen_check[]) (rtx, rtx) = {
- gen_movbi_clr,
- gen_movqi_clr,
- gen_movhi_clr,
- gen_movsi_clr,
- gen_movdi_clr,
- gen_movsf_clr,
- gen_movdf_clr,
- gen_movxf_clr,
- gen_movti_clr,
- gen_zero_extendqidi2_clr,
- gen_zero_extendhidi2_clr,
- gen_zero_extendsidi2_clr,
+ mode_no = get_mode_no_for_insn (insn);
- gen_speculation_check_bi,
- gen_speculation_check_qi,
- gen_speculation_check_hi,
- gen_speculation_check_si,
- gen_speculation_check_di,
- gen_speculation_check_sf,
- gen_speculation_check_df,
- gen_speculation_check_xf,
- gen_speculation_check_ti,
- gen_speculation_check_di,
- gen_speculation_check_di,
- gen_speculation_check_di,
+ if (mode_no != SPEC_MODE_INVALID)
+ {
+ if (ia64_get_insn_spec_ds (insn) == ds_get_speculation_types (ts))
+ res = 0;
+ else
+ {
+ res = 1;
+ *new_pat = ia64_gen_spec_load (insn, ts, mode_no);
+ }
+ }
+ else
+ res = -1;
- gen_movbi_clr,
- gen_movqi_clr,
- gen_movhi_clr,
- gen_movsi_clr,
- gen_movdi_clr,
- gen_movsf_clr,
- gen_movdf_clr,
- gen_movxf_clr,
- gen_movti_clr,
- gen_zero_extendqidi2_clr,
- gen_zero_extendhidi2_clr,
- gen_zero_extendsidi2_clr,
+ return res;
+}
+/* Return a function that will generate a check for speculation TS with mode
+ MODE_NO.
+ If simple check is needed, pass true for SIMPLE_CHECK_P.
+ If clearing check is needed, pass true for CLEARING_CHECK_P. */
+static gen_func_t
+get_spec_check_gen_function (ds_t ts, int mode_no,
+ bool simple_check_p, bool clearing_check_p)
+{
+ static gen_func_t gen_ld_c_clr[] = {
gen_movbi_clr,
gen_movqi_clr,
gen_movhi_clr,
@@ -7200,36 +7561,22 @@ ia64_gen_check (rtx insn, rtx label, bool mutate_p)
gen_zero_extendqidi2_clr,
gen_zero_extendhidi2_clr,
gen_zero_extendsidi2_clr,
-
- gen_advanced_load_check_clr_bi,
- gen_advanced_load_check_clr_qi,
- gen_advanced_load_check_clr_hi,
- gen_advanced_load_check_clr_si,
- gen_advanced_load_check_clr_di,
- gen_advanced_load_check_clr_sf,
- gen_advanced_load_check_clr_df,
- gen_advanced_load_check_clr_xf,
- gen_advanced_load_check_clr_ti,
- gen_advanced_load_check_clr_di,
- gen_advanced_load_check_clr_di,
- gen_advanced_load_check_clr_di,
-
- /* Following checks are generated during mutation. */
- gen_advanced_load_check_clr_bi,
- gen_advanced_load_check_clr_qi,
- gen_advanced_load_check_clr_hi,
- gen_advanced_load_check_clr_si,
- gen_advanced_load_check_clr_di,
- gen_advanced_load_check_clr_sf,
- gen_advanced_load_check_clr_df,
- gen_advanced_load_check_clr_xf,
- gen_advanced_load_check_clr_ti,
- gen_advanced_load_check_clr_di,
- gen_advanced_load_check_clr_di,
- gen_advanced_load_check_clr_di,
-
- 0,0,0,0,0,0,0,0,0,0,0,0,
-
+ };
+ static gen_func_t gen_ld_c_nc[] = {
+ gen_movbi_nc,
+ gen_movqi_nc,
+ gen_movhi_nc,
+ gen_movsi_nc,
+ gen_movdi_nc,
+ gen_movsf_nc,
+ gen_movdf_nc,
+ gen_movxf_nc,
+ gen_movti_nc,
+ gen_zero_extendqidi2_nc,
+ gen_zero_extendhidi2_nc,
+ gen_zero_extendsidi2_nc,
+ };
+ static gen_func_t gen_chk_a_clr[] = {
gen_advanced_load_check_clr_bi,
gen_advanced_load_check_clr_qi,
gen_advanced_load_check_clr_hi,
@@ -7242,7 +7589,22 @@ ia64_gen_check (rtx insn, rtx label, bool mutate_p)
gen_advanced_load_check_clr_di,
gen_advanced_load_check_clr_di,
gen_advanced_load_check_clr_di,
-
+ };
+ static gen_func_t gen_chk_a_nc[] = {
+ gen_advanced_load_check_nc_bi,
+ gen_advanced_load_check_nc_qi,
+ gen_advanced_load_check_nc_hi,
+ gen_advanced_load_check_nc_si,
+ gen_advanced_load_check_nc_di,
+ gen_advanced_load_check_nc_sf,
+ gen_advanced_load_check_nc_df,
+ gen_advanced_load_check_nc_xf,
+ gen_advanced_load_check_nc_ti,
+ gen_advanced_load_check_nc_di,
+ gen_advanced_load_check_nc_di,
+ gen_advanced_load_check_nc_di,
+ };
+ static gen_func_t gen_chk_s[] = {
gen_speculation_check_bi,
gen_speculation_check_qi,
gen_speculation_check_hi,
@@ -7254,49 +7616,96 @@ ia64_gen_check (rtx insn, rtx label, bool mutate_p)
gen_speculation_check_ti,
gen_speculation_check_di,
gen_speculation_check_di,
- gen_speculation_check_di
+ gen_speculation_check_di,
};
- extract_insn_cached (insn);
+ gen_func_t *gen_check;
- if (label)
- {
- gcc_assert (mutate_p || ia64_needs_block_p (insn));
- op1 = label;
- }
- else
+ if (ts & BEGIN_DATA)
{
- gcc_assert (!mutate_p && !ia64_needs_block_p (insn));
- op1 = copy_rtx (recog_data.operand[1]);
+ /* We don't need recovery because even if this is ld.sa
+ ALAT entry will be allocated only if NAT bit is set to zero.
+ So it is enough to use ld.c here. */
+
+ if (simple_check_p)
+ {
+ gcc_assert (mflag_sched_spec_ldc);
+
+ if (clearing_check_p)
+ gen_check = gen_ld_c_clr;
+ else
+ gen_check = gen_ld_c_nc;
+ }
+ else
+ {
+ if (clearing_check_p)
+ gen_check = gen_chk_a_clr;
+ else
+ gen_check = gen_chk_a_nc;
+ }
}
-
- if (mutate_p)
- /* INSN is ld.c.
- Find the speculation check number by searching for original
- speculative load in the RESOLVED_DEPS list of INSN.
- As long as patterns are unique for each instruction, this can be
- accomplished by matching ORIG_PAT fields. */
+ else if (ts & BEGIN_CONTROL)
{
- sd_iterator_def sd_it;
- dep_t dep;
- int check_no = 0;
- rtx orig_pat = ORIG_PAT (insn);
-
- FOR_EACH_DEP (insn, SD_LIST_RES_BACK, sd_it, dep)
+ if (simple_check_p)
+ /* We might want to use ld.sa -> ld.c instead of
+ ld.s -> chk.s. */
{
- rtx x = DEP_PRO (dep);
+ gcc_assert (!ia64_needs_block_p (ts));
- if (ORIG_PAT (x) == orig_pat)
- check_no = spec_check_no[INSN_UID (x)];
+ if (clearing_check_p)
+ gen_check = gen_ld_c_clr;
+ else
+ gen_check = gen_ld_c_nc;
+ }
+ else
+ {
+ gen_check = gen_chk_s;
}
- gcc_assert (check_no);
+ }
+ else
+ gcc_unreachable ();
+
+ gcc_assert (mode_no >= 0);
+ return gen_check[mode_no];
+}
+
+/* Return nonzero, if INSN needs branchy recovery check. */
+static bool
+ia64_needs_block_p (ds_t ts)
+{
+ if (ts & BEGIN_DATA)
+ return !mflag_sched_spec_ldc;
+
+ gcc_assert ((ts & BEGIN_CONTROL) != 0);
+
+ return !(mflag_sched_spec_control_ldc && mflag_sched_spec_ldc);
+}
+
+/* Generate (or regenerate, if (MUTATE_P)) recovery check for INSN.
+ If (LABEL != 0 || MUTATE_P), generate branchy recovery check.
+ Otherwise, generate a simple check. */
+static rtx
+ia64_gen_spec_check (rtx insn, rtx label, ds_t ds)
+{
+ rtx op1, pat, check_pat;
+ gen_func_t gen_check;
+ int mode_no;
- spec_check_no[INSN_UID (insn)] = (check_no
- + SPEC_GEN_CHECK_MUTATION_OFFSET);
+ mode_no = get_mode_no_for_insn (insn);
+ gcc_assert (mode_no >= 0);
+
+ if (label)
+ op1 = label;
+ else
+ {
+ gcc_assert (!ia64_needs_block_p (ds));
+ op1 = copy_rtx (recog_data.operand[1]);
}
+
+ gen_check = get_spec_check_gen_function (ds, mode_no, label == NULL_RTX,
+ true);
- check_pat = (gen_check[spec_check_no[INSN_UID (insn)] - 1]
- (copy_rtx (recog_data.operand[0]), op1));
+ check_pat = gen_check (copy_rtx (recog_data.operand[0]), op1);
pat = PATTERN (insn);
if (GET_CODE (pat) == COND_EXEC)
@@ -7337,9 +7746,11 @@ ia64_spec_check_src_p (rtx src)
code = XINT (t, 1);
- if (code == UNSPEC_CHKACLR
- || code == UNSPEC_CHKS
- || code == UNSPEC_LDCCLR)
+ if (code == UNSPEC_LDCCLR
+ || code == UNSPEC_LDCNC
+ || code == UNSPEC_CHKACLR
+ || code == UNSPEC_CHKANC
+ || code == UNSPEC_CHKS)
{
gcc_assert (code != 0);
return code;
@@ -7370,6 +7781,7 @@ struct bundle_state
int accumulated_insns_num; /* number of all previous insns including
nops. L is considered as 2 insns */
int branch_deviation; /* deviation of previous branches from 3rd slots */
+ int middle_bundle_stops; /* number of stop bits in the middle of bundles */
struct bundle_state *next; /* next state with the same insn_num */
struct bundle_state *originator; /* originator (previous insn state) */
/* All bundle states are in the following chain. */
@@ -7515,9 +7927,15 @@ insert_bundle_state (struct bundle_state *bundle_state)
|| (((struct bundle_state *)
*entry_ptr)->accumulated_insns_num
== bundle_state->accumulated_insns_num
- && ((struct bundle_state *)
- *entry_ptr)->branch_deviation
- > bundle_state->branch_deviation))))
+ && (((struct bundle_state *)
+ *entry_ptr)->branch_deviation
+ > bundle_state->branch_deviation
+ || (((struct bundle_state *)
+ *entry_ptr)->branch_deviation
+ == bundle_state->branch_deviation
+ && ((struct bundle_state *)
+ *entry_ptr)->middle_bundle_stops
+ > bundle_state->middle_bundle_stops))))))
{
struct bundle_state temp;
@@ -7611,6 +8029,7 @@ issue_nops_and_insn (struct bundle_state *originator, int before_nops_num,
curr_state->accumulated_insns_num
= originator->accumulated_insns_num + before_nops_num;
curr_state->branch_deviation = originator->branch_deviation;
+ curr_state->middle_bundle_stops = originator->middle_bundle_stops;
gcc_assert (insn);
if (INSN_CODE (insn) == CODE_FOR_insn_group_barrier)
{
@@ -7620,6 +8039,8 @@ issue_nops_and_insn (struct bundle_state *originator, int before_nops_num,
if (!try_issue_insn (curr_state, insn))
return;
memcpy (temp_dfa_state, curr_state->dfa_state, dfa_state_size);
+ if (curr_state->accumulated_insns_num % 3 != 0)
+ curr_state->middle_bundle_stops++;
if (state_transition (temp_dfa_state, dfa_pre_cycle_insn) >= 0
&& curr_state->accumulated_insns_num % 3 != 0)
{
@@ -7795,6 +8216,16 @@ get_template (state_t state, int pos)
}
}
+/* True when INSN is important for bundling. */
+static bool
+important_for_bundling_p (rtx insn)
+{
+ return (INSN_P (insn)
+ && ia64_safe_itanium_class (insn) != ITANIUM_CLASS_IGNORE
+ && GET_CODE (PATTERN (insn)) != USE
+ && GET_CODE (PATTERN (insn)) != CLOBBER);
+}
+
/* The following function returns an insn important for insn bundling
followed by INSN and before TAIL. */
@@ -7802,10 +8233,7 @@ static rtx
get_next_important_insn (rtx insn, rtx tail)
{
for (; insn && insn != tail; insn = NEXT_INSN (insn))
- if (INSN_P (insn)
- && ia64_safe_itanium_class (insn) != ITANIUM_CLASS_IGNORE
- && GET_CODE (PATTERN (insn)) != USE
- && GET_CODE (PATTERN (insn)) != CLOBBER)
+ if (important_for_bundling_p (insn))
return insn;
return NULL_RTX;
}
@@ -7930,6 +8358,7 @@ bundling (FILE *dump, int verbose, rtx prev_head_insn, rtx tail)
curr_state->cost = 0;
curr_state->accumulated_insns_num = 0;
curr_state->branch_deviation = 0;
+ curr_state->middle_bundle_stops = 0;
curr_state->next = NULL;
curr_state->originator = NULL;
state_reset (curr_state->dfa_state);
@@ -7952,7 +8381,8 @@ bundling (FILE *dump, int verbose, rtx prev_head_insn, rtx tail)
if (INSN_P (next_insn)
&& ia64_safe_itanium_class (next_insn) != ITANIUM_CLASS_IGNORE
&& GET_CODE (PATTERN (next_insn)) != USE
- && GET_CODE (PATTERN (next_insn)) != CLOBBER)
+ && GET_CODE (PATTERN (next_insn)) != CLOBBER
+ && INSN_CODE (next_insn) != CODE_FOR_insn_group_barrier)
{
PUT_MODE (next_insn, TImode);
break;
@@ -8026,13 +8456,14 @@ bundling (FILE *dump, int verbose, rtx prev_head_insn, rtx tail)
fprintf
(dump,
- "// Bundle state %d (orig %d, cost %d, nops %d/%d, insns %d, branch %d, state %d) for %d\n",
+             "// Bundle state %d (orig %d, cost %d, nops %d/%d, insns %d, branch %d, mid.stops %d, state %d) for %d\n",
curr_state->unique_num,
(curr_state->originator == NULL
? -1 : curr_state->originator->unique_num),
curr_state->cost,
curr_state->before_nops_num, curr_state->after_nops_num,
curr_state->accumulated_insns_num, curr_state->branch_deviation,
+ curr_state->middle_bundle_stops,
(ia64_tune == PROCESSOR_ITANIUM
? ((struct DFA_chip *) curr_state->dfa_state)->oneb_automaton_state
: ((struct DFA_chip *) curr_state->dfa_state)->twob_automaton_state),
@@ -8059,10 +8490,15 @@ bundling (FILE *dump, int verbose, rtx prev_head_insn, rtx tail)
< best_state->accumulated_insns_num
|| (curr_state->accumulated_insns_num
== best_state->accumulated_insns_num
- && curr_state->branch_deviation
- < best_state->branch_deviation)))))
+ && (curr_state->branch_deviation
+ < best_state->branch_deviation
+ || (curr_state->branch_deviation
+ == best_state->branch_deviation
+ && curr_state->middle_bundle_stops
+ < best_state->middle_bundle_stops)))))))
best_state = curr_state;
/* Second (backward) pass: adding nops and templates. */
+ gcc_assert (best_state);
insn_num = best_state->before_nops_num;
template0 = template1 = -1;
for (curr_state = best_state;
@@ -8085,13 +8521,14 @@ bundling (FILE *dump, int verbose, rtx prev_head_insn, rtx tail)
fprintf
(dump,
- "// Best %d (orig %d, cost %d, nops %d/%d, insns %d, branch %d, state %d) for %d\n",
+ "// Best %d (orig %d, cost %d, nops %d/%d, insns %d, branch %d, mid.stops %d, state %d) for %d\n",
curr_state->unique_num,
(curr_state->originator == NULL
? -1 : curr_state->originator->unique_num),
curr_state->cost,
curr_state->before_nops_num, curr_state->after_nops_num,
curr_state->accumulated_insns_num, curr_state->branch_deviation,
+ curr_state->middle_bundle_stops,
(ia64_tune == PROCESSOR_ITANIUM
? ((struct DFA_chip *) curr_state->dfa_state)->oneb_automaton_state
: ((struct DFA_chip *) curr_state->dfa_state)->twob_automaton_state),
@@ -8290,6 +8727,57 @@ bundling (FILE *dump, int verbose, rtx prev_head_insn, rtx tail)
insn);
}
}
+
+#ifdef ENABLE_CHECKING
+ {
+ /* Assert right calculation of middle_bundle_stops. */
+ int num = best_state->middle_bundle_stops;
+ bool start_bundle = true, end_bundle = false;
+
+ for (insn = NEXT_INSN (prev_head_insn);
+ insn && insn != tail;
+ insn = NEXT_INSN (insn))
+ {
+ if (!INSN_P (insn))
+ continue;
+ if (recog_memoized (insn) == CODE_FOR_bundle_selector)
+ start_bundle = true;
+ else
+ {
+ rtx next_insn;
+
+ for (next_insn = NEXT_INSN (insn);
+ next_insn && next_insn != tail;
+ next_insn = NEXT_INSN (next_insn))
+ if (INSN_P (next_insn)
+ && (ia64_safe_itanium_class (next_insn)
+ != ITANIUM_CLASS_IGNORE
+ || recog_memoized (next_insn)
+ == CODE_FOR_bundle_selector)
+ && GET_CODE (PATTERN (next_insn)) != USE
+ && GET_CODE (PATTERN (next_insn)) != CLOBBER)
+ break;
+
+ end_bundle = next_insn == NULL_RTX
+ || next_insn == tail
+ || (INSN_P (next_insn)
+ && recog_memoized (next_insn)
+ == CODE_FOR_bundle_selector);
+ if (recog_memoized (insn) == CODE_FOR_insn_group_barrier
+ && !start_bundle && !end_bundle
+ && next_insn
+ && GET_CODE (PATTERN (next_insn)) != ASM_INPUT
+ && asm_noperands (PATTERN (next_insn)) < 0)
+ num--;
+
+ start_bundle = false;
+ }
+ }
+
+ gcc_assert (num == 0);
+ }
+#endif
+
free (index_to_bundle_states);
finish_bundle_state_table ();
bundling_p = 0;
@@ -8327,6 +8815,7 @@ final_emit_insn_group_barriers (FILE *dump ATTRIBUTE_UNUSED)
{
rtx insn;
int need_barrier_p = 0;
+ int seen_good_insn = 0;
rtx prev_insn = NULL_RTX;
init_insn_group_barriers ();
@@ -8348,6 +8837,7 @@ final_emit_insn_group_barriers (FILE *dump ATTRIBUTE_UNUSED)
emit_insn_after (gen_insn_group_barrier (GEN_INT (3)), last);
init_insn_group_barriers ();
+ seen_good_insn = 0;
need_barrier_p = 0;
prev_insn = NULL_RTX;
}
@@ -8356,10 +8846,14 @@ final_emit_insn_group_barriers (FILE *dump ATTRIBUTE_UNUSED)
if (recog_memoized (insn) == CODE_FOR_insn_group_barrier)
{
init_insn_group_barriers ();
+ seen_good_insn = 0;
need_barrier_p = 0;
prev_insn = NULL_RTX;
}
- else if (need_barrier_p || group_barrier_needed (insn))
+ else if (need_barrier_p || group_barrier_needed (insn)
+ || (mflag_sched_stop_bits_after_every_cycle
+ && GET_MODE (insn) == TImode
+ && seen_good_insn))
{
if (TARGET_EARLY_STOP_BITS)
{
@@ -8383,19 +8877,32 @@ final_emit_insn_group_barriers (FILE *dump ATTRIBUTE_UNUSED)
last != insn;
last = NEXT_INSN (last))
if (INSN_P (last))
- group_barrier_needed (last);
+ {
+ group_barrier_needed (last);
+ if (recog_memoized (last) >= 0
+ && important_for_bundling_p (last))
+ seen_good_insn = 1;
+ }
}
else
{
emit_insn_before (gen_insn_group_barrier (GEN_INT (3)),
insn);
init_insn_group_barriers ();
+ seen_good_insn = 0;
}
group_barrier_needed (insn);
+ if (recog_memoized (insn) >= 0
+ && important_for_bundling_p (insn))
+ seen_good_insn = 1;
prev_insn = NULL_RTX;
}
- else if (recog_memoized (insn) >= 0)
- prev_insn = insn;
+ else if (recog_memoized (insn) >= 0
+ && important_for_bundling_p (insn))
+ {
+ prev_insn = insn;
+ seen_good_insn = 1;
+ }
need_barrier_p = (GET_CODE (insn) == CALL_INSN
|| GET_CODE (PATTERN (insn)) == ASM_INPUT
|| asm_noperands (PATTERN (insn)) >= 0);
@@ -8504,7 +9011,8 @@ ia64_ld_address_bypass_p (rtx producer, rtx consumer)
{
int c = XINT (mem, 1);
- gcc_assert (c == UNSPEC_LDA || c == UNSPEC_LDS || c == UNSPEC_LDSA);
+ gcc_assert (c == UNSPEC_LDA || c == UNSPEC_LDS || c == UNSPEC_LDS_A
+ || c == UNSPEC_LDSA);
mem = XVECEXP (mem, 0, 0);
}
@@ -8601,7 +9109,8 @@ ia64_reorg (void)
if (optimize == 0)
split_all_insns ();
- if (optimize && ia64_flag_schedule_insns2 && dbg_cnt (ia64_sched2))
+ if (optimize && ia64_flag_schedule_insns2
+ && dbg_cnt (ia64_sched2))
{
timevar_push (TV_SCHED2);
ia64_final_schedule = 1;
@@ -8675,7 +9184,16 @@ ia64_reorg (void)
_1mfb_ = get_cpu_unit_code ("1b_1mfb.");
_1mlx_ = get_cpu_unit_code ("1b_1mlx.");
}
- schedule_ebbs ();
+
+ if (flag_selective_scheduling2
+ && !maybe_skip_selective_scheduling ())
+ run_selective_scheduling ();
+ else
+ schedule_ebbs ();
+
+  /* Redo alignment computation, as it might have gone wrong.  */
+ compute_alignments ();
+
/* We cannot reuse this one because it has been corrupted by the
evil glat. */
finish_bundle_states ();
@@ -9932,6 +10450,7 @@ ia64_optimization_options (int level ATTRIBUTE_UNUSED,
set_param_value ("simultaneous-prefetches", 6);
set_param_value ("l1-cache-line-size", 32);
+  set_param_value ("sched-mem-true-dep-cost", 4);
}
/* HP-UX version_id attribute.
diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index cf746f5..b03032d 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -82,12 +82,15 @@
(UNSPEC_VECT_EXTR 31)
(UNSPEC_LDA 40)
(UNSPEC_LDS 41)
- (UNSPEC_LDSA 42)
- (UNSPEC_LDCCLR 43)
- (UNSPEC_CHKACLR 45)
- (UNSPEC_CHKS 47)
- (UNSPEC_FR_RECIP_APPROX_RES 48)
- (UNSPEC_FR_SQRT_RECIP_APPROX_RES 49)
+ (UNSPEC_LDS_A 42)
+ (UNSPEC_LDSA 43)
+ (UNSPEC_LDCCLR 44)
+ (UNSPEC_LDCNC 45)
+ (UNSPEC_CHKACLR 46)
+ (UNSPEC_CHKANC 47)
+ (UNSPEC_CHKS 48)
+ (UNSPEC_FR_RECIP_APPROX_RES 49)
+ (UNSPEC_FR_SQRT_RECIP_APPROX_RES 50)
])
(define_constants
@@ -185,6 +188,10 @@
(define_attr "control_speculative" "no,yes" (const_string "no"))
(define_attr "check_load" "no,yes" (const_string "no"))
+
+(define_attr "speculable1" "no,yes" (const_string "no"))
+
+(define_attr "speculable2" "no,yes" (const_string "no"))
;; DFA descriptions of ia64 processors used for insn scheduling and
;; bundling.
@@ -234,7 +241,9 @@
ld1%O1 %0 = %1%P1
st1%Q0 %0 = %1%P0
mov %0 = %1"
- [(set_attr "itanium_class" "icmp,icmp,unknown,unknown,tbit,ialu,ld,st,ialu")])
+ [(set_attr "itanium_class" "icmp,icmp,unknown,unknown,tbit,ialu,ld,st,ialu")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, no, no, no, no, no, yes,no,no")])
(define_split
[(set (match_operand:BI 0 "register_operand" "")
@@ -273,7 +282,7 @@
operands[1] = op1;
})
-(define_insn "*movqi_internal"
+(define_insn "movqi_internal"
[(set (match_operand:QI 0 "destination_operand" "=r,r,r, m, r,*f,*f")
(match_operand:QI 1 "move_operand" "rO,J,m,rO,*f,rO,*f"))]
"ia64_move_ok (operands[0], operands[1])"
@@ -285,7 +294,9 @@
getf.sig %0 = %1
setf.sig %0 = %r1
mov %0 = %1"
- [(set_attr "itanium_class" "ialu,ialu,ld,st,frfr,tofr,fmisc")])
+ [(set_attr "itanium_class" "ialu,ialu,ld,st,frfr,tofr,fmisc")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, no, yes,no,no, no, no")])
(define_expand "movhi"
[(set (match_operand:HI 0 "general_operand" "")
@@ -298,7 +309,7 @@
operands[1] = op1;
})
-(define_insn "*movhi_internal"
+(define_insn "movhi_internal"
[(set (match_operand:HI 0 "destination_operand" "=r,r,r, m, r,*f,*f")
(match_operand:HI 1 "move_operand" "rO,J,m,rO,*f,rO,*f"))]
"ia64_move_ok (operands[0], operands[1])"
@@ -310,7 +321,9 @@
getf.sig %0 = %1
setf.sig %0 = %r1
mov %0 = %1"
- [(set_attr "itanium_class" "ialu,ialu,ld,st,frfr,tofr,fmisc")])
+ [(set_attr "itanium_class" "ialu,ialu,ld,st,frfr,tofr,fmisc")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, no, yes,no,no, no, no")])
(define_expand "movsi"
[(set (match_operand:SI 0 "general_operand" "")
@@ -323,7 +336,7 @@
operands[1] = op1;
})
-(define_insn "*movsi_internal"
+(define_insn "movsi_internal"
[(set (match_operand:SI 0 "destination_operand" "=r,r,r,r,r, m, r,*f,*f, r,*d")
(match_operand:SI 1 "move_operand" "rO,J,j,i,m,rO,*f,rO,*f,*d,rK"))]
"ia64_move_ok (operands[0], operands[1])"
@@ -340,7 +353,9 @@
mov %0 = %1
mov %0 = %r1"
;; frar_m, toar_m ??? why not frar_i and toar_i
- [(set_attr "itanium_class" "ialu,ialu,ialu,long_i,ld,st,frfr,tofr,fmisc,frar_m,toar_m")])
+ [(set_attr "itanium_class" "ialu,ialu,ialu,long_i,ld,st,frfr,tofr,fmisc,frar_m,toar_m")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, no, no, no, yes,no,no, no, no, no, no")])
(define_expand "movdi"
[(set (match_operand:DI 0 "general_operand" "")
@@ -353,7 +368,7 @@
operands[1] = op1;
})
-(define_insn "*movdi_internal"
+(define_insn "movdi_internal"
[(set (match_operand:DI 0 "destination_operand"
"=r,r,r,r,r, m, r,*f,*f,*f, Q, r,*b, r,*e, r,*d, r,*c")
(match_operand:DI 1 "move_operand"
@@ -387,7 +402,9 @@
return alt[which_alternative];
}
- [(set_attr "itanium_class" "ialu,ialu,ialu,long_i,ld,st,frfr,tofr,fmisc,fld,stf,frbr,tobr,frar_i,toar_i,frar_m,toar_m,frpr,topr")])
+ [(set_attr "itanium_class" "ialu,ialu,ialu,long_i,ld,st,frfr,tofr,fmisc,fld,stf,frbr,tobr,frar_i,toar_i,frar_m,toar_m,frpr,topr")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, no, no, no, yes,no,no, no, no, yes,no, no, no, no, no, no, no, no, no")])
(define_mode_iterator MODE [BI QI HI SI DI SF DF XF TI])
(define_mode_iterator MODE_FOR_EXTEND [QI HI SI])
@@ -472,6 +489,26 @@
(XF "ldfe.c.clr %0 = %1%P1")
(TI "ldfp8.c.clr %X0 = %1%P1")])
+(define_mode_attr output_c_nc [
+ (BI "ld1.c.nc%O1 %0 = %1%P1")
+ (QI "ld1.c.nc%O1 %0 = %1%P1")
+ (HI "ld2.c.nc%O1 %0 = %1%P1")
+ (SI "ld4.c.nc%O1 %0 = %1%P1")
+ (DI
+ "@
+ ld8.c.nc%O1 %0 = %1%P1
+ ldf8.c.nc %0 = %1%P1")
+ (SF
+ "@
+ ldfs.c.nc %0 = %1%P1
+ ld4.c.nc%O1 %0 = %1%P1")
+ (DF
+ "@
+ ldfd.c.nc %0 = %1%P1
+ ld8.c.nc%O1 %0 = %1%P1")
+ (XF "ldfe.c.nc %0 = %1%P1")
+ (TI "ldfp8.c.nc %X0 = %1%P1")])
+
(define_mode_attr ld_reg_constr [(BI "=*r") (QI "=r") (HI "=r") (SI "=r") (DI "=r,*f") (SF "=f,*r") (DF "=f,*r") (XF "=f") (TI "=*x")])
(define_mode_attr ldc_reg_constr [(BI "+*r") (QI "+r") (HI "+r") (SI "+r") (DI "+r,*f") (SF "+f,*r") (DF "+f,*r") (XF "+f") (TI "+*x")])
(define_mode_attr chk_reg_constr [(BI "*r") (QI "r") (HI "r") (SI "r") (DI "r,*f") (SF "f,*r") (DF "f,*r") (XF "f") (TI "*x")])
@@ -530,6 +567,15 @@
(set_attr "data_speculative" "<attr_yes>")
(set_attr "control_speculative" "<attr_yes>")])
+(define_insn "mov<mode>_speculative_a"
+ [(set (match_operand:MODE 0 "<reg_pred_prefix>_register_operand" "<ld_reg_constr>")
+ (unspec:MODE [(match_operand:MODE 1 "memory_operand" "<mem_constr>")] UNSPEC_LDS_A))]
+ "ia64_move_ok (operands[0], operands[1])"
+ "<output_sa>"
+ [(set_attr "itanium_class" "<ld_class>")
+ (set_attr "data_speculative" "<attr_yes>")
+ (set_attr "control_speculative" "<attr_yes>")])
+
(define_insn "zero_extend<mode>di2_speculative_advanced"
[(set (match_operand:DI 0 "gr_register_operand" "=r")
(zero_extend:DI (unspec:MODE_FOR_EXTEND [(match_operand:MODE_FOR_EXTEND 1 "memory_operand" "<mem_constr>")] UNSPEC_LDSA)))]
@@ -539,6 +585,15 @@
(set_attr "data_speculative" "<attr_yes>")
(set_attr "control_speculative" "<attr_yes>")])
+(define_insn "zero_extend<mode>di2_speculative_a"
+ [(set (match_operand:DI 0 "gr_register_operand" "=r")
+ (zero_extend:DI (unspec:MODE_FOR_EXTEND [(match_operand:MODE_FOR_EXTEND 1 "memory_operand" "<mem_constr>")] UNSPEC_LDS_A)))]
+ ""
+ "<output_sa>"
+ [(set_attr "itanium_class" "<ld_class>")
+ (set_attr "data_speculative" "<attr_yes>")
+ (set_attr "control_speculative" "<attr_yes>")])
+
(define_insn "mov<mode>_clr"
[(set (match_operand:MODE 0 "<reg_pred_prefix>_register_operand" "<ldc_reg_constr>")
(if_then_else:MODE (ne (unspec [(match_dup 0)] UNSPEC_LDCCLR) (const_int 0))
@@ -549,6 +604,16 @@
[(set_attr "itanium_class" "<ld_class>")
(set_attr "check_load" "<attr_yes>")])
+(define_insn "mov<mode>_nc"
+ [(set (match_operand:MODE 0 "<reg_pred_prefix>_register_operand" "<ldc_reg_constr>")
+ (if_then_else:MODE (ne (unspec [(match_dup 0)] UNSPEC_LDCNC) (const_int 0))
+ (match_operand:MODE 1 "memory_operand" "<mem_constr>")
+ (match_dup 0)))]
+ "ia64_move_ok (operands[0], operands[1])"
+ "<output_c_nc>"
+ [(set_attr "itanium_class" "<ld_class>")
+ (set_attr "check_load" "<attr_yes>")])
+
(define_insn "zero_extend<mode>di2_clr"
[(set (match_operand:DI 0 "gr_register_operand" "+r")
(if_then_else:DI (ne (unspec [(match_dup 0)] UNSPEC_LDCCLR) (const_int 0))
@@ -559,6 +624,16 @@
[(set_attr "itanium_class" "<ld_class>")
(set_attr "check_load" "<attr_yes>")])
+(define_insn "zero_extend<mode>di2_nc"
+ [(set (match_operand:DI 0 "gr_register_operand" "+r")
+ (if_then_else:DI (ne (unspec [(match_dup 0)] UNSPEC_LDCNC) (const_int 0))
+ (zero_extend:DI (match_operand:MODE_FOR_EXTEND 1 "memory_operand" "<mem_constr>"))
+ (match_dup 0)))]
+ ""
+ "<output_c_nc>"
+ [(set_attr "itanium_class" "<ld_class>")
+ (set_attr "check_load" "<attr_yes>")])
+
(define_insn "advanced_load_check_clr_<mode>"
[(set (pc)
(if_then_else (ne (unspec [(match_operand:MODE 0 "<reg_pred_prefix>_register_operand" "<chk_reg_constr>")] UNSPEC_CHKACLR) (const_int 0))
@@ -568,6 +643,15 @@
"chk.a.clr %0, %l1"
[(set_attr "itanium_class" "<chka_class>")])
+(define_insn "advanced_load_check_nc_<mode>"
+ [(set (pc)
+ (if_then_else (ne (unspec [(match_operand:MODE 0 "<reg_pred_prefix>_register_operand" "<chk_reg_constr>")] UNSPEC_CHKANC) (const_int 0))
+ (pc)
+ (label_ref (match_operand 1 "" ""))))]
+ ""
+ "chk.a.clr %0, %l1"
+ [(set_attr "itanium_class" "<chka_class>")])
+
(define_insn "speculation_check_<mode>"
[(set (pc)
(if_then_else (ne (unspec [(match_operand:MODE 0 "<reg_pred_prefix>_register_operand" "<chk_reg_constr>")] UNSPEC_CHKS) (const_int 0))
@@ -863,7 +947,7 @@
operands[1] = op1;
})
-(define_insn_and_split "*movti_internal"
+(define_insn_and_split "movti_internal"
[(set (match_operand:TI 0 "destination_operand" "=r, *fm,*x,*f, Q")
(match_operand:TI 1 "general_operand" "r*fim,r, Q, *fOQ,*f"))]
"ia64_move_ok (operands[0], operands[1])"
@@ -879,7 +963,9 @@
ia64_split_tmode_move (operands);
DONE;
}
- [(set_attr "itanium_class" "unknown,unknown,fldp,unknown,unknown")])
+ [(set_attr "itanium_class" "unknown,unknown,fldp,unknown,unknown")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, no, yes, no, no")])
;; Floating Point Moves
;;
@@ -897,7 +983,7 @@
operands[1] = op1;
})
-(define_insn "*movsf_internal"
+(define_insn "movsf_internal"
[(set (match_operand:SF 0 "destination_operand" "=f,f, Q,*r, f,*r,*r, m")
(match_operand:SF 1 "general_operand" "fG,Q,fG,fG,*r,*r, m,*r"))]
"ia64_move_ok (operands[0], operands[1])"
@@ -910,7 +996,9 @@
mov %0 = %1
ld4%O1 %0 = %1%P1
st4%Q0 %0 = %1%P0"
- [(set_attr "itanium_class" "fmisc,fld,stf,frfr,tofr,ialu,ld,st")])
+ [(set_attr "itanium_class" "fmisc,fld,stf,frfr,tofr,ialu,ld,st")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, yes,no, no, no, no, yes,no")])
(define_expand "movdf"
[(set (match_operand:DF 0 "general_operand" "")
@@ -923,7 +1011,7 @@
operands[1] = op1;
})
-(define_insn "*movdf_internal"
+(define_insn "movdf_internal"
[(set (match_operand:DF 0 "destination_operand" "=f,f, Q,*r, f,*r,*r, m")
(match_operand:DF 1 "general_operand" "fG,Q,fG,fG,*r,*r, m,*r"))]
"ia64_move_ok (operands[0], operands[1])"
@@ -936,7 +1024,9 @@
mov %0 = %1
ld8%O1 %0 = %1%P1
st8%Q0 %0 = %1%P0"
- [(set_attr "itanium_class" "fmisc,fld,stf,frfr,tofr,ialu,ld,st")])
+ [(set_attr "itanium_class" "fmisc,fld,stf,frfr,tofr,ialu,ld,st")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, yes,no, no, no, no, yes,no")])
;; With no offsettable memory references, we've got to have a scratch
;; around to play with the second word if the variable winds up in GRs.
@@ -951,7 +1041,7 @@
;; ??? There's no easy way to mind volatile acquire/release semantics.
-(define_insn "*movxf_internal"
+(define_insn "movxf_internal"
[(set (match_operand:XF 0 "destination_operand" "=f,f, m")
(match_operand:XF 1 "general_operand" "fG,m,fG"))]
"ia64_move_ok (operands[0], operands[1])"
@@ -959,7 +1049,9 @@
mov %0 = %F1
ldfe %0 = %1%P1
stfe %0 = %F1%P0"
- [(set_attr "itanium_class" "fmisc,fld,stf")])
+ [(set_attr "itanium_class" "fmisc,fld,stf")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, yes,no")])
;; Same as for movxf, but for RFmode.
(define_expand "movrf"
@@ -1049,7 +1141,9 @@
"@
zxt1 %0 = %1
ld1%O1 %0 = %1%P1"
- [(set_attr "itanium_class" "xtd,ld")])
+ [(set_attr "itanium_class" "xtd,ld")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, yes")])
(define_insn "zero_extendhidi2"
[(set (match_operand:DI 0 "gr_register_operand" "=r,r")
@@ -1058,7 +1152,9 @@
"@
zxt2 %0 = %1
ld2%O1 %0 = %1%P1"
- [(set_attr "itanium_class" "xtd,ld")])
+ [(set_attr "itanium_class" "xtd,ld")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, yes")])
(define_insn "zero_extendsidi2"
[(set (match_operand:DI 0 "grfr_register_operand" "=r,r,?f")
@@ -1069,7 +1165,9 @@
addp4 %0 = %1, r0
ld4%O1 %0 = %1%P1
fmix.r %0 = f0, %1"
- [(set_attr "itanium_class" "ialu,ld,fmisc")])
+ [(set_attr "itanium_class" "ialu,ld,fmisc")
+ (set_attr "speculable1" "yes")
+ (set_attr "speculable2" "no, yes,no")])
;; Convert between floating point types of different sizes.
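
The two new attributes added above, speculable1 and speculable2, annotate the
move patterns: speculable1 marks a pattern as having speculative variants at
all, while speculable2 marks, per constraint alternative, the alternatives
that actually load from memory.  genattrtab turns each define_attr into an
enum and a get_attr_* accessor, so backend code can query them roughly as
sketched below (a sketch only; the helper is hypothetical and assumes the
usual backend includes such as config.h, system.h, rtl.h, recog.h and
insn-attr.h):

/* Hypothetical helper: is the chosen alternative of INSN a memory load
   belonging to a pattern that has speculative variants?  */
static bool
speculable_load_alternative_p (rtx insn)
{
  if (recog_memoized (insn) < 0)
    return false;                       /* unrecognized pattern */

  /* Per-alternative attributes depend on which_alternative, so extract
     the operands and match the constraints first.  */
  extract_constrain_insn_cached (insn);

  return (get_attr_speculable1 (insn) == SPECULABLE1_YES
          && get_attr_speculable2 (insn) == SPECULABLE2_YES);
}
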
diff --git a/gcc/config/ia64/ia64.opt b/gcc/config/ia64/ia64.opt
index 2ff3d59..f1da7f3 100644
--- a/gcc/config/ia64/ia64.opt
+++ b/gcc/config/ia64/ia64.opt
@@ -105,7 +105,7 @@ Target Report Var(mflag_sched_ar_data_spec) Init(1)
Use data speculation after reload
msched-control-spec
-Target Report Var(mflag_sched_control_spec) Init(0)
+Target Report Var(mflag_sched_control_spec) Init(2)
Use control speculation
msched-br-in-data-spec
@@ -120,28 +120,44 @@ msched-in-control-spec
Target Report Var(mflag_sched_in_control_spec) Init(1)
Use in block control speculation
-msched-ldc
-Target Report Var(mflag_sched_ldc) Init(1)
+msched-spec-ldc
+Target Report Var(mflag_sched_spec_ldc) Init(1)
Use simple data speculation check
-msched-control-ldc
-Target Report Var(mflag_control_ldc) Init(0)
+msched-spec-control-ldc
+Target Report Var(mflag_sched_spec_control_ldc) Init(0)
Use simple data speculation check for control speculation
-msched-spec-verbose
-Common Report Var(mflag_sched_spec_verbose) Init(0)
-Print information about speculative motions.
-
msched-prefer-non-data-spec-insns
-Common Report Var(mflag_sched_prefer_non_data_spec_insns) Init(0)
+Target Report Var(mflag_sched_prefer_non_data_spec_insns) Init(0)
If set, data speculative instructions will be chosen for schedule only if there are no other choices at the moment
msched-prefer-non-control-spec-insns
-Common Report Var(mflag_sched_prefer_non_control_spec_insns) Init(0)
+Target Report Var(mflag_sched_prefer_non_control_spec_insns) Init(0)
If set, control speculative instructions will be chosen for schedule only if there are no other choices at the moment
msched-count-spec-in-critical-path
-Common Report Var(mflag_sched_count_spec_in_critical_path) Init(0)
+Target Report Var(mflag_sched_count_spec_in_critical_path) Init(0)
Count speculative dependencies while calculating priority of instructions
+msched-stop-bits-after-every-cycle
+Target Report Var(mflag_sched_stop_bits_after_every_cycle) Init(1)
+Place a stop bit after every cycle when scheduling
+
+msched-fp-mem-deps-zero-cost
+Target Report Var(mflag_sched_fp_mem_deps_zero_cost) Init(0)
+Assume that floating-point stores and loads are not likely to cause conflict when placed into one instruction group
+
+msched-max-memory-insns=
+Target RejectNegative Joined UInteger Var(ia64_max_memory_insns) Init(1)
+Soft limit on number of memory insns per instruction group, giving lower priority to subsequent memory insns attempting to schedule in the same insn group. Frequently useful to prevent cache bank conflicts. Default value is 1
+
+msched-max-memory-insns-hard-limit
+Target Report Var(mflag_sched_mem_insns_hard_limit) Init(0)
+Disallow more than `msched-max-memory-insns' in instruction group. Otherwise, limit is `soft' (prefer non-memory operations when limit is reached)
+
+msel-sched-dont-check-control-spec
+Target Report Var(mflag_sel_sched_dont_check_control_spec) Init(0)
+Don't generate checks for control speculation in selective scheduling
+
; This comment is to ensure we retain the blank line above.
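
The msched-max-memory-insns / msched-max-memory-insns-hard-limit pair above
distinguishes a soft limit (further memory insns in the same instruction
group are merely deprioritized) from a hard one (they are kept out of the
group altogether).  A stand-alone sketch of that distinction, with
illustrative names that are not taken from the patch:

enum mem_insn_disposition { MEM_OK, MEM_DEPRIORITIZE, MEM_REFUSE };

/* MEM_OPS_IN_GROUP is the number of memory insns already issued in the
   current instruction group, MAX_MEM_INSNS the value of
   msched-max-memory-insns, HARD_LIMIT_P the hard-limit flag.  */
static enum mem_insn_disposition
classify_memory_insn (int mem_ops_in_group, int max_mem_insns,
                      int hard_limit_p)
{
  if (mem_ops_in_group < max_mem_insns)
    return MEM_OK;               /* still under the per-group limit */

  return hard_limit_p
         ? MEM_REFUSE            /* hard: keep it out of this group */
         : MEM_DEPRIORITIZE;     /* soft: schedulable, but ranked last */
}
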
diff --git a/gcc/config/ia64/itanium2.md b/gcc/config/ia64/itanium2.md
index e8690dc..26dbb50 100644
--- a/gcc/config/ia64/itanium2.md
+++ b/gcc/config/ia64/itanium2.md
@@ -1072,14 +1072,13 @@
(define_bypass 3 "2_ialu" "2_mmalua,2_mmmul,2_mmshf")
(define_bypass 3 "2_mmalua,2_mmmul,2_mmshf" "2_ialu,2_ilog,2_ishf,2_st,2_ld,2_ldc")
(define_bypass 6 "2_tofr" "2_frfr,2_stf")
-(define_bypass 7 "2_fmac" "2_frfr,2_stf")
;; We don't use here fcmp because scall may be predicated.
(define_bypass 0 "2_fcvtfx,2_fld,2_flda,2_fldc,2_fmac,2_fmisc,2_frar_i,2_frar_m,\
2_frbr,2_frfr,2_frpr,2_ialu,2_ilog,2_ishf,2_ld,2_ldc,2_long_i,\
- 2_mmalua,2_mmmul,2_mmshf,2_mmshfi,2_toar_m,2_tofr,\
+ 2_mmalua,2_mmmul,2_mmshf,2_mmshfi,2_toar_m,2_tobr,2_tofr,\
2_xmpy,2_xtd"
- "2_scall")
+ "2_br,2_scall")
(define_bypass 0 "2_unknown,2_ignore,2_stop_bit,2_br,2_fcmp,2_fcvtfx,2_fld,2_flda,2_fldc,\
2_fmac,2_fmisc,2_frar_i,2_frar_m,2_frbr,2_frfr,2_frpr,\
diff --git a/gcc/config/ia64/t-ia64 b/gcc/config/ia64/t-ia64
index 10a5cf3..43f3d8e 100644
--- a/gcc/config/ia64/t-ia64
+++ b/gcc/config/ia64/t-ia64
@@ -33,4 +33,4 @@ ia64-c.o: $(srcdir)/config/ia64/ia64-c.c $(CONFIG_H) $(SYSTEM_H) \
# genattrtab generates very long string literals.
insn-attrtab.o-warn = -Wno-error
-ia64.o: debug.h $(PARAMS_H)
+ia64.o: debug.h $(PARAMS_H) sel-sched.h
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 28f3d7f..eae7d4f 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6390,6 +6390,29 @@ Deallocate internal data in target scheduling context pointed to by @var{tc}.
Deallocate a store for target scheduling context pointed to by @var{tc}.
@end deftypefn
+@deftypefn {Target Hook} void * TARGET_SCHED_ALLOC_SCHED_CONTEXT (void)
+Return a pointer to a store large enough to hold target scheduling context.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_SCHED_INIT_SCHED_CONTEXT (void *@var{tc}, bool @var{clean_p})
+Initialize store pointed to by @var{tc} to hold target scheduling context.
+If @var{clean_p} is true, initialize @var{tc} as if the scheduler were at
+the beginning of the block.  Otherwise, make a copy of the current context
+in @var{tc}.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_SCHED_SET_SCHED_CONTEXT (void *@var{tc})
+Copy the target scheduling context pointed to by @var{tc} to the current
+context.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_SCHED_CLEAR_SCHED_CONTEXT (void *@var{tc})
+Deallocate internal data in target scheduling context pointed to by @var{tc}.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_SCHED_FREE_SCHED_CONTEXT (void *@var{tc})
+Deallocate a store for target scheduling context pointed to by @var{tc}.
+@end deftypefn
+
@deftypefn {Target Hook} int TARGET_SCHED_SPECULATE_INSN (rtx @var{insn}, int @var{request}, rtx *@var{new_pat})
This hook is called by the insn scheduler when @var{insn} has only
speculative dependencies and therefore can be scheduled speculatively.
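
The five context hooks documented above let a target save and restore
whatever scheduling state it keeps beyond the generic scheduler's own (the
ia64 changes in this patch save the port's DFA state, for instance).  A
minimal sketch of a port implementing them, with a toy context structure;
every name below is a placeholder, not the ia64 implementation:

#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

/* Hypothetical per-target scheduling context.  */
struct toy_sched_context
{
  void *dfa_state;              /* snapshot of the DFA automaton state */
  int cycle;                    /* other target-specific bookkeeping */
};

/* Assumed to be maintained elsewhere by the port.  */
static size_t dfa_state_size;
static void *curr_dfa_state;
static int curr_cycle;

static void *
toy_alloc_sched_context (void)
{
  return malloc (sizeof (struct toy_sched_context));
}

static void
toy_init_sched_context (void *tc, bool clean_p)
{
  struct toy_sched_context *ctx = (struct toy_sched_context *) tc;

  ctx->dfa_state = malloc (dfa_state_size);
  if (clean_p)
    {
      /* As if the scheduler were at the start of the block.  */
      memset (ctx->dfa_state, 0, dfa_state_size);
      ctx->cycle = 0;
    }
  else
    {
      /* Make a copy of the current context.  */
      memcpy (ctx->dfa_state, curr_dfa_state, dfa_state_size);
      ctx->cycle = curr_cycle;
    }
}

static void
toy_set_sched_context (void *tc)
{
  struct toy_sched_context *ctx = (struct toy_sched_context *) tc;

  memcpy (curr_dfa_state, ctx->dfa_state, dfa_state_size);
  curr_cycle = ctx->cycle;
}

static void
toy_clear_sched_context (void *tc)
{
  free (((struct toy_sched_context *) tc)->dfa_state);
}

static void
toy_free_sched_context (void *tc)
{
  free (tc);
}

A port would then point TARGET_SCHED_ALLOC_SCHED_CONTEXT and the other four
macros at these functions in its target definition.
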
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 46640c4..09dc233 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -3949,7 +3949,7 @@ create_check_block_twin (rtx insn, bool mutate_p)
todo_spec &= SPECULATIVE;
/* Create recovery block. */
- if (mutate_p || targetm.sched.needs_block_p (insn))
+ if (mutate_p || targetm.sched.needs_block_p (todo_spec))
{
rec = sched_create_recovery_block (NULL);
label = BB_HEAD (rec);
@@ -3961,7 +3961,7 @@ create_check_block_twin (rtx insn, bool mutate_p)
}
/* Emit CHECK. */
- check = targetm.sched.gen_spec_check (insn, label, mutate_p);
+ check = targetm.sched.gen_spec_check (insn, label, todo_spec);
if (rec != EXIT_BLOCK_PTR)
{
diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index 498be32..9736e34 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -1817,7 +1817,7 @@ create_speculation_check (expr_t c_expr, ds_t check_ds, insn_t orig_insn)
/* Create a recovery block if target is going to emit branchy check, or if
ORIG_INSN was speculative already. */
- if (targetm.sched.needs_block_p (EXPR_INSN_RTX (c_expr))
+ if (targetm.sched.needs_block_p (check_ds)
|| EXPR_SPEC_DONE_DS (INSN_EXPR (orig_insn)) != 0)
{
recovery_block = sel_create_recovery_block (orig_insn);
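
Both scheduler hunks above follow the same calling convention, which the
target.h change below makes explicit: the hooks now receive the speculation
status instead of the check insn, so the target can say whether a recovery
block is needed before one is created.  A condensed sketch of the caller
side (the helper name is hypothetical; the individual calls are the ones
visible in the hunks above):

/* Emit a speculation check for INSN whose remaining speculative
   dependencies are described by DS.  */
static rtx
emit_speculation_check (rtx insn, ds_t ds)
{
  rtx label = NULL_RTX;

  if (targetm.sched.needs_block_p (ds))
    {
      /* Branchy check: create the recovery block first and hand its
         label to the target.  */
      basic_block rec = sched_create_recovery_block (NULL);
      label = BB_HEAD (rec);
    }

  /* With LABEL == NULL_RTX the target emits a simple (ld.c-style) check;
     otherwise a branchy one targeting the recovery block.  */
  return targetm.sched.gen_spec_check (insn, label, ds);
}
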
diff --git a/gcc/target.h b/gcc/target.h
index a7d601f..21c46ad 100644
--- a/gcc/target.h
+++ b/gcc/target.h
@@ -397,7 +397,7 @@ struct gcc_target
/* The following member value is a pointer to a function called
by the insn scheduler. It should return true if the check instruction
passed as the parameter needs a recovery block. */
- bool (* needs_block_p) (const_rtx);
+ bool (* needs_block_p) (int);
/* The following member value is a pointer to a function called
by the insn scheduler. It should return a pattern for the check
@@ -407,7 +407,7 @@ struct gcc_target
simple check). If the mutation of the check is requested (e.g. from
ld.c to chk.a), the third parameter is true - in this case the first
parameter is the previous check. */
- rtx (* gen_spec_check) (rtx, rtx, bool);
+ rtx (* gen_spec_check) (rtx, rtx, int);
/* The following member value is a pointer to a function controlling
what insns from the ready insn queue will be considered for the