author Richard Henderson <rth@cygnus.com> 1999-10-02 11:07:49 -0700
committer Richard Henderson <rth@gcc.gnu.org> 1999-10-02 11:07:49 -0700
commit f3a3d0d39fab02f3b8d2206a5f46e5eb2f8e9b05
tree aed41d502772f3f65f5bb0758a246d5a4b1baba9
parent ffab8d8591430b98b70001d968784f7014ecd1b9
* md.texi (define_peephole2): New section.
From-SVN: r29772
-rw-r--r-- gcc/ChangeLog 4
-rw-r--r-- gcc/md.texi 527
2 files changed, 319 insertions, 212 deletions
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 901ac95..9af4018 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+Sat Oct 2 11:06:31 1999 Richard Henderson <rth@cygnus.com>
+
+ * md.texi (define_peephole2): New section.
+
Sat Oct 2 10:57:56 1999 Jan Hubicka <hubicka@freesoft.cz>
* i386.md (mov?i patterns): Fix handling of TARGET_USE_MOV0
diff --git a/gcc/md.texi b/gcc/md.texi
index bb7aeac..8242ea0 100644
--- a/gcc/md.texi
+++ b/gcc/md.texi
@@ -32,10 +32,10 @@ See the next chapter for information on the C header file.
* Dependent Patterns:: Having one pattern may make you need another.
* Jump Patterns:: Special considerations for patterns for jump insns.
* Insn Canonicalizations::Canonicalization of Instructions
-* Peephole Definitions::Defining machine-specific peephole optimizations.
* Expander Definitions::Generating a sequence of several RTL insns
- for a standard operation.
-* Insn Splitting:: Splitting Instructions into Multiple Instructions
+ for a standard operation.
+* Insn Splitting:: Splitting Instructions into Multiple Instructions.
+* Peephole Definitions::Defining machine-specific peephole optimizations.
* Insn Attributes:: Specifying the value of attributes for generated insns.
@end menu
@@ -2907,210 +2907,6 @@ will be written using @code{zero_extract} rather than the equivalent
@end itemize
-@node Peephole Definitions
-@section Machine-Specific Peephole Optimizers
-@cindex peephole optimizer definitions
-@cindex defining peephole optimizers
-
-In addition to instruction patterns the @file{md} file may contain
-definitions of machine-specific peephole optimizations.
-
-The combiner does not notice certain peephole optimizations when the data
-flow in the program does not suggest that it should try them. For example,
-sometimes two consecutive insns related in purpose can be combined even
-though the second one does not appear to use a register computed in the
-first one. A machine-specific peephole optimizer can detect such
-opportunities.
-
-@need 1000
-A definition looks like this:
-
-@smallexample
-(define_peephole
- [@var{insn-pattern-1}
- @var{insn-pattern-2}
- @dots{}]
- "@var{condition}"
- "@var{template}"
- "@var{optional insn-attributes}")
-@end smallexample
-
-@noindent
-The last string operand may be omitted if you are not using any
-machine-specific information in this machine description. If present,
-it must obey the same rules as in a @code{define_insn}.
-
-In this skeleton, @var{insn-pattern-1} and so on are patterns to match
-consecutive insns. The optimization applies to a sequence of insns when
-@var{insn-pattern-1} matches the first one, @var{insn-pattern-2} matches
-the next, and so on.@refill
-
-Each of the insns matched by a peephole must also match a
-@code{define_insn}. Peepholes are checked only at the last stage just
-before code generation, and only optionally. Therefore, any insn which
-would match a peephole but no @code{define_insn} will cause a crash in code
-generation in an unoptimized compilation, or at various optimization
-stages.
-
-The operands of the insns are matched with @code{match_operands},
-@code{match_operator}, and @code{match_dup}, as usual. What is not
-usual is that the operand numbers apply to all the insn patterns in the
-definition. So, you can check for identical operands in two insns by
-using @code{match_operand} in one insn and @code{match_dup} in the
-other.
-
-The operand constraints used in @code{match_operand} patterns do not have
-any direct effect on the applicability of the peephole, but they will
-be validated afterward, so make sure your constraints are general enough
-to apply whenever the peephole matches. If the peephole matches
-but the constraints are not satisfied, the compiler will crash.
-
-It is safe to omit constraints in all the operands of the peephole; or
-you can write constraints which serve as a double-check on the criteria
-previously tested.
-
-Once a sequence of insns matches the patterns, the @var{condition} is
-checked. This is a C expression which makes the final decision whether to
-perform the optimization (we do so if the expression is nonzero). If
-@var{condition} is omitted (in other words, the string is empty) then the
-optimization is applied to every sequence of insns that matches the
-patterns.
-
-The defined peephole optimizations are applied after register allocation
-is complete. Therefore, the peephole definition can check which
-operands have ended up in which kinds of registers, just by looking at
-the operands.
-
-@findex prev_active_insn
-The way to refer to the operands in @var{condition} is to write
-@code{operands[@var{i}]} for operand number @var{i} (as matched by
-@code{(match_operand @var{i} @dots{})}). Use the variable @code{insn}
-to refer to the last of the insns being matched; use
-@code{prev_active_insn} to find the preceding insns.
-
-@findex dead_or_set_p
-When optimizing computations with intermediate results, you can use
-@var{condition} to match only when the intermediate results are not used
-elsewhere. Use the C expression @code{dead_or_set_p (@var{insn},
-@var{op})}, where @var{insn} is the insn in which you expect the value
-to be used for the last time (from the value of @code{insn}, together
-with use of @code{prev_nonnote_insn}), and @var{op} is the intermediate
-value (from @code{operands[@var{i}]}).@refill
-
-Applying the optimization means replacing the sequence of insns with one
-new insn. The @var{template} controls ultimate output of assembler code
-for this combined insn. It works exactly like the template of a
-@code{define_insn}. Operand numbers in this template are the same ones
-used in matching the original sequence of insns.
-
-The result of a defined peephole optimizer does not need to match any of
-the insn patterns in the machine description; it does not even have an
-opportunity to match them. The peephole optimizer definition itself serves
-as the insn pattern to control how the insn is output.
-
-Defined peephole optimizers are run as assembler code is being output,
-so the insns they produce are never combined or rearranged in any way.
-
-Here is an example, taken from the 68000 machine description:
-
-@smallexample
-(define_peephole
- [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4)))
- (set (match_operand:DF 0 "register_operand" "=f")
- (match_operand:DF 1 "register_operand" "ad"))]
- "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
- "*
-@{
- rtx xoperands[2];
- xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1);
-#ifdef MOTOROLA
- output_asm_insn (\"move.l %1,(sp)\", xoperands);
- output_asm_insn (\"move.l %1,-(sp)\", operands);
- return \"fmove.d (sp)+,%0\";
-#else
- output_asm_insn (\"movel %1,sp@@\", xoperands);
- output_asm_insn (\"movel %1,sp@@-\", operands);
- return \"fmoved sp@@+,%0\";
-#endif
-@}
-")
-@end smallexample
-
-@need 1000
-The effect of this optimization is to change
-
-@smallexample
-@group
-jbsr _foobar
-addql #4,sp
-movel d1,sp@@-
-movel d0,sp@@-
-fmoved sp@@+,fp0
-@end group
-@end smallexample
-
-@noindent
-into
-
-@smallexample
-@group
-jbsr _foobar
-movel d1,sp@@
-movel d0,sp@@-
-fmoved sp@@+,fp0
-@end group
-@end smallexample
-
-@ignore
-@findex CC_REVERSED
-If a peephole matches a sequence including one or more jump insns, you must
-take account of the flags such as @code{CC_REVERSED} which specify that the
-condition codes are represented in an unusual manner. The compiler
-automatically alters any ordinary conditional jumps which occur in such
-situations, but the compiler cannot alter jumps which have been replaced by
-peephole optimizations. So it is up to you to alter the assembler code
-that the peephole produces. Supply C code to write the assembler output,
-and in this C code check the condition code status flags and change the
-assembler code as appropriate.
-@end ignore
-
-@var{insn-pattern-1} and so on look @emph{almost} like the second
-operand of @code{define_insn}. There is one important difference: the
-second operand of @code{define_insn} consists of one or more RTX's
-enclosed in square brackets. Usually, there is only one: then the same
-action can be written as an element of a @code{define_peephole}. But
-when there are multiple actions in a @code{define_insn}, they are
-implicitly enclosed in a @code{parallel}. Then you must explicitly
-write the @code{parallel}, and the square brackets within it, in the
-@code{define_peephole}. Thus, if an insn pattern looks like this,
-
-@smallexample
-(define_insn "divmodsi4"
- [(set (match_operand:SI 0 "general_operand" "=d")
- (div:SI (match_operand:SI 1 "general_operand" "0")
- (match_operand:SI 2 "general_operand" "dmsK")))
- (set (match_operand:SI 3 "general_operand" "=d")
- (mod:SI (match_dup 1) (match_dup 2)))]
- "TARGET_68020"
- "divsl%.l %2,%3:%0")
-@end smallexample
-
-@noindent
-then the way to mention this insn in a peephole is as follows:
-
-@smallexample
-(define_peephole
- [@dots{}
- (parallel
- [(set (match_operand:SI 0 "general_operand" "=d")
- (div:SI (match_operand:SI 1 "general_operand" "0")
- (match_operand:SI 2 "general_operand" "dmsK")))
- (set (match_operand:SI 3 "general_operand" "=d")
- (mod:SI (match_dup 1) (match_dup 2)))])
- @dots{}]
- @dots{})
-@end smallexample
-
@node Expander Definitions
@section Defining RTL Sequences for Code Generation
@cindex expander definitions
@@ -3134,11 +2930,10 @@ A @code{define_expand} RTX has four operands:
The name. Each @code{define_expand} must have a name, since the only
use for it is to refer to it by name.
-@findex define_peephole
@item
-The RTL template. This is just like the RTL template for a
-@code{define_peephole} in that it is a vector of RTL expressions
-each being one insn.
+The RTL template. This is a vector of RTL expressions representing
+a sequence of separate instructions. Unlike @code{define_insn}, there
+is no implicit surrounding @code{PARALLEL}.
@item
The condition, a string containing a C expression. This expression is
@@ -3333,7 +3128,7 @@ subexpression. However, in some other cases, such as performing an
addition of a large constant in two insns on a RISC machine, the way to
split the addition into two insns is machine-dependent.
-@cindex define_split
+@findex define_split
The @code{define_split} definition tells the compiler how to split a
complex insn into several simpler insns. It looks like this:
@@ -3466,6 +3261,314 @@ insns that don't. Instead, write two separate @code{define_split}
definitions, one for the insns that are valid and one for the insns that
are not valid.
+@node Peephole Definitions
+@section Machine-Specific Peephole Optimizers
+@cindex peephole optimizer definitions
+@cindex defining peephole optimizers
+
+In addition to instruction patterns the @file{md} file may contain
+definitions of machine-specific peephole optimizations.
+
+The combiner does not notice certain peephole optimizations when the data
+flow in the program does not suggest that it should try them. For example,
+sometimes two consecutive insns related in purpose can be combined even
+though the second one does not appear to use a register computed in the
+first one. A machine-specific peephole optimizer can detect such
+opportunities.
+
+There are two forms of peephole definitions that may be used. The
+original @code{define_peephole} is run at assembly output time to
+match insns and substitute assembly text. Use of @code{define_peephole}
+is deprecated.
+
+A newer @code{define_peephole2} matches insns and substitutes new
+insns. The @code{peephole2} pass is run after register allocation
+but before scheduling, which may result in much better code for
+targets that do scheduling.
+
+@menu
+* define_peephole:: RTL to Text Peephole Optimizers
+* define_peephole2:: RTL to RTL Peephole Optimizers
+@end menu
+
+@node define_peephole
+@subsection RTL to Text Peephole Optimizers
+@findex define_peephole
+
+@need 1000
+A definition looks like this:
+
+@smallexample
+(define_peephole
+ [@var{insn-pattern-1}
+ @var{insn-pattern-2}
+ @dots{}]
+ "@var{condition}"
+ "@var{template}"
+ "@var{optional insn-attributes}")
+@end smallexample
+
+@noindent
+The last string operand may be omitted if you are not using any
+machine-specific information in this machine description. If present,
+it must obey the same rules as in a @code{define_insn}.
+
+In this skeleton, @var{insn-pattern-1} and so on are patterns to match
+consecutive insns. The optimization applies to a sequence of insns when
+@var{insn-pattern-1} matches the first one, @var{insn-pattern-2} matches
+the next, and so on.@refill
+
+Each of the insns matched by a peephole must also match a
+@code{define_insn}. Peepholes are checked only at the last stage just
+before code generation, and only optionally. Therefore, any insn which
+would match a peephole but no @code{define_insn} will cause a crash in code
+generation in an unoptimized compilation, or at various optimization
+stages.
+
+The operands of the insns are matched with @code{match_operand},
+@code{match_operator}, and @code{match_dup}, as usual. What is not
+usual is that the operand numbers apply to all the insn patterns in the
+definition. So, you can check for identical operands in two insns by
+using @code{match_operand} in one insn and @code{match_dup} in the
+other.
+
+The operand constraints used in @code{match_operand} patterns do not have
+any direct effect on the applicability of the peephole, but they will
+be validated afterward, so make sure your constraints are general enough
+to apply whenever the peephole matches. If the peephole matches
+but the constraints are not satisfied, the compiler will crash.
+
+It is safe to omit constraints in all the operands of the peephole; or
+you can write constraints which serve as a double-check on the criteria
+previously tested.
+
+Once a sequence of insns matches the patterns, the @var{condition} is
+checked. This is a C expression which makes the final decision whether to
+perform the optimization (we do so if the expression is nonzero). If
+@var{condition} is omitted (in other words, the string is empty) then the
+optimization is applied to every sequence of insns that matches the
+patterns.
+
+The defined peephole optimizations are applied after register allocation
+is complete. Therefore, the peephole definition can check which
+operands have ended up in which kinds of registers, just by looking at
+the operands.
+
+@findex prev_active_insn
+The way to refer to the operands in @var{condition} is to write
+@code{operands[@var{i}]} for operand number @var{i} (as matched by
+@code{(match_operand @var{i} @dots{})}). Use the variable @code{insn}
+to refer to the last of the insns being matched; use
+@code{prev_active_insn} to find the preceding insns.
+
+@findex dead_or_set_p
+When optimizing computations with intermediate results, you can use
+@var{condition} to match only when the intermediate results are not used
+elsewhere. Use the C expression @code{dead_or_set_p (@var{insn},
+@var{op})}, where @var{insn} is the insn in which you expect the value
+to be used for the last time (from the value of @code{insn}, together
+with use of @code{prev_nonnote_insn}), and @var{op} is the intermediate
+value (from @code{operands[@var{i}]}).@refill
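+
+As a minimal sketch (a hypothetical pattern, not taken from any real
+machine description), the following peephole folds a load into the
+addition that consumes it, but only when the loaded register is not
+needed afterward. The @code{add} template and the operand predicates
+are assumptions made for illustration, and the constraints are left
+empty:
+
+@smallexample
+(define_peephole
+  [(set (match_operand:SI 0 "register_operand" "")
+        (match_operand:SI 1 "memory_operand" ""))
+   (set (match_operand:SI 2 "register_operand" "")
+        (plus:SI (match_dup 2) (match_dup 0)))]
+  "REGNO (operands[0]) != REGNO (operands[2])
+   && dead_or_set_p (insn, operands[0])"
+  "add %1,%2")
+@end smallexample
+
+@noindent
+Here @code{insn} is the second insn of the sequence, so the condition
+holds only if operand 0 dies there. The @code{REGNO} test rejects the
+case where the loaded register is also the destination of the addition.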
+
+Applying the optimization means replacing the sequence of insns with one
+new insn. The @var{template} controls ultimate output of assembler code
+for this combined insn. It works exactly like the template of a
+@code{define_insn}. Operand numbers in this template are the same ones
+used in matching the original sequence of insns.
+
+The result of a defined peephole optimizer does not need to match any of
+the insn patterns in the machine description; it does not even have an
+opportunity to match them. The peephole optimizer definition itself serves
+as the insn pattern to control how the insn is output.
+
+Defined peephole optimizers are run as assembler code is being output,
+so the insns they produce are never combined or rearranged in any way.
+
+Here is an example, taken from the 68000 machine description:
+
+@smallexample
+(define_peephole
+ [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4)))
+ (set (match_operand:DF 0 "register_operand" "=f")
+ (match_operand:DF 1 "register_operand" "ad"))]
+ "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
+ "*
+@{
+ rtx xoperands[2];
+ xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1);
+#ifdef MOTOROLA
+ output_asm_insn (\"move.l %1,(sp)\", xoperands);
+ output_asm_insn (\"move.l %1,-(sp)\", operands);
+ return \"fmove.d (sp)+,%0\";
+#else
+ output_asm_insn (\"movel %1,sp@@\", xoperands);
+ output_asm_insn (\"movel %1,sp@@-\", operands);
+ return \"fmoved sp@@+,%0\";
+#endif
+@}
+")
+@end smallexample
+
+@need 1000
+The effect of this optimization is to change
+
+@smallexample
+@group
+jbsr _foobar
+addql #4,sp
+movel d1,sp@@-
+movel d0,sp@@-
+fmoved sp@@+,fp0
+@end group
+@end smallexample
+
+@noindent
+into
+
+@smallexample
+@group
+jbsr _foobar
+movel d1,sp@@
+movel d0,sp@@-
+fmoved sp@@+,fp0
+@end group
+@end smallexample
+
+@ignore
+@findex CC_REVERSED
+If a peephole matches a sequence including one or more jump insns, you must
+take account of the flags such as @code{CC_REVERSED} which specify that the
+condition codes are represented in an unusual manner. The compiler
+automatically alters any ordinary conditional jumps which occur in such
+situations, but the compiler cannot alter jumps which have been replaced by
+peephole optimizations. So it is up to you to alter the assembler code
+that the peephole produces. Supply C code to write the assembler output,
+and in this C code check the condition code status flags and change the
+assembler code as appropriate.
+@end ignore
+
+@var{insn-pattern-1} and so on look @emph{almost} like the second
+operand of @code{define_insn}. There is one important difference: the
+second operand of @code{define_insn} consists of one or more RTX's
+enclosed in square brackets. Usually, there is only one: then the same
+action can be written as an element of a @code{define_peephole}. But
+when there are multiple actions in a @code{define_insn}, they are
+implicitly enclosed in a @code{parallel}. Then you must explicitly
+write the @code{parallel}, and the square brackets within it, in the
+@code{define_peephole}. Thus, if an insn pattern looks like this,
+
+@smallexample
+(define_insn "divmodsi4"
+ [(set (match_operand:SI 0 "general_operand" "=d")
+ (div:SI (match_operand:SI 1 "general_operand" "0")
+ (match_operand:SI 2 "general_operand" "dmsK")))
+ (set (match_operand:SI 3 "general_operand" "=d")
+ (mod:SI (match_dup 1) (match_dup 2)))]
+ "TARGET_68020"
+ "divsl%.l %2,%3:%0")
+@end smallexample
+
+@noindent
+then the way to mention this insn in a peephole is as follows:
+
+@smallexample
+(define_peephole
+ [@dots{}
+ (parallel
+ [(set (match_operand:SI 0 "general_operand" "=d")
+ (div:SI (match_operand:SI 1 "general_operand" "0")
+ (match_operand:SI 2 "general_operand" "dmsK")))
+ (set (match_operand:SI 3 "general_operand" "=d")
+ (mod:SI (match_dup 1) (match_dup 2)))])
+ @dots{}]
+ @dots{})
+@end smallexample
+
+@node define_peephole2
+@subsection RTL to RTL Peephole Optimizers
+@findex define_peephole2
+
+The @code{define_peephole2} definition tells the compiler how to
+substitute one sequence of instructions for another sequence,
+what additional scratch registers may be needed and what their
+lifetimes must be.
+
+@smallexample
+(define_peephole2
+ [@var{insn-pattern-1}
+ @var{insn-pattern-2}
+ @dots{}]
+ "@var{condition}"
+ [@var{new-insn-pattern-1}
+ @var{new-insn-pattern-2}
+ @dots{}]
+ "@var{preparation statements}")
+@end smallexample
+
+The definition is almost identical to @code{define_split}
+(@pxref{Insn Splitting}) except that the pattern to match is not a
+single instruction, but a sequence of instructions.
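+
+As a minimal sketch of the simplest case, with no scratch registers,
+the following hypothetical pattern merges two consecutive additions of
+constants to the same register. It assumes a target whose
+@code{SImode} add insn needs no clobbers and accepts the summed
+constant; a real port would have to check both:
+
+@smallexample
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand" "")
+        (plus:SI (match_dup 0)
+                 (match_operand:SI 1 "const_int_operand" "")))
+   (set (match_dup 0)
+        (plus:SI (match_dup 0)
+                 (match_operand:SI 2 "const_int_operand" "")))]
+  ""
+  [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 3)))]
+  "operands[3] = GEN_INT (INTVAL (operands[1]) + INTVAL (operands[2]));")
+@end smallexample
+
+@noindent
+As in @code{define_split}, the preparation statements may assign new
+entries of @code{operands} for use in the output template.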
+
+It is possible to request additional scratch registers for use in the
+output template. If appropriate registers are not free, the pattern
+will simply not match.
+
+@findex match_scratch
+@findex match_dup
+Scratch registers are requested with a @code{match_scratch} pattern at
+the top level of the input pattern. The allocated register will (initially)
+be dead at the point requested within the original sequence. If the scratch
+is used at more than a single point, a @code{match_dup} pattern at the
+top level of the input pattern marks the last position in the input sequence
+at which the register must be available.
+
+Here is an example from the IA-32 machine description:
+
+@smallexample
+(define_peephole2
+ [(match_scratch:SI 2 "r")
+ (parallel [(set (match_operand:SI 0 "register_operand" "")
+ (match_operator:SI 3 "arith_or_logical_operator"
+ [(match_dup 0)
+ (match_operand:SI 1 "memory_operand" "")]))
+ (clobber (reg:CC 17))])]
+ "! optimize_size && ! TARGET_READ_MODIFY"
+ [(set (match_dup 2) (match_dup 1))
+ (parallel [(set (match_dup 0)
+ (match_op_dup 3 [(match_dup 0) (match_dup 2)]))
+ (clobber (reg:CC 17))])]
+ "")
+@end smallexample
+
+@noindent
+This pattern tries to split a load from its use in the hopes that we'll be
+able to schedule around the memory load latency. It allocates a single
+@code{SImode} register of class @code{GENERAL_REGS} (@code{"r"}) that needs
+to be live only at the point just before the arithmetic.
+
+A real example requiring extended scratch lifetimes is harder to come by,
+so here's a silly made-up example:
+
+@smallexample
+(define_peephole2
+ [(match_scratch:SI 4 "r")
+ (set (match_operand:SI 0 "" "") (match_operand:SI 1 "" ""))
+ (set (match_operand:SI 2 "" "") (match_dup 1))
+ (match_dup 4)
+ (set (match_operand:SI 3 "" "") (match_dup 1))]
+ "@var{determine 1 does not overlap 0 and 2}"
+ [(set (match_dup 4) (match_dup 1))
+ (set (match_dup 0) (match_dup 4))
+ (set (match_dup 2) (match_dup 4))
+ (set (match_dup 3) (match_dup 4))]
+ "")
+@end smallexample
+
+@noindent
+If we had not added the @code{(match_dup 4)} in the middle of the input
+sequence, it might have been the case that the register we chose at the
+beginning of the sequence is killed by the first or second @code{set}.
+
@node Insn Attributes
@section Instruction Attributes
@cindex insn attributes