aboutsummaryrefslogtreecommitdiff
path: root/gcc/doc/tree-ssa.texi
diff options
context:
space:
mode:
Diffstat (limited to 'gcc/doc/tree-ssa.texi')
-rw-r--r--gcc/doc/tree-ssa.texi722
1 files changed, 11 insertions, 711 deletions
diff --git a/gcc/doc/tree-ssa.texi b/gcc/doc/tree-ssa.texi
index ea3d0ab..bd0edc4 100644
--- a/gcc/doc/tree-ssa.texi
+++ b/gcc/doc/tree-ssa.texi
@@ -8,7 +8,7 @@
@c ---------------------------------------------------------------------
@node Tree SSA
-@chapter Analysis and Optimization of GIMPLE Trees
+@chapter Analysis and Optimization of GIMPLE tuples
@cindex Tree SSA
@cindex Optimization infrastructure for GIMPLE
@@ -37,728 +37,28 @@ functions and programming constructs needed to implement optimization
passes for GIMPLE@.
@menu
-* GENERIC:: A high-level language-independent representation.
-* GIMPLE:: A lower-level factored tree representation.
-* Annotations:: Attributes for statements and variables.
-* Statement Operands:: Variables referenced by GIMPLE statements.
+* Annotations:: Attributes for variables.
+* SSA Operands:: SSA names referenced by GIMPLE statements.
* SSA:: Static Single Assignment representation.
* Alias analysis:: Representing aliased loads and stores.
@end menu
-@node GENERIC
-@section GENERIC
-@cindex GENERIC
-
-The purpose of GENERIC is simply to provide a language-independent way of
-representing an entire function in trees. To this end, it was necessary to
-add a few new tree codes to the back end, but most everything was already
-there. If you can express it with the codes in @code{gcc/tree.def}, it's
-GENERIC@.
-
-Early on, there was a great deal of debate about how to think about
-statements in a tree IL@. In GENERIC, a statement is defined as any
-expression whose value, if any, is ignored. A statement will always
-have @code{TREE_SIDE_EFFECTS} set (or it will be discarded), but a
-non-statement expression may also have side effects. A
-@code{CALL_EXPR}, for instance.
-
-It would be possible for some local optimizations to work on the
-GENERIC form of a function; indeed, the adapted tree inliner works
-fine on GENERIC, but the current compiler performs inlining after
-lowering to GIMPLE (a restricted form described in the next section).
-Indeed, currently the frontends perform this lowering before handing
-off to @code{tree_rest_of_compilation}, but this seems inelegant.
-
-If necessary, a front end can use some language-dependent tree codes
-in its GENERIC representation, so long as it provides a hook for
-converting them to GIMPLE and doesn't expect them to work with any
-(hypothetical) optimizers that run before the conversion to GIMPLE@.
-The intermediate representation used while parsing C and C++ looks
-very little like GENERIC, but the C and C++ gimplifier hooks are
-perfectly happy to take it as input and spit out GIMPLE@.
-
-@node GIMPLE
-@section GIMPLE
-@cindex GIMPLE
-
-GIMPLE is a simplified subset of GENERIC for use in optimization. The
-particular subset chosen (and the name) was heavily influenced by the
-SIMPLE IL used by the McCAT compiler project at McGill University,
-though we have made some different choices. For one thing, SIMPLE
-doesn't support @code{goto}; a production compiler can't afford that
-kind of restriction.
-
-GIMPLE retains much of the structure of the parse trees: lexical
-scopes are represented as containers, rather than markers. However,
-expressions are broken down into a 3-address form, using temporary
-variables to hold intermediate values. Also, control structures are
-lowered to gotos.
-
-In GIMPLE no container node is ever used for its value; if a
-@code{COND_EXPR} or @code{BIND_EXPR} has a value, it is stored into a
-temporary within the controlled blocks, and that temporary is used in
-place of the container.
-
-The compiler pass which lowers GENERIC to GIMPLE is referred to as the
-@samp{gimplifier}. The gimplifier works recursively, replacing complex
-statements with sequences of simple statements.
-
-@c Currently, the only way to
-@c tell whether or not an expression is in GIMPLE form is by recursively
-@c examining it; in the future there will probably be a flag to help avoid
-@c redundant work. FIXME FIXME
-
-@menu
-* Interfaces::
-* Temporaries::
-* GIMPLE Expressions::
-* Statements::
-* GIMPLE Example::
-* Rough GIMPLE Grammar::
-@end menu
-
-@node Interfaces
-@subsection Interfaces
-@cindex gimplification
-
-The tree representation of a function is stored in
-@code{DECL_SAVED_TREE}. It is lowered to GIMPLE by a call to
-@code{gimplify_function_tree}.
-
-If a front end wants to include language-specific tree codes in the tree
-representation which it provides to the back end, it must provide a
-definition of @code{LANG_HOOKS_GIMPLIFY_EXPR} which knows how to
-convert the front end trees to GIMPLE@. Usually such a hook will involve
-much of the same code for expanding front end trees to RTL@. This function
-can return fully lowered GIMPLE, or it can return GENERIC trees and let the
-main gimplifier lower them the rest of the way; this is often simpler.
-GIMPLE that is not fully lowered is known as ``high GIMPLE'' and
-consists of the IL before the pass @code{pass_lower_cf}. High GIMPLE
-still contains lexical scopes and nested expressions, while low GIMPLE
-exposes all of the implicit jumps for control expressions like
-@code{COND_EXPR}.
-
-The C and C++ front ends currently convert directly from front end
-trees to GIMPLE, and hand that off to the back end rather than first
-converting to GENERIC@. Their gimplifier hooks know about all the
-@code{_STMT} nodes and how to convert them to GENERIC forms. There
-was some work done on a genericization pass which would run first, but
-the existence of @code{STMT_EXPR} meant that in order to convert all
-of the C statements into GENERIC equivalents would involve walking the
-entire tree anyway, so it was simpler to lower all the way. This
-might change in the future if someone writes an optimization pass
-which would work better with higher-level trees, but currently the
-optimizers all expect GIMPLE@.
-
-A front end which wants to use the tree optimizers (and already has
-some sort of whole-function tree representation) only needs to provide
-a definition of @code{LANG_HOOKS_GIMPLIFY_EXPR}, call
-@code{gimplify_function_tree} to lower to GIMPLE, and then hand off to
-@code{tree_rest_of_compilation} to compile and output the function.
-
-You can tell the compiler to dump a C-like representation of the GIMPLE
-form with the flag @option{-fdump-tree-gimple}.
-
-@node Temporaries
-@subsection Temporaries
-@cindex Temporaries
-
-When gimplification encounters a subexpression which is too complex, it
-creates a new temporary variable to hold the value of the subexpression,
-and adds a new statement to initialize it before the current statement.
-These special temporaries are known as @samp{expression temporaries}, and are
-allocated using @code{get_formal_tmp_var}. The compiler tries to
-always evaluate identical expressions into the same temporary, to simplify
-elimination of redundant calculations.
-
-We can only use expression temporaries when we know that it will not be
-reevaluated before its value is used, and that it will not be otherwise
-modified@footnote{These restrictions are derived from those in Morgan 4.8.}.
-Other temporaries can be allocated using
-@code{get_initialized_tmp_var} or @code{create_tmp_var}.
-
-Currently, an expression like @code{a = b + 5} is not reduced any
-further. We tried converting it to something like
-@smallexample
- T1 = b + 5;
- a = T1;
-@end smallexample
-but this bloated the representation for minimal benefit. However, a
-variable which must live in memory cannot appear in an expression; its
-value is explicitly loaded into a temporary first. Similarly, storing
-the value of an expression to a memory variable goes through a
-temporary.
-
-@node GIMPLE Expressions
-@subsection Expressions
-@cindex GIMPLE Expressions
-
-In general, expressions in GIMPLE consist of an operation and the
-appropriate number of simple operands; these operands must either be a
-GIMPLE rvalue (@code{is_gimple_val}), i.e.@: a constant or a register
-variable. More complex operands are factored out into temporaries, so
-that
-@smallexample
- a = b + c + d
-@end smallexample
-becomes
-@smallexample
- T1 = b + c;
- a = T1 + d;
-@end smallexample
-
-The same rule holds for arguments to a @code{CALL_EXPR}.
-
-The target of an assignment is usually a variable, but can also be an
-@code{INDIRECT_REF} or a compound lvalue as described below.
-
-@menu
-* Compound Expressions::
-* Compound Lvalues::
-* Conditional Expressions::
-* Logical Operators::
-@end menu
-
-@node Compound Expressions
-@subsubsection Compound Expressions
-@cindex Compound Expressions
-
-The left-hand side of a C comma expression is simply moved into a separate
-statement.
-
-@node Compound Lvalues
-@subsubsection Compound Lvalues
-@cindex Compound Lvalues
-
-Currently compound lvalues involving array and structure field references
-are not broken down; an expression like @code{a.b[2] = 42} is not reduced
-any further (though complex array subscripts are). This restriction is a
-workaround for limitations in later optimizers; if we were to convert this
-to
-
-@smallexample
- T1 = &a.b;
- T1[2] = 42;
-@end smallexample
-
-alias analysis would not remember that the reference to @code{T1[2]} came
-by way of @code{a.b}, so it would think that the assignment could alias
-another member of @code{a}; this broke @code{struct-alias-1.c}. Future
-optimizer improvements may make this limitation unnecessary.
-
-@node Conditional Expressions
-@subsubsection Conditional Expressions
-@cindex Conditional Expressions
-
-A C @code{?:} expression is converted into an @code{if} statement with
-each branch assigning to the same temporary. So,
-
-@smallexample
- a = b ? c : d;
-@end smallexample
-becomes
-@smallexample
- if (b)
- T1 = c;
- else
- T1 = d;
- a = T1;
-@end smallexample
-
-Tree level if-conversion pass re-introduces @code{?:} expression, if appropriate.
-It is used to vectorize loops with conditions using vector conditional operations.
-
-Note that in GIMPLE, @code{if} statements are also represented using
-@code{COND_EXPR}, as described below.
-
-@node Logical Operators
-@subsubsection Logical Operators
-@cindex Logical Operators
-
-Except when they appear in the condition operand of a @code{COND_EXPR},
-logical `and' and `or' operators are simplified as follows:
-@code{a = b && c} becomes
-
-@smallexample
- T1 = (bool)b;
- if (T1)
- T1 = (bool)c;
- a = T1;
-@end smallexample
-
-Note that @code{T1} in this example cannot be an expression temporary,
-because it has two different assignments.
-
-@node Statements
-@subsection Statements
-@cindex Statements
-
-Most statements will be assignment statements, represented by
-@code{MODIFY_EXPR}. A @code{CALL_EXPR} whose value is ignored can
-also be a statement. No other C expressions can appear at statement level;
-a reference to a volatile object is converted into a @code{MODIFY_EXPR}.
-In GIMPLE form, type of @code{MODIFY_EXPR} is not meaningful. Instead, use type
-of LHS or RHS@.
-
-There are also several varieties of complex statements.
-
-@menu
-* Blocks::
-* Statement Sequences::
-* Empty Statements::
-* Loops::
-* Selection Statements::
-* Jumps::
-* Cleanups::
-* GIMPLE Exception Handling::
-@end menu
-
-@node Blocks
-@subsubsection Blocks
-@cindex Blocks
-
-Block scopes and the variables they declare in GENERIC and GIMPLE are
-expressed using the @code{BIND_EXPR} code, which in previous versions of
-GCC was primarily used for the C statement-expression extension.
-
-Variables in a block are collected into @code{BIND_EXPR_VARS} in
-declaration order. Any runtime initialization is moved out of
-@code{DECL_INITIAL} and into a statement in the controlled block. When
-gimplifying from C or C++, this initialization replaces the
-@code{DECL_STMT}.
-
-Variable-length arrays (VLAs) complicate this process, as their size often
-refers to variables initialized earlier in the block. To handle this, we
-currently split the block at that point, and move the VLA into a new, inner
-@code{BIND_EXPR}. This strategy may change in the future.
-
-@code{DECL_SAVED_TREE} for a GIMPLE function will always be a
-@code{BIND_EXPR} which contains declarations for the temporary variables
-used in the function.
-
-A C++ program will usually contain more @code{BIND_EXPR}s than there are
-syntactic blocks in the source code, since several C++ constructs have
-implicit scopes associated with them. On the other hand, although the C++
-front end uses pseudo-scopes to handle cleanups for objects with
-destructors, these don't translate into the GIMPLE form; multiple
-declarations at the same level use the same @code{BIND_EXPR}.
-
-@node Statement Sequences
-@subsubsection Statement Sequences
-@cindex Statement Sequences
-
-Multiple statements at the same nesting level are collected into a
-@code{STATEMENT_LIST}. Statement lists are modified and traversed
-using the interface in @samp{tree-iterator.h}.
-
-@node Empty Statements
-@subsubsection Empty Statements
-@cindex Empty Statements
-
-Whenever possible, statements with no effect are discarded. But if they
-are nested within another construct which cannot be discarded for some
-reason, they are instead replaced with an empty statement, generated by
-@code{build_empty_stmt}. Initially, all empty statements were shared,
-after the pattern of the Java front end, but this caused a lot of trouble in
-practice.
-
-An empty statement is represented as @code{(void)0}.
-
-@node Loops
-@subsubsection Loops
-@cindex Loops
-
-At one time loops were expressed in GIMPLE using @code{LOOP_EXPR}, but
-now they are lowered to explicit gotos.
-
-@node Selection Statements
-@subsubsection Selection Statements
-@cindex Selection Statements
-
-A simple selection statement, such as the C @code{if} statement, is
-expressed in GIMPLE using a void @code{COND_EXPR}. If only one branch is
-used, the other is filled with an empty statement.
-
-Normally, the condition expression is reduced to a simple comparison. If
-it is a shortcut (@code{&&} or @code{||}) expression, however, we try to
-break up the @code{if} into multiple @code{if}s so that the implied shortcut
-is taken directly, much like the transformation done by @code{do_jump} in
-the RTL expander.
-
-A @code{SWITCH_EXPR} in GIMPLE contains the condition and a
-@code{TREE_VEC} of @code{CASE_LABEL_EXPR}s describing the case values
-and corresponding @code{LABEL_DECL}s to jump to. The body of the
-@code{switch} is moved after the @code{SWITCH_EXPR}.
-
-@node Jumps
-@subsubsection Jumps
-@cindex Jumps
-
-Other jumps are expressed by either @code{GOTO_EXPR} or @code{RETURN_EXPR}.
-
-The operand of a @code{GOTO_EXPR} must be either a label or a variable
-containing the address to jump to.
-
-The operand of a @code{RETURN_EXPR} is either @code{NULL_TREE},
-@code{RESULT_DECL}, or a @code{MODIFY_EXPR} which sets the return value. It
-would be nice to move the @code{MODIFY_EXPR} into a separate statement, but the
-special return semantics in @code{expand_return} make that difficult. It may
-still happen in the future, perhaps by moving most of that logic into
-@code{expand_assignment}.
-
-@node Cleanups
-@subsubsection Cleanups
-@cindex Cleanups
-
-Destructors for local C++ objects and similar dynamic cleanups are
-represented in GIMPLE by a @code{TRY_FINALLY_EXPR}.
-@code{TRY_FINALLY_EXPR} has two operands, both of which are a sequence
-of statements to execute. The first sequence is executed. When it
-completes the second sequence is executed.
-
-The first sequence may complete in the following ways:
-
-@enumerate
-
-@item Execute the last statement in the sequence and fall off the
-end.
-
-@item Execute a goto statement (@code{GOTO_EXPR}) to an ordinary
-label outside the sequence.
-
-@item Execute a return statement (@code{RETURN_EXPR}).
-
-@item Throw an exception. This is currently not explicitly represented in
-GIMPLE.
-
-@end enumerate
-
-The second sequence is not executed if the first sequence completes by
-calling @code{setjmp} or @code{exit} or any other function that does
-not return. The second sequence is also not executed if the first
-sequence completes via a non-local goto or a computed goto (in general
-the compiler does not know whether such a goto statement exits the
-first sequence or not, so we assume that it doesn't).
-
-After the second sequence is executed, if it completes normally by
-falling off the end, execution continues wherever the first sequence
-would have continued, by falling off the end, or doing a goto, etc.
-
-@code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup
-needs to appear on every edge out of the controlled block; this
-reduces the freedom to move code across these edges. Therefore, the
-EH lowering pass which runs before most of the optimization passes
-eliminates these expressions by explicitly adding the cleanup to each
-edge. Rethrowing the exception is represented using @code{RESX_EXPR}.
-
-
-@node GIMPLE Exception Handling
-@subsubsection Exception Handling
-@cindex GIMPLE Exception Handling
-
-Other exception handling constructs are represented using
-@code{TRY_CATCH_EXPR}. @code{TRY_CATCH_EXPR} has two operands. The
-first operand is a sequence of statements to execute. If executing
-these statements does not throw an exception, then the second operand
-is ignored. Otherwise, if an exception is thrown, then the second
-operand of the @code{TRY_CATCH_EXPR} is checked. The second operand
-may have the following forms:
-
-@enumerate
-
-@item A sequence of statements to execute. When an exception occurs,
-these statements are executed, and then the exception is rethrown.
-
-@item A sequence of @code{CATCH_EXPR} expressions. Each @code{CATCH_EXPR}
-has a list of applicable exception types and handler code. If the
-thrown exception matches one of the caught types, the associated
-handler code is executed. If the handler code falls off the bottom,
-execution continues after the original @code{TRY_CATCH_EXPR}.
-
-@item An @code{EH_FILTER_EXPR} expression. This has a list of
-permitted exception types, and code to handle a match failure. If the
-thrown exception does not match one of the allowed types, the
-associated match failure code is executed. If the thrown exception
-does match, it continues unwinding the stack looking for the next
-handler.
-
-@end enumerate
-
-Currently throwing an exception is not directly represented in GIMPLE,
-since it is implemented by calling a function. At some point in the future
-we will want to add some way to express that the call will throw an
-exception of a known type.
-
-Just before running the optimizers, the compiler lowers the high-level
-EH constructs above into a set of @samp{goto}s, magic labels, and EH
-regions. Continuing to unwind at the end of a cleanup is represented
-with a @code{RESX_EXPR}.
-
-@node GIMPLE Example
-@subsection GIMPLE Example
-@cindex GIMPLE Example
-
-@smallexample
-struct A @{ A(); ~A(); @};
-
-int i;
-int g();
-void f()
-@{
- A a;
- int j = (--i, i ? 0 : 1);
-
- for (int x = 42; x > 0; --x)
- @{
- i += g()*4 + 32;
- @}
-@}
-@end smallexample
-
-becomes
-
-@smallexample
-void f()
-@{
- int i.0;
- int T.1;
- int iftmp.2;
- int T.3;
- int T.4;
- int T.5;
- int T.6;
-
- @{
- struct A a;
- int j;
-
- __comp_ctor (&a);
- try
- @{
- i.0 = i;
- T.1 = i.0 - 1;
- i = T.1;
- i.0 = i;
- if (i.0 == 0)
- iftmp.2 = 1;
- else
- iftmp.2 = 0;
- j = iftmp.2;
- @{
- int x;
-
- x = 42;
- goto test;
- loop:;
-
- T.3 = g ();
- T.4 = T.3 * 4;
- i.0 = i;
- T.5 = T.4 + i.0;
- T.6 = T.5 + 32;
- i = T.6;
- x = x - 1;
-
- test:;
- if (x > 0)
- goto loop;
- else
- goto break_;
- break_:;
- @}
- @}
- finally
- @{
- __comp_dtor (&a);
- @}
- @}
-@}
-@end smallexample
-
-@node Rough GIMPLE Grammar
-@subsection Rough GIMPLE Grammar
-@cindex Rough GIMPLE Grammar
-
-@smallexample
- function : FUNCTION_DECL
- DECL_SAVED_TREE -> compound-stmt
-
- compound-stmt: STATEMENT_LIST
- members -> stmt
-
- stmt : block
- | if-stmt
- | switch-stmt
- | goto-stmt
- | return-stmt
- | resx-stmt
- | label-stmt
- | try-stmt
- | modify-stmt
- | call-stmt
-
- block : BIND_EXPR
- BIND_EXPR_VARS -> chain of DECLs
- BIND_EXPR_BLOCK -> BLOCK
- BIND_EXPR_BODY -> compound-stmt
-
- if-stmt : COND_EXPR
- op0 -> condition
- op1 -> compound-stmt
- op2 -> compound-stmt
-
- switch-stmt : SWITCH_EXPR
- op0 -> val
- op1 -> NULL
- op2 -> TREE_VEC of CASE_LABEL_EXPRs
- The CASE_LABEL_EXPRs are sorted by CASE_LOW,
- and default is last.
-
- goto-stmt : GOTO_EXPR
- op0 -> LABEL_DECL | val
-
- return-stmt : RETURN_EXPR
- op0 -> return-value
-
- return-value : NULL
- | RESULT_DECL
- | MODIFY_EXPR
- op0 -> RESULT_DECL
- op1 -> lhs
-
- resx-stmt : RESX_EXPR
-
- label-stmt : LABEL_EXPR
- op0 -> LABEL_DECL
-
- try-stmt : TRY_CATCH_EXPR
- op0 -> compound-stmt
- op1 -> handler
- | TRY_FINALLY_EXPR
- op0 -> compound-stmt
- op1 -> compound-stmt
-
- handler : catch-seq
- | EH_FILTER_EXPR
- | compound-stmt
-
- catch-seq : STATEMENT_LIST
- members -> CATCH_EXPR
-
- modify-stmt : MODIFY_EXPR
- op0 -> lhs
- op1 -> rhs
-
- call-stmt : CALL_EXPR
- op0 -> val | OBJ_TYPE_REF
- op1 -> call-arg-list
-
- call-arg-list: TREE_LIST
- members -> lhs | CONST
-
- addr-expr-arg: ID
- | compref
-
- addressable : addr-expr-arg
- | indirectref
-
- with-size-arg: addressable
- | call-stmt
-
- indirectref : INDIRECT_REF
- op0 -> val
-
- lhs : addressable
- | bitfieldref
- | WITH_SIZE_EXPR
- op0 -> with-size-arg
- op1 -> val
-
- min-lval : ID
- | indirectref
-
- bitfieldref : BIT_FIELD_REF
- op0 -> inner-compref
- op1 -> CONST
- op2 -> val
-
- compref : inner-compref
- | TARGET_MEM_REF
- op0 -> ID
- op1 -> val
- op2 -> val
- op3 -> CONST
- op4 -> CONST
- | REALPART_EXPR
- op0 -> inner-compref
- | IMAGPART_EXPR
- op0 -> inner-compref
-
- inner-compref: min-lval
- | COMPONENT_REF
- op0 -> inner-compref
- op1 -> FIELD_DECL
- op2 -> val
- | ARRAY_REF
- op0 -> inner-compref
- op1 -> val
- op2 -> val
- op3 -> val
- | ARRAY_RANGE_REF
- op0 -> inner-compref
- op1 -> val
- op2 -> val
- op3 -> val
- | VIEW_CONVERT_EXPR
- op0 -> inner-compref
-
- condition : val
- | RELOP
- op0 -> val
- op1 -> val
-
- val : ID
- | invariant ADDR_EXPR
- op0 -> addr-expr-arg
- | CONST
-
- rhs : lhs
- | CONST
- | call-stmt
- | ADDR_EXPR
- op0 -> addr-expr-arg
- | UNOP
- op0 -> val
- | BINOP
- op0 -> val
- op1 -> val
- | RELOP
- op0 -> val
- op1 -> val
- | COND_EXPR
- op0 -> condition
- op1 -> val
- op2 -> val
-@end smallexample
-
@node Annotations
@section Annotations
@cindex annotations
-The optimizers need to associate attributes with statements and
-variables during the optimization process. For instance, we need to
-know what basic block a statement belongs to or whether a variable
-has aliases. All these attributes are stored in data structures
-called annotations which are then linked to the field @code{ann} in
-@code{struct tree_common}.
+The optimizers need to associate attributes with variables during the
+optimization process. For instance, we need to know whether a
+variable has aliases. All these attributes are stored in data
+structures called annotations which are then linked to the field
+@code{ann} in @code{struct tree_common}.
-Presently, we define annotations for statements (@code{stmt_ann_t}),
-variables (@code{var_ann_t}) and SSA names (@code{ssa_name_ann_t}).
+Presently, we define annotations for variables (@code{var_ann_t}).
Annotations are defined and documented in @file{tree-flow.h}.
-@node Statement Operands
-@section Statement Operands
+@node SSA Operands
+@section SSA Operands
@cindex operands
@cindex virtual operands
@cindex real operands