aboutsummaryrefslogtreecommitdiff
path: root/gcc/doc/gty.texi
blob: 2dc5b86dce3a467b0f9807d1a3617ec210f0b3e8 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
@c Copyright (C) 2002, 2003, 2004
@c Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@node Type Information
@chapter Memory Management and Type Information
@cindex GGC
@findex GTY

GCC uses some fairly sophisticated memory management techniques, which
involve determining information about GCC's data structures from GCC's
source code and using this information to perform garbage collection and
implement precompiled headers.

A full C parser would be too overcomplicated for this task, so a limited
subset of C is interpreted and special markers are used to determine
what parts of the source to look at.  The parser can also detect
simple typedefs of the form @code{typedef struct ID1 *ID2;} and
@code{typedef int ID3;}, and these don't need to be specially marked.

The two forms that do need to be marked are:
@verbatim
struct ID1 GTY(([options]))
{
  [fields]
};

typedef struct ID2 GTY(([options]))
{
  [fields]
} ID3;
@end verbatim

@menu
* GTY Options::		What goes inside a @code{GTY(())}.
* GGC Roots::		Making global variables GGC roots.
* Files::		How the generated files work.
@end menu

@node GTY Options
@section The Inside of a @code{GTY(())}

Sometimes the C code is not enough to fully describe the type structure.
Extra information can be provided by using more @code{GTY} markers.
These markers can be placed:
@itemize @bullet
@item
In a structure definition, before the open brace;
@item
In a global variable declaration, after the keyword @code{static} or
@code{extern}; and
@item
In a structure field definition, before the name of the field.
@end itemize

The format of a marker is
@verbatim
GTY (([name] ([param]), [name] ([param]) ...))
@end verbatim
The parameter is either a string or a type name.

When the parameter is a string, often it is a fragment of C code.  Three
special escapes may be available:

@cindex % in GTY option
@table @code
@item %h
This expands to an expression that evaluates to the current structure.
@item %1
This expands to an expression that evaluates to the structure that
immediately contains the current structure.
@item %0
This expands to an expression that evaluates to the outermost structure
that contains the current structure.
@item %a
This expands to the string of the form @code{[i1][i2]...} that indexes
the array item currently being marked.  For instance, if the field
being marked is @code{foo}, then @code{%1.foo%a} is the same as @code{%h}.
@end table

The available options are:

@table @code
@findex length
@item length

There are two places the type machinery will need to be explicitly told
the length of an array.  The first case is when a structure ends in a
variable-length array, like this:
@verbatim
struct rtvec_def GTY(()) {
  int num_elem;		/* number of elements */
  rtx GTY ((length ("%h.num_elem"))) elem[1];
};
@end verbatim
In this case, the @code{length} option is used to override the specified
array length (which should usually be @code{1}).  The parameter of the
option is a fragment of C code that calculates the length.

The second case is when a structure or a global variable contains a
pointer to an array, like this:
@smallexample
tree *
  GTY ((length ("%h.regno_pointer_align_length"))) regno_decl;
@end smallexample
In this case, @code{regno_decl} has been allocated by writing something like
@smallexample
  x->regno_decl =
    ggc_alloc (x->regno_pointer_align_length * sizeof (tree));
@end smallexample
and the @code{length} provides the length of the field.

This second use of @code{length} also works on global variables, like:
@verbatim
  static GTY((length ("reg_base_value_size")))
    rtx *reg_base_value;
@end verbatim

@findex skip
@item skip

If @code{skip} is applied to a field, the type machinery will ignore it.
This is somewhat dangerous; the only safe use is in a union when one
field really isn't ever used.

@findex desc
@findex tag
@findex default
@item desc
@itemx tag
@itemx default

The type machinery needs to be told which field of a @code{union} is
currently active.  This is done by giving each field a constant
@code{tag} value, and then specifying a discriminator using @code{desc}.
The value of the expression given by @code{desc} is compared against
each @code{tag} value, each of which should be different.  If no
@code{tag} is matched, the field marked with @code{default} is used if
there is one, otherwise no field in the union will be marked.

In the @code{desc} option, the ``current structure'' is the union that
it discriminates.  Use @code{%1} to mean the structure containing it.
(There are no escapes available to the @code{tag} option, since it's
supposed to be a constant.)

For example,
@smallexample
struct tree_binding GTY(())
@{
  struct tree_common common;
  union tree_binding_u @{
    tree GTY ((tag ("0"))) scope;
    struct cp_binding_level * GTY ((tag ("1"))) level;
  @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
  tree value;
@};
@end smallexample

In this example, the value of BINDING_HAS_LEVEL_P when applied to a
@code{struct tree_binding *} is presumed to be 0 or 1.  If 1, the type
mechanism will treat the field @code{level} as being present and if 0,
will treat the field @code{scope} as being present.

@findex param_is
@findex use_param
@item param_is
@itemx use_param

Sometimes it's convenient to define some data structure to work on
generic pointers (that is, @code{PTR}) and then use it with a specific
type.  @code{param_is} specifies the real type pointed to, and
@code{use_param} says where in the generic data structure that type
should be put.

For instance, to have a @code{htab_t} that points to trees, one should write
@verbatim
  htab_t GTY ((param_is (union tree_node))) ict;
@end verbatim

@findex param@var{n}_is
@findex use_param@var{n}
@item param@var{n}_is
@itemx use_param@var{n}

In more complicated cases, the data structure might need to work on
several different types, which might not necessarily all be pointers.
For this, @code{param1_is} through @code{param9_is} may be used to
specify the real type of a field identified by @code{use_param1} through
@code{use_param9}.

@findex use_params
@item use_params

When a structure contains another structure that is parameterized,
there's no need to do anything special, the inner structure inherits the
parameters of the outer one.  When a structure contains a pointer to a
parameterized structure, the type machinery won't automatically detect
this (it could, it just doesn't yet), so it's necessary to tell it that
the pointed-to structure should use the same parameters as the outer
structure.  This is done by marking the pointer with the
@code{use_params} option.

@findex deletable
@item deletable

@code{deletable}, when applied to a global variable, indicates that when
garbage collection runs, there's no need to mark anything pointed to
by this variable, it can just be set to @code{NULL} instead.  This is used
to keep a list of free structures around for re-use.

@findex if_marked
@item if_marked

Suppose you want some kinds of object to be unique, and so you put them
in a hash table.  If garbage collection marks the hash table, these
objects will never be freed, even if the last other reference to them
goes away.  GGC has special handling to deal with this: if you use the
@code{if_marked} option on a global hash table, GGC will call the
routine whose name is the parameter to the option on each hash table
entry.  If the routine returns nonzero, the hash table entry will
be marked as usual.  If the routine returns zero, the hash table entry
will be deleted.

The routine @code{ggc_marked_p} can be used to determine if an element
has been marked already; in fact, the usual case is to use
@code{if_marked ("ggc_marked_p")}.

@findex maybe_undef
@item maybe_undef

When applied to a field, @code{maybe_undef} indicates that it's OK if
the structure that this fields points to is never defined, so long as
this field is always @code{NULL}.  This is used to avoid requiring
backends to define certain optional structures.  It doesn't work with
language frontends.

@findex chain_next
@findex chain_prev
@item chain_next
@itemx chain_prev

It's helpful for the type machinery to know if objects are often
chained together in long lists; this lets it generate code that uses
less stack space by iterating along the list instead of recursing down
it.  @code{chain_next} is an expression for the next item in the list,
@code{chain_prev} is an expression for the previous item.  The
machinery requires that taking the next item of the previous item
gives the original item.

@findex reorder
@item reorder

Some data structures depend on the relative ordering of pointers.  If
the type machinery needs to change that ordering, it will call the
function referenced by the @code{reorder} option, before changing the
pointers in the object that's pointed to by the field the option
applies to.  The function must be of the type @code{void ()(void *,
void *, gt_pointer_operator, void *)}.  The second parameter is the
pointed-to object; the third parameter is a routine that, given a
pointer, can update it to its new value.  The fourth parameter is a
cookie to be passed to the third parameter.  The first parameter is
the structure that contains the object, or the object itself if it is
a structure.

No data structure may depend on the absolute value of pointers.  Even
relying on relative orderings and using @code{reorder} functions can
be expensive.  It is better to depend on properties of the data, like
an ID number or the hash of a string instead.

@findex special
@item special

The @code{special} option is used for those bizarre cases that are just
too hard to deal with otherwise.  Don't use it for new code.

@end table

@node GGC Roots
@section Marking Roots for the Garbage Collector
@cindex roots, marking
@cindex marking roots

In addition to keeping track of types, the type machinery also locates
the global variables that the garbage collector starts at.  There are
two syntaxes it accepts to indicate a root:

@enumerate
@item
@verb{|extern GTY (([options])) [type] ID;|}
@item
@verb{|static GTY (([options])) [type] ID;|}
@end enumerate

These are the only syntaxes that are accepted.  In particular, if you
want to mark a variable that is only declared as
@verbatim
int ID;
@end verbatim
or similar, you should either make it @code{static} or you should create
a @code{extern} declaration in a header file somewhere.

@node Files
@section Source Files Containing Type Information
@cindex generated files
@cindex files, generated

Whenever you add @code{GTY} markers to a new source file, there are three
things you need to do:

@enumerate
@item
You need to add the file to the list of source files the type
machinery scans.  There are three cases:

@enumerate a
@item
For a back-end file, this is usually done
automatically; if not, you should add it to @code{target_gtfiles} in
the appropriate port's entries in @file{config.gcc}.

@item
For files shared by all front ends, this is done by adding the
filename to the @code{GTFILES} variable in @file{Makefile.in}.

@item
For any other file used by a front end, this is done by adding the
filename to the @code{gtfiles} variable defined in
@file{config-lang.in}.  For C, the file is @file{c-config-lang.in}.
This list should include all files that have GTY macros in them that
are used in that front end, other than those defined in the previous
list items.  For example, it is common for front end writers to use
@file{c-common.c} and other files from the C front end, and these
should be included in the @file{gtfiles} variable for such front ends.

@end enumerate

@item
If the file was a header file, you'll need to check that it's included
in the right place to be visible to the generated files.  For a back-end
header file, this should be done automatically.  For a front-end header
file, it needs to be included by the same file that includes
@file{gtype-@var{lang}.h}.  For other header files, it needs to be
included in @file{gtype-desc.c}, which is a generated file, so add it to
@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.

For source files that aren't header files, the machinery will generate a
header file that should be included in the source file you just changed.
The file will be called @file{gt-@var{path}.h} where @var{path} is the
pathname relative to the @file{gcc} directory with slashes replaced by
@verb{|-|}, so for example the header file to be included in
@file{objc/objc-parse.c} is called @file{gt-objc-objc-parse.c}.  The
generated header file should be included after everything else in the
source file.  Don't forget to mention this file as a dependency in the
@file{Makefile}!

@item
If a new @file{gt-@var{path}.h} file is needed, you need to arrange to
add a @file{Makefile} rule that will ensure this file can be built.
This is done by making it a dependency of @code{s-gtype}, like this:
@verbatim
gt-path.h : s-gtype ; @true
@end verbatim
@end enumerate

For language frontends, there is another file that needs to be included
somewhere.  It will be called @file{gtype-@var{lang}.h}, where
@var{lang} is the name of the subdirectory the language is contained in.
It will need @file{Makefile} rules just like the other generated files.