From 9733ab3f56072534b447188f48d3d5bc9911189e Mon Sep 17 00:00:00 2001 From: Stan Shebs Date: Fri, 16 Apr 1999 01:34:49 +0000 Subject: Initial creation of sourceware repository --- gdb/doc/stabs.info-2 | 1286 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1286 insertions(+) create mode 100644 gdb/doc/stabs.info-2 (limited to 'gdb/doc/stabs.info-2') diff --git a/gdb/doc/stabs.info-2 b/gdb/doc/stabs.info-2 new file mode 100644 index 0000000..7e4e40e --- /dev/null +++ b/gdb/doc/stabs.info-2 @@ -0,0 +1,1286 @@ +This is Info file stabs.info, produced by Makeinfo version 1.68 from +the input file ./stabs.texinfo. + +START-INFO-DIR-ENTRY +* Stabs: (stabs). The "stabs" debugging information format. +END-INFO-DIR-ENTRY + + This document describes the stabs debugging symbol tables. + + Copyright 1992, 93, 94, 95, 97, 1998 Free Software Foundation, Inc. +Contributed by Cygnus Support. Written by Julia Menapace, Jim Kingdon, +and David MacKenzie. + + Permission is granted to make and distribute verbatim copies of this +manual provided the copyright notice and this permission notice are +preserved on all copies. + + Permission is granted to copy or distribute modified versions of this +manual under the terms of the GPL (for which purpose this text may be +regarded as a program in the language TeX). + + +File: stabs.info, Node: Conformant Arrays, Prev: Reference Parameters, Up: Parameters + +Passing Conformant Array Parameters +----------------------------------- + + Conformant arrays are a feature of Modula-2, and perhaps other +languages, in which the size of an array parameter is not known to the +called function until run-time. Such parameters have two stabs: a `x' +for the array itself, and a `C', which represents the size of the +array. The value of the `x' stab is the offset in the argument list +where the address of the array is stored (it this right? 
it is a +guess); the value of the `C' stab is the offset in the argument list +where the size of the array (in elements? in bytes?) is stored. + + +File: stabs.info, Node: Types, Next: Symbol Tables, Prev: Variables, Up: Top + +Defining Types +************** + + The examples so far have described types as references to previously +defined types, or defined in terms of subranges of or pointers to +previously defined types. This chapter describes the other type +descriptors that may follow the `=' in a type definition. + +* Menu: + +* Builtin Types:: Integers, floating point, void, etc. +* Miscellaneous Types:: Pointers, sets, files, etc. +* Cross-References:: Referring to a type not yet defined. +* Subranges:: A type with a specific range. +* Arrays:: An aggregate type of same-typed elements. +* Strings:: Like an array but also has a length. +* Enumerations:: Like an integer but the values have names. +* Structures:: An aggregate type of different-typed elements. +* Typedefs:: Giving a type a name. +* Unions:: Different types sharing storage. +* Function Types:: + + +File: stabs.info, Node: Builtin Types, Next: Miscellaneous Types, Up: Types + +Builtin Types +============= + + Certain types are built in (`int', `short', `void', `float', etc.); +the debugger recognizes these types and knows how to handle them. +Thus, don't be surprised if some of the following ways of specifying +builtin types do not specify everything that a debugger would need to +know about the type--in some cases they merely specify enough +information to distinguish the type from other types. + + The traditional way to define builtin types is convoluted, so new +ways have been invented to describe them. Sun's `acc' uses special +builtin type descriptors (`b' and `R'), and IBM uses negative type +numbers. GDB accepts all three ways, as of version 4.8; dbx just +accepts the traditional builtin types and perhaps one of the other two +formats. The following sections describe each of these formats. 
+ +* Menu: + +* Traditional Builtin Types:: Put on your seatbelts and prepare for kludgery +* Builtin Type Descriptors:: Builtin types with special type descriptors +* Negative Type Numbers:: Builtin types using negative type numbers + + +File: stabs.info, Node: Traditional Builtin Types, Next: Builtin Type Descriptors, Up: Builtin Types + +Traditional Builtin Types +------------------------- + + This is the traditional, convoluted method for defining builtin +types. There are several classes of such type definitions: integer, +floating point, and `void'. + +* Menu: + +* Traditional Integer Types:: +* Traditional Other Types:: + + +File: stabs.info, Node: Traditional Integer Types, Next: Traditional Other Types, Up: Traditional Builtin Types + +Traditional Integer Types +......................... + + Often types are defined as subranges of themselves. If the bounding +values fit within an `int', then they are given normally. For example: + + .stabs "int:t1=r1;-2147483648;2147483647;",128,0,0,0 # 128 is N_LSYM + .stabs "char:t2=r2;0;127;",128,0,0,0 + + Builtin types can also be described as subranges of `int': + + .stabs "unsigned short:t6=r1;0;65535;",128,0,0,0 + + If the lower bound of a subrange is 0 and the upper bound is -1, the +type is an unsigned integral type whose bounds are too big to describe +in an `int'. Traditionally this is only used for `unsigned int' and +`unsigned long': + + .stabs "unsigned int:t4=r1;0;-1;",128,0,0,0 + + For larger types, GCC 2.4.5 puts out bounds in octal, with one or +more leading zeroes. In this case a negative bound consists of a number +which is a 1 bit (for the sign bit) followed by a 0 bit for each bit in +the number (except the sign bit), and a positive bound is one which is a +1 bit for each bit in the number (except possibly the sign bit). 
All +known versions of dbx and GDB version 4 accept this (at least in the +sense of not refusing to process the file), but GDB 3.5 refuses to read +the whole file containing such symbols. So GCC 2.3.3 did not output the +proper size for these types. As an example of octal bounds, the string +fields of the stabs for 64 bit integer types look like: + + long int:t3=r1;001000000000000000000000;000777777777777777777777; + long unsigned int:t5=r1;000000000000000000000000;001777777777777777777777; + + If the lower bound of a subrange is 0 and the upper bound is +negative, the type is an unsigned integral type whose size in bytes is +the absolute value of the upper bound. I believe this is a Convex +convention for `unsigned long long'. + + If the lower bound of a subrange is negative and the upper bound is +0, the type is a signed integral type whose size in bytes is the +absolute value of the lower bound. I believe this is a Convex +convention for `long long'. To distinguish this from a legitimate +subrange, the type should be a subrange of itself. I'm not sure whether +this is the case for Convex. + + +File: stabs.info, Node: Traditional Other Types, Prev: Traditional Integer Types, Up: Traditional Builtin Types + +Traditional Other Types +....................... + + If the upper bound of a subrange is 0 and the lower bound is +positive, the type is a floating point type, and the lower bound of the +subrange indicates the number of bytes in the type: + + .stabs "float:t12=r1;4;0;",128,0,0,0 + .stabs "double:t13=r1;8;0;",128,0,0,0 + + However, GCC writes `long double' the same way it writes `double', +so there is no way to distinguish. + + .stabs "long double:t14=r1;8;0;",128,0,0,0 + + Complex types are defined the same way as floating-point types; +there is no way to distinguish a single-precision complex from a +double-precision floating-point type. 
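The bound conventions described above for traditional integer and floating point types amount to a small decision procedure. The Python sketch below is one way to write it down. The names `parse_bound` and `classify` are hypothetical, and the order of the checks (testing `0;-1` before the Convex size convention) is an assumption, since the text does not say how that overlap should be resolved. The sketch also ignores the requirement that the Convex forms be subranges of themselves, and it only flags octal bounds rather than decoding them.

```python
def parse_bound(text):
    """Return (value, is_octal) for one subrange bound.

    A bound with more than one digit and a leading zero is treated as
    octal, following the GCC 2.4.5 convention described in the text.
    """
    is_octal = len(text) > 1 and text.startswith("0")
    return int(text, 8 if is_octal else 10), is_octal


def classify(lower_text, upper_text):
    """Guess what a traditional `rN;LOWER;UPPER;` definition describes."""
    lower, lower_octal = parse_bound(lower_text)
    upper, upper_octal = parse_bound(upper_text)
    if lower_octal or upper_octal:
        return "integer with octal bounds"
    if lower > 0 and upper == 0:
        # The lower bound is the size in bytes, as in "float:t12=r1;4;0;".
        return f"floating point, {lower} bytes"
    if lower == 0 and upper == -1:
        return "unsigned integer, bounds too big for int"
    if lower == 0 and upper < 0:
        # Believed to be a Convex convention (assumed ordering, see above).
        return f"unsigned integer, {-upper} bytes"
    if lower < 0 and upper == 0:
        return f"signed integer, {-lower} bytes"
    return f"integer range {lower}..{upper}"
```

For instance, `classify("4", "0")` reports a 4-byte floating point type, which matches the `float` stab shown earlier. As the text notes, the same result is produced for a single-precision complex type, so the classification is only as precise as the encoding allows.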
+ + The C `void' type is defined as itself: + + .stabs "void:t15=15",128,0,0,0 + + I'm not sure how a boolean type is represented. + + +File: stabs.info, Node: Builtin Type Descriptors, Next: Negative Type Numbers, Prev: Traditional Builtin Types, Up: Builtin Types + +Defining Builtin Types Using Builtin Type Descriptors +----------------------------------------------------- + + This is the method used by Sun's `acc' for defining builtin types. +These are the type descriptors to define builtin types: + +`b SIGNED CHAR-FLAG WIDTH ; OFFSET ; NBITS ;' + Define an integral type. SIGNED is `u' for unsigned or `s' for + signed. CHAR-FLAG is `c' which indicates this is a character + type, or is omitted. I assume this is to distinguish an integral + type from a character type of the same size, for example it might + make sense to set it for the C type `wchar_t' so the debugger can + print such variables differently (Solaris does not do this). Sun + sets it on the C types `signed char' and `unsigned char' which + arguably is wrong. WIDTH and OFFSET appear to be for small + objects stored in larger ones, for example a `short' in an `int' + register. WIDTH is normally the number of bytes in the type. + OFFSET seems to always be zero. NBITS is the number of bits in + the type. + + Note that type descriptor `b' used for builtin types conflicts with + its use for Pascal space types (*note Miscellaneous Types::.); + they can be distinguished because the character following the type + descriptor will be a digit, `(', or `-' for a Pascal space type, or + `u' or `s' for a builtin type. + +`w' + Documented by AIX to define a wide character type, but their + compiler actually uses negative type numbers (*note Negative Type + Numbers::.). + +`R FP-TYPE ; BYTES ;' + Define a floating point type. FP-TYPE has one of the following + values: + + `1 (NF_SINGLE)' + IEEE 32-bit (single precision) floating point format. + + `2 (NF_DOUBLE)' + IEEE 64-bit (double precision) floating point format. 
+ + `3 (NF_COMPLEX)' + + `4 (NF_COMPLEX16)' + + `5 (NF_COMPLEX32)' + These are for complex numbers. A comment in the GDB source + describes them as Fortran `complex', `double complex', and + `complex*16', respectively, but what does that mean? (i.e., + Single precision? Double precision?). + + `6 (NF_LDOUBLE)' + Long double. This should probably only be used for Sun format + `long double', and new codes should be used for other floating + point formats (`NF_DOUBLE' can be used if a `long double' is + really just an IEEE double, of course). + + BYTES is the number of bytes occupied by the type. This allows a + debugger to perform some operations with the type even if it + doesn't understand FP-TYPE. + +`g TYPE-INFORMATION ; NBITS' + Documented by AIX to define a floating type, but their compiler + actually uses negative type numbers (*note Negative Type + Numbers::.). + +`c TYPE-INFORMATION ; NBITS' + Documented by AIX to define a complex type, but their compiler + actually uses negative type numbers (*note Negative Type + Numbers::.). + + The C `void' type is defined as a signed integral type 0 bits long: + .stabs "void:t19=bs0;0;0",128,0,0,0 + The Solaris compiler seems to omit the trailing semicolon in this +case. Getting sloppy in this way is not a swift move because if a type +is embedded in a more complex expression it is necessary to be able to +tell where it ends. + + I'm not sure how a boolean type is represented. + + +File: stabs.info, Node: Negative Type Numbers, Prev: Builtin Type Descriptors, Up: Builtin Types + +Negative Type Numbers +--------------------- + + This is the method used in XCOFF for defining builtin types. Since +the debugger knows about the builtin types anyway, the idea of negative +type numbers is simply to give a special type number which indicates +the builtin type. There is no stab defining these types. + + There are several subtle issues with negative type numbers. + + One is the size of the type. 
A builtin type (for example the C types +`int' or `long') might have different sizes depending on compiler +options, the target architecture, the ABI, etc. This issue doesn't +come up for IBM tools since (so far) they just target the RS/6000; the +sizes indicated below for each type are what the IBM RS/6000 tools use. +To deal with differing sizes, either define separate negative type +numbers for each size (which works but requires changing the debugger, +and, unless you get both AIX dbx and GDB to accept the change, +introduces an incompatibility), or use a type attribute (*note String +Field::.) to define a new type with the appropriate size (which merely +requires a debugger which understands type attributes, like AIX dbx or +GDB). For example, + + .stabs "boolean:t10=@s8;-16",128,0,0,0 + + defines an 8-bit boolean type, and + + .stabs "boolean:t10=@s64;-16",128,0,0,0 + + defines a 64-bit boolean type. + + A similar issue is the format of the type. This comes up most often +for floating-point types, which could have various formats (particularly +extended doubles, which vary quite a bit even among IEEE systems). +Again, it is best to define a new negative type number for each +different format; changing the format based on the target system has +various problems. One such problem is that the Alpha has both VAX and +IEEE floating types. One can easily imagine one library using the VAX +types and another library in the same executable using the IEEE types. +Another example is that the interpretation of whether a boolean is true +or false can be based on the least significant bit, most significant +bit, whether it is zero, etc., and different compilers (or different +options to the same compiler) might provide different kinds of boolean. + + The last major issue is the names of the types. The name of a given +type depends *only* on the negative type number given; these do not +vary depending on the language, the target system, or anything else. 
+One can always define separate type numbers--in the following list you +will see for example separate `int' and `integer*4' types which are +identical except for the name. But compatibility can be maintained by +not inventing new negative type numbers and instead just defining a new +type with a new name. For example: + + .stabs "CARDINAL:t10=-8",128,0,0,0 + + Here is the list of negative type numbers. The phrase "integral +type" is used to mean twos-complement (I strongly suspect that all +machines which use stabs use twos-complement; most machines use +twos-complement these days). + +`-1' + `int', 32 bit signed integral type. + +`-2' + `char', 8 bit type holding a character. Both GDB and dbx on AIX + treat this as signed. GCC uses this type whether `char' is signed + or not, which seems like a bad idea. The AIX compiler (`xlc') + seems to avoid this type; it uses -5 instead for `char'. + +`-3' + `short', 16 bit signed integral type. + +`-4' + `long', 32 bit signed integral type. + +`-5' + `unsigned char', 8 bit unsigned integral type. + +`-6' + `signed char', 8 bit signed integral type. + +`-7' + `unsigned short', 16 bit unsigned integral type. + +`-8' + `unsigned int', 32 bit unsigned integral type. + +`-9' + `unsigned', 32 bit unsigned integral type. + +`-10' + `unsigned long', 32 bit unsigned integral type. + +`-11' + `void', type indicating the lack of a value. + +`-12' + `float', IEEE single precision. + +`-13' + `double', IEEE double precision. + +`-14' + `long double', IEEE double precision. The compiler claims the size + will increase in a future release, and for binary compatibility + you have to avoid using `long double'. I hope when they increase + it they use a new negative type number. + +`-15' + `integer'. 32 bit signed integral type. + +`-16' + `boolean'. 32 bit type. GDB and GCC assume that zero is false, + one is true, and other values have unspecified meaning. I hope + this agrees with how the IBM tools use the type. + +`-17' + `short real'. 
IEEE single precision. + +`-18' + `real'. IEEE double precision. + +`-19' + `stringptr'. *Note Strings::. + +`-20' + `character', 8 bit unsigned character type. + +`-21' + `logical*1', 8 bit type. This Fortran type has a split + personality in that it is used for boolean variables, but can also + be used for unsigned integers. 0 is false, 1 is true, and other + values are non-boolean. + +`-22' + `logical*2', 16 bit type. This Fortran type has a split + personality in that it is used for boolean variables, but can also + be used for unsigned integers. 0 is false, 1 is true, and other + values are non-boolean. + +`-23' + `logical*4', 32 bit type. This Fortran type has a split + personality in that it is used for boolean variables, but can also + be used for unsigned integers. 0 is false, 1 is true, and other + values are non-boolean. + +`-24' + `logical', 32 bit type. This Fortran type has a split personality + in that it is used for boolean variables, but can also be used for + unsigned integers. 0 is false, 1 is true, and other values are + non-boolean. + +`-25' + `complex'. A complex type consisting of two IEEE single-precision + floating point values. + +`-26' + `complex'. A complex type consisting of two IEEE double-precision + floating point values. + +`-27' + `integer*1', 8 bit signed integral type. + +`-28' + `integer*2', 16 bit signed integral type. + +`-29' + `integer*4', 32 bit signed integral type. + +`-30' + `wchar'. Wide character, 16 bits wide, unsigned (what format? + Unicode?). + +`-31' + `long long', 64 bit signed integral type. + +`-32' + `unsigned long long', 64 bit unsigned integral type. + +`-33' + `logical*8', 64 bit unsigned integral type. + +`-34' + `integer*8', 64 bit signed integral type. + + +File: stabs.info, Node: Miscellaneous Types, Next: Cross-References, Prev: Builtin Types, Up: Types + +Miscellaneous Types +=================== + +`b TYPE-INFORMATION ; BYTES' + Pascal space type. This is documented by IBM; what does it mean? 
+ + This use of the `b' type descriptor can be distinguished from its + use for builtin integral types (*note Builtin Type Descriptors::.) + because the character following the type descriptor is always a + digit, `(', or `-'. + +`B TYPE-INFORMATION' + A volatile-qualified version of TYPE-INFORMATION. This is a Sun + extension. References and stores to a variable with a + volatile-qualified type must not be optimized or cached; they must + occur as the user specifies them. + +`d TYPE-INFORMATION' + File of type TYPE-INFORMATION. As far as I know this is only used + by Pascal. + +`k TYPE-INFORMATION' + A const-qualified version of TYPE-INFORMATION. This is a Sun + extension. A variable with a const-qualified type cannot be + modified. + +`M TYPE-INFORMATION ; LENGTH' + Multiple instance type. The type seems to be composed of LENGTH + repetitions of TYPE-INFORMATION, for example `character*3' is + represented by `M-2;3', where `-2' is a reference to a character + type (*note Negative Type Numbers::.). I'm not sure how this + differs from an array. This appears to be a Fortran feature. + LENGTH is a bound, like those in range types; see *Note + Subranges::. + +`S TYPE-INFORMATION' + Pascal set type. TYPE-INFORMATION must be a small type such as an + enumeration or a subrange, and the type is a bitmask whose length + is specified by the number of elements in TYPE-INFORMATION. + + In CHILL, if it is a bitstring instead of a set, also use the `S' + type attribute (*note String Field::.). + +`* TYPE-INFORMATION' + Pointer to TYPE-INFORMATION. + + +File: stabs.info, Node: Cross-References, Next: Subranges, Prev: Miscellaneous Types, Up: Types + +Cross-References to Other Types +=============================== + + A type can be used before it is defined; one common way to deal with +that situation is just to use a type reference to a type which has not +yet been defined. 
+ + Another way is with the `x' type descriptor, which is followed by +`s' for a structure tag, `u' for a union tag, or `e' for an enumerator +tag, followed by the name of the tag, followed by `:'. If the name +contains `::' between a `<' and `>' pair (for C++ templates), such a +`::' does not end the name--only a single `:' ends the name; see *Note +Nested Symbols::. + + For example, the following C declarations: + + struct foo; + struct foo *bar; + +produce: + + .stabs "bar:G16=*17=xsfoo:",32,0,0,0 + + Not all debuggers support the `x' type descriptor, so on some +machines GCC does not use it. I believe that for the above example it +would just emit a reference to type 17 and never define it, but I +haven't verified that. + + Modula-2 imported types, at least on AIX, use the `i' type +descriptor, which is followed by the name of the module from which the +type is imported, followed by `:', followed by the name of the type. +There is then optionally a comma followed by type information for the +type. This differs from merely naming the type (*note Typedefs::.) in +that it identifies the module; I don't understand whether the name of +the type given here is always just the same as the name we are giving +it, or whether this type descriptor is used with a nameless stab (*note +String Field::.), or what. The symbol ends with `;'. + + +File: stabs.info, Node: Subranges, Next: Arrays, Prev: Cross-References, Up: Types + +Subrange Types +============== + + The `r' type descriptor defines a type as a subrange of another +type. It is followed by type information for the type of which it is a +subrange, a semicolon, an integral lower bound, a semicolon, an +integral upper bound, and a semicolon. The AIX documentation does not +specify the trailing semicolon, in an effort to specify array indexes +more cleanly, but a subrange which is not an array index has always +included a trailing semicolon (*note Arrays::.). 
+ + Instead of an integer, either bound can be one of the following: + +`A OFFSET' + The bound is passed by reference on the stack at offset OFFSET + from the argument list. *Note Parameters::, for more information + on such offsets. + +`T OFFSET' + The bound is passed by value on the stack at offset OFFSET from + the argument list. + +`a REGISTER-NUMBER' + The bound is passed by reference in register number REGISTER-NUMBER. + +`t REGISTER-NUMBER' + The bound is passed by value in register number REGISTER-NUMBER. + +`J' + There is no bound. + + Subranges are also used for builtin types; see *Note Traditional +Builtin Types::. + + +File: stabs.info, Node: Arrays, Next: Strings, Prev: Subranges, Up: Types + +Array Types +=========== + + Arrays use the `a' type descriptor. Following the type descriptor +is the type of the index and the type of the array elements. If the +index type is a range type, it ends in a semicolon; otherwise (for +example, if it is a type reference), there does not appear to be any +way to tell where the types are separated. In an effort to clean up +this mess, IBM documents the two types as being separated by a +semicolon, and a range type as not ending in a semicolon (but this is +not right for range types which are not array indexes, *note +Subranges::.). I think probably the best solution is to specify that a +semicolon ends a range type, and that the index type and element type +of an array are separated by a semicolon, but that if the index type is +a range type, the extra semicolon can be omitted. GDB (at least +through version 4.9) doesn't support any kind of index type other than a +range anyway; I'm not sure about dbx. + + It is well established, and widely used, that the type of the index, +unlike most types found in the stabs, is merely a type definition, not +type information (*note String Field::.) (that is, it need not start +with `TYPE-NUMBER=' if it is defining a new type). 
According to a +comment in GDB, this is also true of the type of the array elements; it +gives `ar1;1;10;ar1;1;10;4' as a legitimate way to express a two +dimensional array. According to AIX documentation, the element type +must be type information. GDB accepts either. + + The type of the index is often a range type, expressed as the type +descriptor `r' and some parameters. It defines the size of the array. +In the example below, the range `r1;0;2;' defines an index type which +is a subrange of type 1 (integer), with a lower bound of 0 and an upper +bound of 2. This defines the valid range of subscripts of a +three-element C array. + + For example, the definition: + + char char_vec[3] = {'a','b','c'}; + +produces the output: + + .stabs "char_vec:G19=ar1;0;2;2",32,0,0,0 + .global _char_vec + .align 4 + _char_vec: + .byte 97 + .byte 98 + .byte 99 + + If an array is "packed", the elements are spaced more closely than +normal, saving memory at the expense of speed. For example, an array +of 3-byte objects might, if unpacked, have each element aligned on a +4-byte boundary, but if packed, have no padding. One way to specify +that something is packed is with type attributes (*note String +Field::.). In the case of arrays, another is to use the `P' type +descriptor instead of `a'. Other than specifying a packed array, `P' +is identical to `a'. + + An open array is represented by the `A' type descriptor followed by +type information specifying the type of the array elements. + + An N-dimensional dynamic array is represented by + + D DIMENSIONS ; TYPE-INFORMATION + + DIMENSIONS is the number of dimensions; TYPE-INFORMATION specifies +the type of the array elements. + + A subarray of an N-dimensional array is represented by + + E DIMENSIONS ; TYPE-INFORMATION + + DIMENSIONS is the number of dimensions; TYPE-INFORMATION specifies +the type of the array elements. 
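The common array form, an `a' descriptor whose index is a range type with integer bounds, can be taken apart mechanically. The Python sketch below assumes exactly that form and nothing else: the name `parse_array` and the returned dictionary layout are hypothetical, and the `P', `A', `D', and `E' descriptors, non-range index types, and symbolic bounds such as `J' are all out of scope. It recurses on the element type so that the two dimensional example quoted from the GDB comment is handled as well.

```python
import re

# "a", then an index range "rTYPE;LOWER;UPPER;", then the element type.
ARRAY_RE = re.compile(r"ar(\d+);(-?\d+);(-?\d+);(.+)")


def parse_array(text):
    """Split an array type such as "ar1;0;2;2" into its parts."""
    match = ARRAY_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a simple array type: {text!r}")
    index_type, lower, upper, element = match.groups()
    if element.startswith("a"):
        # The element type is itself an array definition.
        element = parse_array(element)
    return {
        "index_type": int(index_type),
        "lower": int(lower),
        "upper": int(upper),
        "length": int(upper) - int(lower) + 1,
        "element": element,
    }
```

Applied to the type part of the `char_vec' stab, `parse_array("ar1;0;2;2")` yields a length of 3 with element type `2`, matching the three-element `char` array in the C source.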
+ + +File: stabs.info, Node: Strings, Next: Enumerations, Prev: Arrays, Up: Types + +Strings +======= + + Some languages, like C or the original Pascal, do not have string +types; they just have related things like arrays of characters. But +most Pascals and various other languages have string types, which are +indicated as follows: + +`n TYPE-INFORMATION ; BYTES' + BYTES is the maximum length. I'm not sure what TYPE-INFORMATION + is; I suspect that it means that this is a string of + TYPE-INFORMATION (thus allowing a string of integers, a string of + wide characters, etc., as well as a string of characters). Not + sure what the format of this type is. This is an AIX feature. + +`z TYPE-INFORMATION ; BYTES' + Just like `n' except that this is a gstring, not an ordinary + string. I don't know the difference. + +`N' + Pascal Stringptr. What is this? This is an AIX feature. + + Languages such as CHILL, which have a string type which is basically +just an array of characters, use the `S' type attribute (*note String +Field::.). + + +File: stabs.info, Node: Enumerations, Next: Structures, Prev: Strings, Up: Types + +Enumerations +============ + + Enumerations are defined with the `e' type descriptor. + + The source line below declares an enumeration type at file scope. +The type definition is located after the `N_RBRAC' that marks the end of +the previous procedure's block scope, and before the `N_FUN' that marks +the beginning of the next procedure's block scope. Therefore it does +not describe a block local symbol, but a file local one. + + The source line: + + enum e_places {first,second=3,last}; + +generates the following stab: + + .stabs "e_places:T22=efirst:0,second:3,last:4,;",128,0,0,0 + + The symbol descriptor (`T') says that the stab describes a +structure, enumeration, or union tag. The type descriptor `e', +following the `22=' of the type definition narrows it down to an +enumeration type. Following the `e' is a list of the elements of the +enumeration. 
The format is `NAME:VALUE,'. The list of elements ends +with `;'. The fact that VALUE is specified as an integer can cause +problems if the value is large. GCC 2.5.2 tries to output it in octal +in that case with a leading zero, which is probably a good thing, +although GDB 4.11 supports octal only in cases where decimal is +perfectly good. Negative decimal values are supported by both GDB and +dbx. + + There is no standard way to specify the size of an enumeration type; +it is determined by the architecture (normally all enumeration types +are 32 bits). Type attributes can be used to specify an enumeration +type of another size for debuggers which support them; see *Note String +Field::. + + Enumeration types are unusual in that they define symbols for the +enumeration values (`first', `second', and `last' in the above +example), and even though these symbols are visible in the file as a +whole (rather than being in a more local namespace like structure +member names), they are defined in the type definition for the +enumeration type rather than each having their own symbol. In order to +be fast, GDB will only get symbols from such types (in its initial scan +of the stabs) if the type is the first thing defined after a `T' or `t' +symbol descriptor (the above example fulfills this requirement). If +the type does not have a name, the compiler should emit it in a +nameless stab (*note String Field::.); GCC does this. + + +File: stabs.info, Node: Structures, Next: Typedefs, Prev: Enumerations, Up: Types + +Structures +========== + + The encoding of structures in stabs can be shown with an example. + + The following source code declares a structure tag and defines an +instance of the structure in global scope. Then a `typedef' equates the +structure tag with a new type. Separate stabs are generated for the +structure tag, the structure `typedef', and the structure instance. The +stabs for the tag and the `typedef' are emitted when the definitions are +encountered. 
Since the structure elements are not initialized, the +stab and code for the structure variable itself are located at the end +of the program in the bss section. + + struct s_tag { + int s_int; + float s_float; + char s_char_vec[8]; + struct s_tag* s_next; + } g_an_s; + + typedef struct s_tag s_typedef; + + The structure tag has an `N_LSYM' stab type because, like the +enumeration, the symbol has file scope. Like the enumeration, the +symbol descriptor is `T', for an enumeration, structure, or union tag. The +type descriptor `s' following the `16=' of the type definition narrows +the symbol type to structure. + + Following the `s' type descriptor is the number of bytes the +structure occupies, followed by a description of each structure element. +The structure element descriptions are of the form NAME:TYPE, BIT +OFFSET FROM THE START OF THE STRUCT, NUMBER OF BITS IN THE ELEMENT. + + # 128 is N_LSYM + .stabs "s_tag:T16=s20s_int:1,0,32;s_float:12,32,32; + s_char_vec:17=ar1;0;7;2,64,64;s_next:18=*16,128,32;;",128,0,0,0 + + In this example, the first two structure elements are previously +defined types. For these, the type following the `NAME:' part of the +element description is a simple type reference. The other two structure +elements are new types. In this case there is a type definition +embedded after the `NAME:'. The type definition for the array element +looks just like a type definition for a standalone array. The `s_next' +field is a pointer to the same kind of structure that the field is an +element of. So the definition of structure type 16 contains a type +definition for an element which is a pointer to type 16. 
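The element layout just described can be checked against the `s_tag' example with a short Python sketch. The names `parse_struct`, `HEADER_RE`, and `FIELD_RE` are hypothetical, and the approach is deliberately naive: it assumes every element has the plain NAME:TYPE,BIT-OFFSET,BIT-SIZE; form, so it would not cope with static members, C++ methods, or element types that themselves contain commas. The stab string is written on one line here; the two-line layout in the example above is only for display.

```python
import re

# "NAME:T" + type number + "=s" + size of the structure in bytes.
HEADER_RE = re.compile(r"(\w+):T(\d+)=s(\d+)")
# NAME:TYPE,BIT-OFFSET,BIT-SIZE; where TYPE is matched lazily so that an
# embedded definition such as "17=ar1;0;7;2" stays in one piece.
FIELD_RE = re.compile(r"(\w+):(.+?),(\d+),(\d+);")


def parse_struct(stab):
    """Return (tag, type_number, size_in_bytes, fields) for a struct stab."""
    header = HEADER_RE.match(stab)
    if header is None:
        raise ValueError(f"not a structure tag stab: {stab!r}")
    tag, type_number, size = header.groups()
    fields = [
        (name, type_info, int(bit_offset), int(bit_size))
        for name, type_info, bit_offset, bit_size
        in FIELD_RE.findall(stab[header.end():])
    ]
    return tag, int(type_number), int(size), fields


S_TAG = ("s_tag:T16=s20s_int:1,0,32;s_float:12,32,32;"
         "s_char_vec:17=ar1;0;7;2,64,64;s_next:18=*16,128,32;;")
```

Calling `parse_struct(S_TAG)` gives four elements, and the last one ends at bit 160, which is 20 bytes and so agrees with the `s20' size at the start of the definition.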
+ + If a field is a static member (this is a C++ feature in which a +single variable appears to be a field of every structure of a given +type) it still starts out with the field name, a colon, and the type, +but then instead of a comma, bit position, comma, and bit size, there +is a colon followed by the name of the variable which each such field +refers to. + + If the structure has methods (a C++ feature), they follow the +non-method fields; see *Note Cplusplus::. + + +File: stabs.info, Node: Typedefs, Next: Unions, Prev: Structures, Up: Types + +Giving a Type a Name +==================== + + To give a type a name, use the `t' symbol descriptor. The type is +specified by the type information (*note String Field::.) for the stab. +For example, + + .stabs "s_typedef:t16",128,0,0,0 # 128 is N_LSYM + + specifies that `s_typedef' refers to type number 16. Such stabs +have symbol type `N_LSYM' (or `C_DECL' for XCOFF). (The Sun +documentation mentions using `N_GSYM' in some cases). + + If you are specifying the tag name for a structure, union, or +enumeration, use the `T' symbol descriptor instead. I believe C is the +only language with this feature. + + If the type is an opaque type (I believe this is a Modula-2 feature), +AIX provides a type descriptor to specify it. The type descriptor is +`o' and is followed by a name. I don't know what the name means--is it +always the same as the name of the type, or is this type descriptor +used with a nameless stab (*note String Field::.)? There optionally +follows a comma followed by type information which defines the type of +this type. If omitted, a semicolon is used in place of the comma and +the type information, and the type is much like a generic pointer +type--it has a known size but little else about it is specified. 
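To see how the `t' and `T' symbol descriptors sit in the string field, the following Python sketch splits a string field into its name, symbol descriptor, and remaining type information. The function name `split_string_field` is hypothetical. The rule that a digit, `(', or `-' after the colon means no symbol descriptor is present is an assumption taken from how stack variables such as `an_u:23' are written, not something this section states. Names containing `::' (C++ templates) are not handled.

```python
def split_string_field(text):
    """Split "NAME:DESCRIPTOR TYPE-INFO" into (name, descriptor, type_info).

    The descriptor is returned as an empty string when the text after the
    colon starts directly with type information (assumed rule, see above).
    """
    name, _, rest = text.partition(":")
    first = rest[:1]
    if first.isdigit() or first in "(-":
        return name, "", rest
    return name, first, rest[1:]
```

With this, `split_string_field("s_typedef:t16")` separates the typedef name from the `t' descriptor and the reference to type 16, while a tag stab such as `s_tag:T16=s20...' yields `T' as its descriptor.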
+ + +File: stabs.info, Node: Unions, Next: Function Types, Prev: Typedefs, Up: Types + +Unions +====== + + union u_tag { + int u_int; + float u_float; + char* u_char; + } an_u; + + This code generates a stab for a union tag and a stab for a union +variable. Both use the `N_LSYM' stab type. If a union variable is +scoped locally to the procedure in which it is defined, its stab is +located immediately preceding the `N_LBRAC' for the procedure's block +start. + + The stab for the union tag, however, is located preceding the code +for the procedure in which it is defined. The stab type is `N_LSYM'. +This would seem to imply that the union type has file scope, like the +struct type `s_tag'. This is not true. The contents and position of +the stab for `u_tag' do not convey any information about its procedure +local scope. + + # 128 is N_LSYM + .stabs "u_tag:T23=u4u_int:1,0,32;u_float:12,0,32;u_char:21,0,32;;", + 128,0,0,0 + + The symbol descriptor `T', following the `name:', means that the stab +describes an enumeration, structure, or union tag. The type descriptor +`u', following the `23=' of the type definition, narrows it down to a +union type definition. Following the `u' is the number of bytes in the +union. After that is a list of union element descriptions. Their +format is NAME:TYPE, BIT OFFSET INTO THE UNION, NUMBER OF BITS FOR THE +ELEMENT;. + + The stab for the union variable is: + + .stabs "an_u:23",128,0,0,-20 # 128 is N_LSYM + + `-20' specifies where the variable is stored (*note Stack +Variables::.). + + +File: stabs.info, Node: Function Types, Prev: Unions, Up: Types + +Function Types +============== + + Various types can be defined for function variables. These types are +not used in defining functions (*note Procedures::.); they are used for +things like pointers to functions. + + The simple, traditional type is type descriptor `f', which is followed by +type information for the return type of the function, followed by a +semicolon.
+ + This does not deal with functions for which the number and types of +the parameters are part of the type, as in Modula-2 or ANSI C. AIX +provides extensions to specify these, using the `f', `F', `p', and `R' +type descriptors. + + First comes the type descriptor. If it is `f' or `F', this type +involves a function rather than a procedure, and the type information +for the return type of the function follows, followed by a comma. Then +comes the number of parameters to the function and a semicolon. Then, +for each parameter, there is the name of the parameter followed by a +colon (this is only present for type descriptors `R' and `F' which +represent Pascal function or procedure parameters), type information +for the parameter, a comma, 0 if passed by reference or 1 if passed by +value, and a semicolon. The type definition ends with a semicolon. + + For example, this variable definition: + + int (*g_pf)(); + +generates the following code: + + .stabs "g_pf:G24=*25=f1",32,0,0,0 + .common _g_pf,4,"bss" + + The variable defines a new type, 24, which is a pointer to another +new type, 25, which is a function returning `int'. + + +File: stabs.info, Node: Symbol Tables, Next: Cplusplus, Prev: Types, Up: Top + +Symbol Information in Symbol Tables +*********************************** + + This chapter describes the format of symbol table entries and how +stab assembler directives map to them. It also describes the +transformations that the assembler and linker make on data from stabs. + +* Menu: + +* Symbol Table Format:: +* Transformations On Symbol Tables:: + + +File: stabs.info, Node: Symbol Table Format, Next: Transformations On Symbol Tables, Up: Symbol Tables + +Symbol Table Format +=================== + + Each time the assembler encounters a stab directive, it puts each +field of the stab into a corresponding field in a symbol table entry of +its output file. 
If the stab contains a string field, the symbol table +entry for that stab points to a string table entry containing the +string data from the stab. Assembler labels become relocatable +addresses. Symbol table entries in a.out have the format: + + struct internal_nlist { + unsigned long n_strx; /* index into string table of name */ + unsigned char n_type; /* type of symbol */ + unsigned char n_other; /* misc info (usually empty) */ + unsigned short n_desc; /* description field */ + bfd_vma n_value; /* value of symbol */ + }; + + If the stab has a string, the `n_strx' field holds the offset in +bytes of the string within the string table. The string is terminated +by a NUL character. If the stab lacks a string (for example, it was +produced by a `.stabn' or `.stabd' directive), the `n_strx' field is +zero. + + Symbol table entries with `n_type' field values greater than 0x1f +originated as stabs generated by the compiler (with one random +exception). The other entries were placed in the symbol table of the +executable by the assembler or the linker. + + +File: stabs.info, Node: Transformations On Symbol Tables, Prev: Symbol Table Format, Up: Symbol Tables + +Transformations on Symbol Tables +================================ + + The linker concatenates object files and does fixups of externally +defined symbols. + + You can see the transformations made on stab data by the assembler +and linker by examining the symbol table after each pass of the build. +To do this, use `nm -ap', which dumps the symbol table, including +debugging information, unsorted. For stab entries the columns are: +VALUE, OTHER, DESC, TYPE, STRING. For assembler and linker symbols, +the columns are: VALUE, TYPE, STRING. + + The low 5 bits of the stab type tell the linker how to relocate the +value of the stab. Thus for stab types like `N_RSYM' and `N_LSYM', +where the value is an offset or a register number, the low 5 bits are +`N_ABS', which tells the linker not to relocate the value. 
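The two rules about `n_type' given in this chapter (values greater than 0x1f mark entries that originated as stabs, and the low 5 bits are what the linker looks at when relocating the value) can be sketched in C as follows. This is only an illustration of the arithmetic; the function names are hypothetical, and the "one random exception" mentioned above is not handled.

```c
/* Nonzero if an a.out n_type value lies in the range this chapter
   describes as stab entries, that is, greater than 0x1f.
   Hypothetical helper; it ignores the exception noted in the text. */
static int is_stab_type(unsigned char n_type)
{
    return n_type > 0x1f;
}

/* The low 5 bits of n_type, which the text says tell the linker how
   to relocate the value of the stab. */
static unsigned low_five_bits(unsigned char n_type)
{
    return n_type & 0x1fu;
}
```

For instance, stab type 38 (0x26, `N_STSYM') is a stab by the first rule, and its low 5 bits are 6, the a.out value for `N_DATA'. An ordinary linker symbol with `n_type' 7 is not a stab.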
+ + Where the value of a stab contains an assembly language label, it is +transformed by each build step. The assembler turns it into a +relocatable address and the linker turns it into an absolute address. + +* Menu: + +* Transformations On Static Variables:: +* Transformations On Global Variables:: +* Stab Section Transformations:: For some object file formats, + things are a bit different. + + +File: stabs.info, Node: Transformations On Static Variables, Next: Transformations On Global Variables, Up: Transformations On Symbol Tables + +Transformations on Static Variables +----------------------------------- + + This source line defines a static variable at file scope: + + static int s_g_repeat; + +The following stab describes the symbol: + + .stabs "s_g_repeat:S1",38,0,0,_s_g_repeat + +The assembler transforms the stab into this symbol table entry in the +`.o' file. The location is expressed as a data segment offset. + + 00000084 - 00 0000 STSYM s_g_repeat:S1 + +In the symbol table entry from the executable, the linker has made the +relocatable address absolute. + + 0000e00c - 00 0000 STSYM s_g_repeat:S1 + + +File: stabs.info, Node: Transformations On Global Variables, Next: Stab Section Transformations, Prev: Transformations On Static Variables, Up: Transformations On Symbol Tables + +Transformations on Global Variables +----------------------------------- + + Stabs for global variables do not contain location information. In +this case, the debugger finds location information in the assembler or +linker symbol table entry describing the variable. The source line: + + char g_foo = 'c'; + +generates the stab: + + .stabs "g_foo:G2",32,0,0,0 + + The variable is represented by two symbol table entries in the object +file (see below). The first one originated as a stab. The second one +is an external symbol. The upper case `D' signifies that the `n_type' +field of the symbol table contains 7, `N_DATA' with external linkage.
The +stab's value is zero since the value is not used for `N_GSYM' stabs. +The value of the linker symbol is the relocatable address corresponding +to the variable. + + 00000000 - 00 0000 GSYM g_foo:G2 + 00000080 D _g_foo + +Here are the same entries as transformed by the linker. The linker symbol table +entry now holds an absolute address: + + 00000000 - 00 0000 GSYM g_foo:G2 + ... + 0000e008 D _g_foo + + +File: stabs.info, Node: Stab Section Transformations, Prev: Transformations On Global Variables, Up: Transformations On Symbol Tables + +Transformations of Stabs in separate sections +--------------------------------------------- + + For object file formats using stabs in separate sections (*note Stab +Sections::.), use `objdump --stabs' instead of `nm' to show the stabs +in an object or executable file. `objdump' is a GNU utility; Sun does +not provide any equivalent. + + The following example is for a stab whose value is an address +relative to the compilation unit (*note ELF Linker Relocation::.). For +example, if the source line + + static int ld = 5; + + appears within a function, then the assembly language output from the +compiler contains: + + .Ddata.data: + ... + .stabs "ld:V(0,3)",0x26,0,4,.L18-Ddata.data # 0x26 is N_STSYM + ... + .L18: + .align 4 + .word 0x5 + + Because the value is formed by subtracting one symbol from another, +the value is absolute, not relocatable, and so the object file contains + + Symnum n_type n_othr n_desc n_value n_strx String + 31 STSYM 0 4 00000004 680 ld:V(0,3) + + without any relocations, and the executable file also contains + + Symnum n_type n_othr n_desc n_value n_strx String + 31 STSYM 0 4 00000004 680 ld:V(0,3) + + +File: stabs.info, Node: Cplusplus, Next: Stab Types, Prev: Symbol Tables, Up: Top + +GNU C++ Stabs +************* + +* Menu: + +* Class Names:: C++ class names are both tags and typedefs. +* Nested Symbols:: C++ symbol names can be within other types.
+* Basic Cplusplus Types:: +* Simple Classes:: +* Class Instance:: +* Methods:: Method definition +* Method Type Descriptor:: The `#' type descriptor +* Member Type Descriptor:: The `@' type descriptor +* Protections:: +* Method Modifiers:: +* Virtual Methods:: +* Inheritence:: +* Virtual Base Classes:: +* Static Members:: + + +File: stabs.info, Node: Class Names, Next: Nested Symbols, Up: Cplusplus + +C++ Class Names +=============== + + In C++, a class name which is declared with `class', `struct', or +`union', is not only a tag, as in C, but also a type name. Thus there +should be stabs with both `t' and `T' symbol descriptors (*note +Typedefs::.). + + To save space, there is a special abbreviation for this case. If the +`T' symbol descriptor is followed by `t', then the stab defines both a +type name and a tag. + + For example, the C++ code + + struct foo {int x;}; + + can be represented as either + + .stabs "foo:T19=s4x:1,0,32;;",128,0,0,0 # 128 is N_LSYM + .stabs "foo:t19",128,0,0,0 + + or + + .stabs "foo:Tt19=s4x:1,0,32;;",128,0,0,0 + + +File: stabs.info, Node: Nested Symbols, Next: Basic Cplusplus Types, Prev: Class Names, Up: Cplusplus + +Defining a Symbol Within Another Type +===================================== + + In C++, a symbol (such as a type name) can be defined within another +type. + + In stabs, this is sometimes represented by giving the symbol a name +which contains `::'. Such a pair of colons does not end the name +of the symbol, the way a single colon would (*note String Field::.). +I'm not sure how consistently used or well thought out this mechanism +is. So that a pair of colons in this position always has this meaning, +`:' cannot be used as a symbol descriptor. + + For example, if the string for a stab is `foo::bar::baz:t5=*6', then +`foo::bar::baz' is the name of the symbol, `t' is the symbol +descriptor, and `5=*6' is the type information.
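The rule just described, that `::' belongs to the name while a single colon ends it, can be sketched in C as below. The function name `stab_name_length' is hypothetical, and the sketch assumes the string begins with the symbol name; it does not attempt to validate the symbol descriptor or the type information that follow.

```c
#include <stddef.h>

/* Return the length of the symbol name at the start of a stab string.
   A pair of colons is treated as part of the name; the first single
   colon ends it.  Returns (size_t)-1 if no terminating colon is found.
   Hypothetical helper, written only to illustrate the rule. */
static size_t stab_name_length(const char *s)
{
    size_t i = 0;

    while (s[i] != '\0') {
        if (s[i] == ':') {
            if (s[i + 1] == ':') {
                i += 2;          /* "::" is part of the name */
                continue;
            }
            return i;            /* single ':' ends the name */
        }
        i++;
    }
    return (size_t)-1;
}
```

For the string `foo::bar::baz:t5=*6' the function stops at the colon before `t', so the symbol descriptor is the character just past the returned length.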
+ + +File: stabs.info, Node: Basic Cplusplus Types, Next: Simple Classes, Prev: Nested Symbols, Up: Cplusplus + +Basic Types For C++ +=================== + + << the examples that follow are based on a01.C >> + + C++ adds two more builtin types to the set defined for C. These are +the unknown type and the vtable record type. The unknown type, type +16, is defined in terms of itself like the void type. + + The vtable record type, type 17, is defined as a structure type and +then as a structure tag. The structure has four fields: delta, index, +pfn, and delta2. pfn is the function pointer. + + << In boilerplate $vtbl_ptr_type, what are the fields delta, index, +and delta2 used for? >> + + This basic type is present in all C++ programs even if there are no +virtual methods defined. + + .stabs "struct_name:sym_desc(type)type_def(17)=type_desc(struct)struct_bytes(8) + elem_name(delta):type_ref(short int),bit_offset(0),field_bits(16); + elem_name(index):type_ref(short int),bit_offset(16),field_bits(16); + elem_name(pfn):type_def(18)=type_desc(ptr to)type_ref(void), + bit_offset(32),field_bits(32); + elem_name(delta2):type_ref(short int),bit_offset(32),field_bits(16);;" + N_LSYM, NIL, NIL + + .stabs "$vtbl_ptr_type:t17=s8 + delta:6,0,16;index:6,16,16;pfn:18=*15,32,32;delta2:6,32,16;;" + ,128,0,0,0 + + .stabs "name:sym_desc(struct tag)type_ref($vtbl_ptr_type)",N_LSYM,NIL,NIL,NIL + + .stabs "$vtbl_ptr_type:T17",128,0,0,0 + + +File: stabs.info, Node: Simple Classes, Next: Class Instance, Prev: Basic Cplusplus Types, Up: Cplusplus + +Simple Class Definition +======================= + + The stabs describing C++ language features are an extension of the +stabs describing C. Stabs representing C++ class types elaborate +extensively on the stab format used to describe structure types in C. +Stabs representing class type variables look just like stabs +representing C language variables. + + Consider the following very simple class definition.
+ + class baseA { + public: + int Adat; + int Ameth(int in, char other); + }; + + The class `baseA' is represented by two stabs. The first stab +describes the class as a structure type. The second stab describes a +structure tag of the class type. Both stabs are of stab type `N_LSYM'. +Since the stabs are not located between an `N_FUN' and an `N_LBRAC' stab, +the class is defined at file scope. If they were, +then the `N_LSYM' would signify a local variable. + + A stab describing a C++ class type is similar in format to a stab +describing a C struct, with each class member shown as a field in the +structure. The part of the struct format describing fields is expanded +to include extra information relevant to C++ class members. In +addition, if the class has multiple base classes or virtual functions +the struct format outside of the field parts is also augmented. + + In this simple example the field part of the C++ class stab +representing member data looks just like the field part of a C struct +stab. The section on protections describes how its format is sometimes +extended for member data. + + The field part of a C++ class stab representing a member function +differs substantially from the field part of a C struct stab. It still +begins with `name:' but then goes on to define a new type number for +the member function, describe its return type, its argument types, its +protection level, any qualifiers applied to the method definition, and +whether the method is virtual or not. If the method is virtual then +the method description goes on to give the vtable index of the method, +and the type number of the first base class defining the method. + + When the field name is a method name it is followed by two colons +rather than one. This is followed by a new type definition for the +method. This is a number followed by an equal sign and the type of the +method.
Normally this will be a type declared using the `#' type +descriptor; see *Note Method Type Descriptor::; static member functions +are declared using the `f' type descriptor instead; see *Note Function +Types::. + + The format of an overloaded operator method name differs from that of +other methods. It is `op$::OPERATOR-NAME.' where OPERATOR-NAME is the +operator name such as `+' or `+='. The name ends with a period, and +any characters except the period can occur in the OPERATOR-NAME string. + + The next part of the method description represents the arguments to +the method, preceded by a colon and ending with a semicolon. The +types of the arguments are expressed in the same way argument types are +expressed in C++ name mangling. In this example an `int' and a `char' +map to `ic'. + + This is followed by a number, a letter, and an asterisk or period, +followed by another semicolon. The number indicates the protections +that apply to the member function. Here the 2 means public. The +letter encodes any qualifier applied to the method definition. In this +case, `A' means that it is a normal function definition. The dot shows +that the method is not virtual. The sections that follow elaborate +further on these fields and describe the additional information present +for virtual methods. + + .stabs "class_name:sym_desc(type)type_def(20)=type_desc(struct)struct_bytes(4) + field_name(Adat):type(int),bit_offset(0),field_bits(32); + + method_name(Ameth)::type_def(21)=type_desc(method)return_type(int); + :arg_types(int char); + protection(public)qualifier(normal)virtual(no);;" + N_LSYM,NIL,NIL,NIL + + .stabs "baseA:t20=s4Adat:1,0,32;Ameth::21=##1;:ic;2A.;;",128,0,0,0 + + .stabs "class_name:sym_desc(struct tag)",N_LSYM,NIL,NIL,NIL + + .stabs "baseA:T20",128,0,0,0 +
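The three-character group that follows the argument types in a method description (`2A.' in the `baseA' stab) can be picked apart with a C sketch such as the one below. The names `method_flags' and `parse_method_flags' are hypothetical. The sketch keeps the protection digit and qualifier letter exactly as written, since only the values 2 (public) and `A' (normal) are explained in this section, and it assumes that an asterisk marks a virtual method because the text says a period marks a non-virtual one.

```c
#include <ctype.h>

/* Decoded form of the "number, letter, asterisk or period" group in a
   method description (hypothetical helper type). */
struct method_flags {
    int protection;  /* digit as written in the stab, e.g. 2 for public */
    char qualifier;  /* letter as written in the stab, e.g. 'A' */
    int is_virtual;  /* 0 for '.'; 1 for '*' (assumed to mean virtual) */
};

/* Parse a group such as "2A." at the start of S.  Returns 1 on success
   and 0 if the text does not have the expected three-character shape. */
static int parse_method_flags(const char *s, struct method_flags *out)
{
    if (!isdigit((unsigned char)s[0]) || !isalpha((unsigned char)s[1]))
        return 0;
    if (s[2] != '*' && s[2] != '.')
        return 0;

    out->protection = s[0] - '0';
    out->qualifier = s[1];
    out->is_virtual = (s[2] == '*');
    return 1;
}
```

Applied to the text after `:ic;' in the `baseA' stab, the function reports protection 2, qualifier `A', and a non-virtual method. The extra fields that follow for virtual methods are outside the scope of this sketch.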