Diffstat (limited to 'gcc/doc/objc.texi')
-rw-r--r-- gcc/doc/objc.texi 182
1 files changed, 170 insertions, 12 deletions
diff --git a/gcc/doc/objc.texi b/gcc/doc/objc.texi
index 1beb748..bf32879 100644
--- a/gcc/doc/objc.texi
+++ b/gcc/doc/objc.texi
@@ -170,9 +170,13 @@ above apply to classes defined in bundle.
@node Type encoding
@section Type encoding
-The Objective-C compiler generates type encodings for all the
-types. These type encodings are used at runtime to find out information
-about selectors and methods and about objects and classes.
+This is an advanced section. Type encodings are used extensively by
+the compiler and by the runtime, but you generally do not need to know
+about them to use Objective-C.
+
+The Objective-C compiler generates type encodings for all the types.
+These type encodings are used at runtime to find out information about
+selectors and methods and about objects and classes.
The types are encoded in the following way:
@@ -205,6 +209,8 @@ The types are encoded in the following way:
@tab @code{f}
@item @code{double}
@tab @code{d}
+@item @code{long double}
+@tab @code{D}
@item @code{void}
@tab @code{v}
@item @code{id}
@@ -215,6 +221,9 @@ The types are encoded in the following way:
@tab @code{:}
@item @code{char*}
@tab @code{*}
+@item @code{enum}
+@tab an @code{enum} is encoded exactly as the integer type that the compiler uses for it, which depends on the enumeration
+values. Often the compiler uses @code{unsigned int}, which is then encoded as @code{I}.
@item unknown type
@tab @code{?}
@item Complex types
@@ -225,15 +234,16 @@ The types are encoded in the following way:
@c @sp 1
-The encoding of bit-fields has changed to allow bit-fields to be properly
-handled by the runtime functions that compute sizes and alignments of
-types that contain bit-fields. The previous encoding contained only the
-size of the bit-field. Using only this information it is not possible to
-reliably compute the size occupied by the bit-field. This is very
-important in the presence of the Boehm's garbage collector because the
-objects are allocated using the typed memory facility available in this
-collector. The typed memory allocation requires information about where
-the pointers are located inside the object.
+The encoding of bit-fields has changed to allow bit-fields to be
+properly handled by the runtime functions that compute sizes and
+alignments of types that contain bit-fields. The previous encoding
+contained only the size of the bit-field. Using only this information
+it is not possible to reliably compute the size occupied by the
+bit-field. This is very important in the presence of Boehm's
+garbage collector because the objects are allocated using the typed
+memory facility available in this collector. The typed memory
+allocation requires information about where the pointers are located
+inside the object.
The position in the bit-field is the position, counting in bits, of the
bit closest to the beginning of the structure.
@@ -251,6 +261,8 @@ The non-atomic types are encoded as follows:
@tab @samp{@{} followed by the name of the structure (or @samp{?} if the structure is unnamed), the @samp{=} sign, the type of the members and by @samp{@}}
@item unions
@tab @samp{(} followed by the name of the structure (or @samp{?} if the union is unnamed), the @samp{=} sign, the type of the members followed by @samp{)}
+@item vectors
+@tab @samp{![} followed by the vector_size (the number of bytes composing the vector) followed by a comma, followed by the alignment (in bytes) of the vector, followed by the type of the elements followed by @samp{]}
@end multitable
Here are some types and their encodings, as they are generated by the
@@ -277,6 +289,11 @@ struct @{
@}
@end smallexample
@tab @code{@{?=i[3f]b128i3b131i2c@}}
+@item
+@smallexample
+int a __attribute__ ((vector_size (16)));
+@end smallexample
+@tab @code{![16,16i]} (alignment would depend on the machine)
@end multitable
@sp 1
@@ -300,6 +317,8 @@ Objective-C type specifiers:
@tab @code{o}
@item @code{bycopy}
@tab @code{O}
+@item @code{byref}
+@tab @code{R}
@item @code{oneway}
@tab @code{V}
@end multitable
@@ -310,6 +329,145 @@ The type specifiers are encoded just before the type. Unlike types
however, the type specifiers are only encoded when they appear in method
argument types.
+Note how @code{const} interacts with pointers:
+
+@sp 1
+
+@multitable @columnfractions .25 .75
+@item Objective-C type
+@tab Compiler encoding
+@item
+@smallexample
+const int
+@end smallexample
+@tab @code{ri}
+@item
+@smallexample
+const int*
+@end smallexample
+@tab @code{^ri}
+@item
+@smallexample
+int *const
+@end smallexample
+@tab @code{r^i}
+@end multitable
+
+@sp 1
+
+@code{const int*} is a pointer to a @code{const int}, and so is
+encoded as @code{^ri}. @code{int* const}, instead, is a @code{const}
+pointer to an @code{int}, and so is encoded as @code{r^i}.
+
+Finally, there is a complication when encoding @code{const char *}
+versus @code{char * const}. Because @code{char *} is encoded as
+@code{*} and not as @code{^c}, there is no way to express whether
+@code{r} applies to the pointer or to the pointee.
+
+Hence, by convention, @code{r*} means @code{const char *} (since that
+is what is most often meant), and there is no way to encode
+@code{char *const}. @code{char *const} is simply encoded as
+@code{*}, and the @code{const} is lost.
+
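The interaction between @code{const} and pointers described above can be sketched in C. This is only an illustration of the rules in the table, not the compiler's encoder: @code{struct simple_type} and @code{encode_simple} are hypothetical names, only a single level of pointer is modelled, and the result for @code{const char *const} (here @code{r*}) is an assumption that follows from the stated convention rather than something the text spells out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified description of a type (illustration only). */
struct simple_type {
    char base;            /* encoding of the base type: 'i' = int, 'c' = char */
    int base_is_const;    /* const applies to the base type (the pointee)     */
    int is_pointer;       /* non-zero for a single level of pointer           */
    int pointer_is_const; /* const applies to the pointer itself              */
};

/* Write the encoding of *t into out (at most cap bytes, NUL-terminated). */
static void encode_simple(const struct simple_type *t, char *out, size_t cap)
{
    if (t->is_pointer && t->base == 'c') {
        /* char * is encoded as "*", so only the pointee's const can be
           recorded; a const on the pointer itself is lost. */
        snprintf(out, cap, "%s*", t->base_is_const ? "r" : "");
        return;
    }
    if (t->is_pointer) {
        /* "r" before "^" qualifies the pointer, "r" after it the pointee. */
        snprintf(out, cap, "%s^%s%c",
                 t->pointer_is_const ? "r" : "",
                 t->base_is_const ? "r" : "",
                 t->base);
        return;
    }
    /* Plain (non-pointer) type. */
    snprintf(out, cap, "%s%c", t->base_is_const ? "r" : "", t->base);
}
```

With these definitions, describing @code{int *const} as a non-const @code{int} base behind a @code{const} pointer yields @code{r^i}, while @code{char *const} yields plain @code{*}.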
+@menu
+* Legacy type encoding::
+* @@encode::
+* Method signatures::
+@end menu
+
+@node Legacy type encoding
+@subsection Legacy type encoding
+
+Historically, GCC had a number of bugs in its encoding code. The
+NeXT runtime expects GCC to emit type encodings in this historical
+format (compatible with GCC-3.3), so when using the NeXT runtime,
+GCC deliberately emits a number of incorrect
+encodings:
+
+@itemize @bullet
+
+@item
+the read-only qualifier of the pointee gets emitted before the @code{^}.
+The read-only qualifier of the pointer itself gets ignored, unless it
+is a typedef. Also, the @code{r} is only emitted for the outermost type.
+
+@item
+32-bit longs are encoded as @code{l} or @code{L}, but not always. For typedefs,
+the compiler uses @code{i} or @code{I} instead if encoding a struct field or a
+pointer.
+
+@item
+@code{enum}s are always encoded as @code{i} (@code{int}) even if they are actually
+unsigned or long.
+
+@end itemize
+
+In addition to that, the NeXT runtime uses a different encoding for
+bitfields. It encodes them as @code{b} followed by the size, without
+a bit offset or the underlying field type.
+
+@node @@encode
+@subsection @@encode
+
+GNU Objective-C supports the @code{@@encode} syntax that allows you to
+create a type encoding from a C/Objective-C type. For example,
+@code{@@encode(int)} is compiled into @code{"i"}.
+
+@code{@@encode} does not support type qualifiers other than
+@code{const}. For example, @code{@@encode(const char*)} is valid and
+is compiled into @code{"r*"}, while @code{@@encode(bycopy char *)} is
+invalid and will cause a compilation error.
+
+@node Method signatures
+@subsection Method signatures
+
+This section documents the encoding of method types, which is rarely
+needed to use Objective-C. You should skip it on a first reading; the
+runtime provides functions that will work on methods and can walk
+through the list of parameters and interpret them for you. These
+functions are part of the public ``API'' and are the preferred way to
+interact with method signatures from user code.
+
+But if you need to debug a problem with method signatures and need to
+know how they are implemented (i.e., the ``ABI''), read on.
+
+Methods have their ``signature'' encoded and made available to the
+runtime. The ``signature'' encodes all the information required to
+dynamically build invocations of the method at runtime: return type
+and arguments.
+
+The ``signature'' is a null-terminated string, composed of the following:
+
+@itemize @bullet
+
+@item
+The return type, including type qualifiers. For example, a method
+returning @code{int} would have @code{i} here.
+
+@item
+The total size (in bytes) required to pass all the parameters. This
+includes the two hidden parameters (the object @code{self} and the
+method selector @code{_cmd}).
+
+@item
+Each argument, with the type encoding, followed by the offset (in
+bytes) of the argument in the list of parameters.
+
+@end itemize
+
+For example, a method with no arguments and returning @code{int} would
+have the signature @code{i8@@0:4} if the size of a pointer is 4. The
+signature is interpreted as follows: the @code{i} is the return type
+(an @code{int}), the @code{8} is the total size of the parameters in
+bytes (two pointers each of size 4), the @code{@@0} is the first
+parameter (an object at byte offset @code{0}) and @code{:4} is the
+second parameter (a @code{SEL} at byte offset @code{4}).
+
+You can easily find more examples by running the ``strings'' program
+on an Objective-C object file compiled by GCC. You'll see a lot of
+strings that look very much like @code{i8@@0:4}. They are signatures
+of Objective-C methods.
+
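To make the layout of a signature such as @code{i8@@0:4} concrete, here is a minimal C sketch that splits one into its return type, total parameter size, and per-argument type and offset. The names @code{signature_info}, @code{arg_info} and @code{parse_signature} are hypothetical and are not part of the runtime API. The sketch assumes every type encoding is a short run of non-digit characters, so it does not handle array, bit-field, or vector encodings, which contain digits themselves; real code should use the runtime functions mentioned above.

```c
#include <ctype.h>
#include <stdlib.h>

#define MAX_ARGS 8

/* Hypothetical record for one argument of a method signature. */
struct arg_info {
    char type[16];  /* type encoding, e.g. "@" or ":" */
    long offset;    /* byte offset in the parameter list */
};

/* Hypothetical record for a whole parsed signature. */
struct signature_info {
    char return_type[16];
    long frame_size;               /* total size of all parameters in bytes */
    struct arg_info args[MAX_ARGS];
    int nargs;
};

/* Copy characters up to the next digit (or end of string) into dst.
   Assumption: simple encodings only, which never contain digits. */
static const char *copy_type(const char *p, char *dst, size_t cap)
{
    size_t n = 0;
    while (*p && !isdigit((unsigned char)*p) && n + 1 < cap)
        dst[n++] = *p++;
    dst[n] = '\0';
    return p;
}

/* Parse sig into *out. Returns 0 on success, -1 on malformed input. */
static int parse_signature(const char *sig, struct signature_info *out)
{
    char *end;
    const char *p = copy_type(sig, out->return_type, sizeof out->return_type);

    /* The return type must be followed by the total parameter size. */
    if (out->return_type[0] == '\0' || !isdigit((unsigned char)*p))
        return -1;
    out->frame_size = strtol(p, &end, 10);
    p = end;

    /* Then come (type encoding, offset) pairs until the end of the string. */
    out->nargs = 0;
    while (*p) {
        struct arg_info *a;
        if (out->nargs == MAX_ARGS)
            return -1;
        a = &out->args[out->nargs];
        p = copy_type(p, a->type, sizeof a->type);
        if (a->type[0] == '\0' || !isdigit((unsigned char)*p))
            return -1;
        a->offset = strtol(p, &end, 10);
        p = end;
        out->nargs++;
    }
    return 0;
}
```

Calling @code{parse_signature} on the string @code{i8@@0:4} fills in a return type of @code{i}, a total size of 8, and two arguments: @code{@@} at offset 0 and @code{:} at offset 4, matching the walkthrough above.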
@node Garbage Collection
@section Garbage Collection