From ee9dd3721be68b9fa63dea9aa5a1d86e66958cde Mon Sep 17 00:00:00 2001 From: Tom Tromey Date: Wed, 7 Apr 1999 14:42:40 +0000 Subject: Initial revision From-SVN: r26263 --- libjava/doc/cni.sgml | 972 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 972 insertions(+) create mode 100644 libjava/doc/cni.sgml (limited to 'libjava/doc') diff --git a/libjava/doc/cni.sgml b/libjava/doc/cni.sgml new file mode 100644 index 0000000..0255431 --- /dev/null +++ b/libjava/doc/cni.sgml @@ -0,0 +1,972 @@ + +
+ +The Cygnus Native Interface for C++/Java Integration +Writing native Java methods in natural C++ + +Cygnus Solutions + +February, 1999 + + + +This documents CNI, the Cygnus Native Interface, +which is a convenient way to write Java native methods using C++. +This is a more efficient, more convenient, but less portable +alternative to the standard JNI (Java Native Interface). + + +Basic Concepts + +In terms of language features, Java is mostly a subset +of C++. Java has a few important extensions, plus a powerful standard +class library, but on the whole that does not change the basic similarity. +Java is a hybrid object-oriented language, with a few native types, +in addition to class types. It is class-based, where a class may have +static as well as per-object fields, and static as well as instance methods. +Non-static methods may be virtual, and may be overloaded. Overloading is +resolved at compile time by matching the actual argument types against +the parameter types. Virtual methods are implemented using indirect calls +through a dispatch table (virtual function table). Objects are +allocated on the heap, and initialized using a constructor method. +Classes are organized in a package hierarchy. + + +All of the listed attributes are also true of C++, though C++ has +extra features (for example in C++ objects may be allocated not just +on the heap, but also statically or in a local stack frame). Because +gcj uses the same compiler technology as +g++ (the GNU C++ compiler), it is possible +to make the intersection of the two languages use the same +ABI (object representation and calling conventions). +The key idea in CNI is that Java objects are C++ objects, +and all Java classes are C++ classes (but not the other way around). +So the most important task in integrating Java and C++ is to +remove gratuitous incompatibilities. + + +You write CNI code as a regular C++ source file. 
(You do have to use +a Java/CNI-aware C++ compiler, specifically a recent version of G++.) + +You start with: + +#include <cni.h> + + + +You then include header files for the various Java classes you need +to use: + +#include <java/lang/Character.h> +#include <java/util/Date.h> +#include <java/lang/IndexOutOfBoundsException.h> + + + +In general, CNI functions and macros start with the +`Jv' prefix, for example the function +`JvNewObjectArray'. This convention is used to +avoid conflicts with other libraries. +Internal functions in CNI start with the prefix +`_Jv_'. You should not call these; +if you find a need to, let us know and we will try to come up with an +alternate solution. (This manual lists _Jv_AllocBytes +as an example; CNI should instead provide +a JvAllocBytes function.) + +The header files for Java classes are automatically generated by gcjh. + + + +Packages + +The only global names in Java are class names, and packages. +A package can contain zero or more classes, and +also zero or more sub-packages. +Every class belongs to either an unnamed package or a package that +has a hierarchical and globally unique name. + + +A Java package is mapped to a C++ namespace. +The Java class java.lang.String +is in the package java.lang, which is a sub-package +of java. The C++ equivalent is the +class java::lang::String, +which is in the namespace java::lang, +which is in the namespace java. + + +Here is how you could express this: + +// Declare the class(es), possibly in a header file: +namespace java { + namespace lang { + class Object; + class String; + ... + } +} + +class java::lang::String : public java::lang::Object +{ + ... +}; + + + +The gcjh tool automatically generates the +necessary namespace declarations. + +Nested classes as a substitute for namespaces + + +It is not that long since g++ got complete namespace support, +and it was very recent (end of February 1999) that libgcj +was changed to use namespaces. 
Releases before then used +nested classes, which are the C++ equivalent of Java inner classes. +They provide similar (though less convenient) functionality. +The old syntax is: + +class java { + class lang { + class Object; + class String; + }; +}; + +The obvious difference is the use of class instead +of namespace. The more important difference is +that all the members of a nested class have to be declared inside +the parent class definition, while namespaces can be defined in +multiple places in the source. This is more convenient, since it +corresponds more closely to how Java packages are defined. +The main difference is in the declarations; the syntax for +using a nested class is the same as with namespaces: + +class java::lang::String : public java::lang::Object +{ ... } + +Note that the generated code (including name mangling) +using nested classes is the same as that using namespaces. + + +Leaving out package names + + +Having to always type the fully-qualified class name is verbose. +It also makes it more difficult to change the package containing a class. +The Java package declaration specifies that the +following class declarations are in the named package, without having +to explicitly name the full package qualifiers. +The package declaration can be followed by zero or +more import declarations, which allows either +a single class or all the classes in a package to be named by a simple +identifier. C++ provides something similar +with the using declaration and directive. + + +A Java simple-type-import declaration: + +import PackageName.TypeName; + +allows using TypeName as a shorthand for +PackageName.TypeName. 
+The C++ (more-or-less) equivalent is a using-declaration: + +using PackageName::TypeName; + + + +A Java import-on-demand declaration: + +import PackageName.*; + +allows using TypeName as a shorthand for +PackageName.TypeName. +The C++ (more-or-less) equivalent is a using-directive: + +using namespace PackageName; + + + + + +Primitive types + +Java provides 8 primitive types: +byte, short, int, +long, float, double, +char, and boolean. +These are the same as the following C++ typedefs +(which are defined by cni.h): +jbyte, jshort, jint, +jlong, jfloat, +jdouble, +jchar, and jboolean. +You should use the C++ typenames +(e.g. jint), +and not the Java type names +(e.g. int), +even if they are the same. +This is because there is no guarantee that the C++ type +int is a 32-bit type, but jint +is guaranteed to be a 32-bit type. + + + + + +Java type +C/C++ typename +Description + + + +byte +jbyte +8-bit signed integer + + +short +jshort +16-bit signed integer + + +int +jint +32-bit signed integer + + +long +jlong +64-bit signed integer + + +float +jfloat +32-bit IEEE floating-point number + + +double +jdouble +64-bit IEEE floating-point number + + +char +jchar +16-bit Unicode character + + +boolean +jboolean +logical (Boolean) values + + +void +void +no value + + + + + + + +JvPrimClass +primtype + +This is a macro whose argument should be the name of a primitive +type, e.g. +byte. +The macro expands to a pointer to the Class object +corresponding to the primitive type. +E.g., +JvPrimClass(void) +has the same value as the Java expression +Void.TYPE (or void.class). + + + + +Objects and Classes +Classes + +All Java classes are derived from java.lang.Object. +C++ does not have a unique root class, but we use +a C++ java::lang::Object as the C++ version +of the java.lang.Object Java class. All +other Java classes are mapped into corresponding C++ classes +derived from java::lang::Object. 
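The single-root mapping described above can be sketched in plain C++. This is only an illustration under stated assumptions: the two classes are empty stand-ins declared here so the sketch builds without libgcj (with CNI they would come from gcjh-generated headers), and `upcast` and `root_class_demo` are hypothetical helper names.

```cpp
#include <cassert>

// Stand-in declarations so the sketch builds without libgcj.
// With CNI these come from the gcjh-generated headers instead.
namespace java {
  namespace lang {
    class Object {};
    class String : public Object {};
  }
}

// Hypothetical helper: takes any mapped class through the common root.
inline java::lang::Object *upcast(java::lang::Object *obj) { return obj; }

// Usage check. The stand-ins are ordinary C++ objects, so a local
// variable is fine here; real Java objects are allocated with new.
inline void root_class_demo() {
  java::lang::String str;
  // A String pointer converts implicitly to an Object pointer.
  assert(upcast(&str) == &str);
}
```

Because every mapped class derives from the one root, a function that takes a `java::lang::Object *` accepts a pointer to any of them without a cast.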
+ +Interface inheritance (the implements +keyword) is currently not reflected in the C++ mapping. + +Object references + +We implement a Java object reference as a pointer to the start +of the referenced object. It maps to a C++ pointer. +(We cannot use C++ references for Java references, since +once a C++ reference has been initialized, you cannot change it to +point to another object.) +The null Java reference maps to the NULL +C++ pointer. + + +Note that in some Java implementations an object reference is implemented as +a pointer to a two-word handle. One word of the handle +points to the fields of the object, while the other points +to a method table. Gcj does not use this extra indirection. + + +Object fields + +Each object contains an object header, followed by the instance +fields of the class, in order. The object header consists of +a single pointer to a dispatch or virtual function table. +(There may be extra fields in front of the object, +for example for +memory management, but this is invisible to the application, and +the reference to the object points to the dispatch table pointer.) + + +The fields are laid out in the same order, alignment, and size +as in C++. Specifically, 8-bit and 16-bit native types +(byte, short, char, +and boolean) are not +widened to 32 bits. +Note that the Java VM does extend 8-bit and 16-bit types to 32 bits +when on the VM stack or in temporary registers. + +If you include the gcjh-generated header for a +class, you can access fields of Java classes in the natural +way. Given the following Java class: + +public class Int +{ + public int i; + public Int (int i) { this.i = i; } + public static Int zero = new Int(0); +} + +you can write: + +#include <cni.h> +#include <Int.h> +Int* +mult (Int *p, jint k) +{ + if (k == 0) + return Int::zero; // static member access. 
+ return new Int(p->i * k); +} + + + +CNI does not strictly enforce the Java access +specifiers, because Java permissions cannot be directly mapped +into C++ permissions. Private Java fields and methods are mapped +to private C++ fields and methods, but other fields and methods +are mapped to public fields and methods. + + + + +Arrays + +While in many ways Java is similar to C and C++, +it is quite different in its treatment of arrays. +C arrays are based on the idea of pointer arithmetic, +which would be incompatible with Java's security requirements. +Java arrays are true objects (array types inherit from +java.lang.Object). An array-valued variable +is one that contains a reference (pointer) to an array object. + + +Referencing a Java array in C++ code is done using the +JArray template, which is defined as follows: + +class __JArray : public java::lang::Object +{ +public: + int length; +}; + +template<class T> +class JArray : public __JArray +{ + T data[0]; +public: + T& operator[](jint i) { return data[i]; } +}; + + + + template<class T> T *elements + JArray<T> &array + + This template function can be used to get a pointer to the + elements of the array. + For instance, you can fetch a pointer + to the integers that make up an int[] like so: + +extern jintArray foo; +jint *intp = elements (foo); + +The name of this function may change in the future. + +There are a number of typedefs which correspond to typedefs from JNI. 
+Each is the type of an array holding objects of the appropriate type: + +typedef __JArray *jarray; +typedef JArray<jobject> *jobjectArray; +typedef JArray<jboolean> *jbooleanArray; +typedef JArray<jbyte> *jbyteArray; +typedef JArray<jchar> *jcharArray; +typedef JArray<jshort> *jshortArray; +typedef JArray<jint> *jintArray; +typedef JArray<jlong> *jlongArray; +typedef JArray<jfloat> *jfloatArray; +typedef JArray<jdouble> *jdoubleArray; + + + + You can create an array of objects using this function: + + jobjectArray JvNewObjectArray + jint length + jclass klass + jobject init + + Here klass is the type of elements of the array; + init is the initial + value to be put into every slot in the array. + + +For each primitive type there is a function which can be used + to create a new array holding that type. The name of the function + is of the form + `JvNew<Type>Array', + where `<Type>' is the name of + the primitive type, with its initial letter in upper-case. For + instance, `JvNewBooleanArray' can be used to create + a new array of booleans. + Each such function follows this example: + + jbooleanArray JvNewBooleanArray + jint length + + + + + jsize JvGetArrayLength + jarray array + + Returns the length of array. + + +Methods + + +Java methods are mapped directly into C++ methods. +The header files generated by gcjh +include the appropriate method definitions. +Basically, the generated methods have the same names and +corresponding types as the Java methods, +and are called in the natural manner. + +Overloading + +Both Java and C++ provide method overloading, where multiple +methods in a class have the same name, and the correct one is chosen +(at compile time) depending on the argument types. +The rules for choosing the correct method are (as expected) more complicated +in C++ than in Java, but given a set of overloaded methods +generated by gcjh the C++ compiler will choose +the expected one. 
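The compile-time choice between overloads can be sketched without libgcj. This is a minimal illustration, assuming stand-in typedefs in place of the ones from cni.h; the class `Formatter`, its `describe` methods, and `overload_demo` are hypothetical names that only imitate the shape of a gcjh-generated declaration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-ins for the cni.h typedefs, so the sketch builds without libgcj.
typedef std::int32_t jint;
typedef std::int64_t jlong;
typedef double jdouble;

// Hypothetical class with three overloads of one method name.
class Formatter {
public:
  static std::string describe(jint)    { return "jint"; }
  static std::string describe(jlong)   { return "jlong"; }
  static std::string describe(jdouble) { return "jdouble"; }
};

// Usage check: each call is bound to one overload at compile time,
// based only on the static type of the argument.
inline void overload_demo() {
  assert(Formatter::describe(static_cast<jint>(1)) == "jint");
  assert(Formatter::describe(static_cast<jlong>(1)) == "jlong");
  assert(Formatter::describe(static_cast<jdouble>(1.5)) == "jdouble");
}
```

The casts keep the argument types exact. A bare literal may not match any of the typedefs exactly on a given platform, which is one more reason to use the `j`-prefixed type names in CNI code.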
+ +Common assemblers and linkers are not aware of C++ overloading, +so the standard implementation strategy is to encode the +parameter types of a method into its assembly-level name. +This encoding is called mangling, +and the encoded name is the mangled name. +The same mechanism is used to implement Java overloading. +For C++/Java interoperability, it is important that both the Java +and C++ compilers use the same encoding scheme. + + + +Static methods + +Static Java methods are invoked in CNI using the standard +C++ syntax, using the `::' operator rather +than the `.' operator. For example: + + +jint i = java::lang::Math::round((jfloat) 2.3); + + + +Defining a static native method uses standard C++ method +definition syntax. For example: + +#include <java/lang/Integer.h> +java::lang::Integer* +java::lang::Integer::getInteger(jstring str) +{ + ... +} + + + +Object Constructors + +Constructors are called implicitly as part of object allocation +using the new operator. For example: + +java::lang::Integer *x = new java::lang::Integer(234); + + + + +Java does not allow a constructor to be a native method. +Instead, you can define a private native method and +have the constructor call it. + + + +Instance methods + + +Virtual method dispatch is handled essentially the same way +in C++ and Java -- i.e. by doing an +indirect call through a function pointer stored in a per-class virtual +function table. C++ is more complicated because it has to support +multiple inheritance, but this does not affect Java classes. +However, G++ has historically used a different calling convention +that is not compatible with the one used by gcj. +During 1999, G++ will switch to a new ABI that is compatible with +gcj. Some platforms (including Linux) have already +changed. On other platforms, you will have to pass +the -fvtable-thunks flag to g++ when +compiling CNI code. + + +Calling a Java instance method in CNI is done +using the standard C++ syntax. 
For example: + + java::lang::Number *x; + if (x->doubleValue() > 0.0) ... + + + +Defining a Java native instance method is also done the natural way: + +#include <java/lang/Integer.h> +jdouble +java::lang::Integer::doubleValue() +{ + return (jdouble) value; +} + + + + +Interface method calls + +In Java you can call a method using an interface reference. +This is not yet supported in CNI. + + + +Object allocation + + +New Java objects are allocated using a +class-instance-creation-expression: + +new Type ( arguments ) + +The same syntax is used in C++. The main difference is that +C++ objects have to be explicitly deleted; in Java they are +automatically deleted by the garbage collector. +Using CNI, you can allocate a new object +using standard C++ syntax. The C++ compiler is smart enough to +realize the class is a Java class, and hence it needs to allocate +memory from the garbage collector. If you have overloaded +constructors, the compiler will choose the correct one +using standard C++ overload resolution rules. For example: + +java::util::Hashtable *ht = new java::util::Hashtable(120); + + + + + void *_Jv_AllocBytes + jsize size + + Allocate size bytes. This memory is not + scanned by the garbage collector. However, it will be freed by +the GC if no references to it are discovered. + + + +Interfaces + +A Java class can implement zero or more +interfaces, in addition to inheriting from +a single base class. +An interface is a collection of constants and method specifications; +it is similar to the signatures available +as a G++ extension. An interface provides a subset of the +functionality of C++ abstract virtual base classes, but they +are currently implemented differently. +CNI does not currently provide any support for interfaces, +or calling methods from an interface pointer. +This is partly because we are planning to re-do how +interfaces are implemented in gcj. 
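To make the comparison with abstract base classes concrete, here is a plain C++ sketch of what "a collection of constants and method specifications" looks like as an abstract class. It is an analogy only: none of it is CNI, gcj implements interfaces differently, and `Comparable`, `Version`, and `interface_demo` are hypothetical names chosen for the illustration.

```cpp
#include <cassert>

// Plain C++ analogue of a Java interface: a constant plus a pure virtual
// method, with no data members. Not generated by gcjh, not usable with CNI.
class Comparable {
public:
  static const int EQUAL = 0;
  virtual int compareTo(const Comparable *other) const = 0;
  virtual ~Comparable() {}
};

// A class "implementing" the interface by deriving from it.
class Version : public Comparable {
  int number;
public:
  explicit Version(int n) : number(n) {}
  int compareTo(const Comparable *other) const {
    const Version *v = dynamic_cast<const Version *>(other);
    assert(v != 0);  // this sketch only compares Version with Version
    return number - v->number;
  }
};

// Usage check: the call goes through a pointer to the abstract class.
inline void interface_demo() {
  Version a(3), b(3), c(5);
  const Comparable *ia = &a;
  assert(ia->compareTo(&b) == Comparable::EQUAL);
  assert(ia->compareTo(&c) < 0);
}
```

The call through `ia` is the C++ counterpart of a Java call through an interface reference, which is the operation CNI does not yet support.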
+ + + +Strings + +CNI provides a number of utility functions for +working with Java String objects. +The names and interfaces are analogous to those of JNI. + + + + + jstring JvNewString + const jchar *chars + jsize len + + Creates a new Java String object, where + chars are the contents, and + len is the number of characters. + + + + + jstring JvNewStringLatin1 + const char *bytes + jsize len + + Creates a new Java String object, where bytes + are the Latin-1 encoded + characters, and len is the length of + bytes, in bytes. + + + + + jstring JvNewStringLatin1 + const char *bytes + + Like the first JvNewStringLatin1, but computes len + using strlen. + + + + + jstring JvNewStringUTF + const char *bytes + + Creates a new Java String object, where bytes are + the UTF-8 encoded characters of the string, terminated by a null byte. + + + + + jchar *JvGetStringChars + jstring str + + Returns a pointer to the array of characters which make up a string. + + + + + int JvGetStringUTFLength + jstring str + + Returns the number of bytes required to encode the contents + of str as UTF-8. + + + + + jsize JvGetStringUTFRegion + jstring str + jsize start + jsize len + char *buf + + This puts the UTF-8 encoding of a region of the + string str into + the buffer buf. + The region of the string to fetch is specified by + start and len. + It is assumed that buf is big enough + to hold the result. Note + that buf is not null-terminated. + + + +Class Initialization + +Java requires that each class be automatically initialized at the time +of the first active use. Initializing a class involves +initializing the static fields, running code in class initializer +methods, and initializing base classes. There may also be +some implementation-specific actions, such as allocating +String objects corresponding to string literals in +the code. + +The Gcj compiler inserts calls to JvInitClass (actually +_Jv_InitClass) at appropriate places to ensure that a +class is initialized when required. 
The C++ compiler does not +insert these calls automatically - it is the programmer's +responsibility to make sure classes are initialized. However, +this is fairly painless because of the conventions assumed by the Java +system. + +First, libgcj will make sure a class is initialized +before an instance of that class is created. This is one +of the responsibilities of the new operation. This is +taken care of both in Java code, and in C++ code. (When the G++ +compiler sees a new of a Java class, it will call +a routine in libgcj to allocate the object, and that +routine will take care of initializing the class.) It follows that you can +access an instance field, or call an instance (non-static) +method and be safe in the knowledge that the class and all +of its base classes have been initialized. + +Invoking a static method is also safe. This is because the +Java compiler adds code to the start of a static method to make sure +the class is initialized. However, the C++ compiler does not +add this extra code. Hence, if you write a native static method +using CNI, you are responsible for calling JvInitClass +before doing anything else in the method (unless you are sure +it is safe to leave it out). + +Accessing a static field also requires the class of the +field to be initialized. The Java compiler will generate code +to call _Jv_InitClass before getting or setting the field. +However, the C++ compiler will not generate this extra code, +so it is your responsibility to make sure the class is +initialized before you access a static field. + +Exception Handling + +While C++ and Java share a common exception handling framework, +things are not quite as integrated as we would like, yet. +The main issue is the incompatible exception values, +and that the run-time type information facilities of the +two languages are not integrated. 
+ +Basically, this means that you cannot in C++ catch an exception +value (Throwable) thrown from Java code, nor +can you use throw on a Java exception value from C++ code, +and expect to be able to catch it in Java code. +We do intend to change this. + +You can throw a Java exception from C++ code by using +the JvThrow CNI function. + + void JvThrow + jobject obj + + Throws an exception obj, in a way compatible +with the Java exception-handling functions. + The class of obj must be a subclass of + Throwable. + + +Here is an example: + +if (i >= count) + JvThrow (new java::lang::IndexOutOfBoundsException()); + + + + +Synchronization + +Each Java object has an implicit monitor. +The Java VM uses the instruction monitorenter to acquire +and lock a monitor, and monitorexit to release it. +The JNI has corresponding methods MonitorEnter +and MonitorExit. The corresponding CNI macros +are JvMonitorEnter and JvMonitorExit. + + +The Java source language does not provide direct access to these primitives. +Instead, there is a synchronized statement that does an +implicit monitorenter before entry to the block, +and does a monitorexit on exit from the block. +Note that the lock has to be released even if the block is abnormally +terminated by an exception, which means there is an implicit +try-finally. + + +From C++, it makes sense to use a destructor to release a lock. +CNI defines the following utility class. + +class JvSynchronize { + jobject obj; +public: + JvSynchronize(jobject o) { obj = o; JvMonitorEnter(o); } + ~JvSynchronize() { JvMonitorExit(obj); } +}; + +The equivalent of Java's: + +synchronized (OBJ) { CODE; } + +can be simply expressed: + +{ JvSynchronize dummy(OBJ); CODE; } + + + +Java also has methods with the synchronized attribute. +This is equivalent to wrapping the entire method body in a +synchronized statement. +(Alternatively, an implementation could require the caller to do +the synchronization. 
This is not practical for a compiler, because +each virtual method call would have to test at run-time if +synchronization is needed.) Since in gcj +the synchronized attribute is handled by the +method implementation, it is up to the programmer +of a synchronized native method to handle the synchronization +(in the C++ implementation of the method). +In other words, you need to manually add JvSynchronize +in a native synchronized method. + + +Reflection +The types jfieldID and jmethodID +are as in JNI. + +The functions JvFromReflectedField, +JvFromReflectedMethod, +JvToReflectedField, and +JvToReflectedMethod (as in Java 2 JNI) +will be added shortly, as will other functions corresponding to JNI. + +Using gcjh + + The gcjh tool is used to generate C++ header files from + Java class files. By default, gcjh generates + a relatively straightforward C++ header file. However, there + are a few caveats to its use, and a few options which can be + used to change how it operates: + + + +--classpath path +--CLASSPATH path +-I dir + + These options can be used to set the class path for gcjh. + Gcjh searches the class path the same way the compiler does; + these options have their familiar meanings. + + + + +-d directory + +Puts the generated .h files +beneath directory. + + + + +-o file + + Sets the name of the .h file to be generated. + By default the .h file is named after the class. + This option only really makes sense if just a single class file + is specified. + + + + +--verbose + + gcjh will print information to stderr as it works. + + + + +-M +-MM +-MD +-MMD + + These options can be used to generate dependency information + for the generated header file. They work the same way as the + corresponding compiler options. + + + + +-prepend text + +This causes the text to be put into the generated + header just after class declarations (but before declaration + of the current class). This option should be used with caution. 
+ + + + +-friend text + +This causes the text to be put into the class +declaration after a friend keyword. +This can be used to declare some + other class or function to be a friend of this class. + This option should be used with caution. + + + + +-add text + +The text is inserted into the class declaration. +This option should be used with caution. + + + + +-append text + +The text is inserted into the header file +after the class declaration. One use for this is to generate +inline functions. This option should be used with caution. + + + + +All other options not beginning with a - are treated +as the names of classes for which headers should be generated. + +gcjh will generate all the required namespace declarations and +#include's for the header file. +In some situations, gcjh will generate simple inline member functions. + +There are a few cases where gcjh will fail to work properly: + +gcjh assumes that all the methods and fields of a class have ASCII +names. The C++ compiler cannot correctly handle non-ASCII +identifiers. gcjh does not currently diagnose this problem. + +gcjh also cannot fully handle classes where a field and a method have +the same name. If the field is static, an error will result. +Otherwise, the field will be renamed in the generated header; `__' +will be appended to the field name. + +Eventually we hope to change the C++ compiler so that these +restrictions can be lifted. + + +
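The field renaming rule is easiest to see in a small example. The sketch below is a hand-written guess at the shape such a header might take, not actual gcjh output: it assumes a Java class that declares both an instance field `size` and a method `size()`, uses a stand-in typedef for `jint`, and the names `Buffer` and `rename_demo` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

typedef std::int32_t jint;  // stand-in for the cni.h typedef

// C++ does not allow a data member and a member function of one class
// to share a name, so the field gets `__` appended, as described above.
class Buffer {
public:
  jint size__;                    // Java field `size`, renamed
  jint size() { return size__; }  // Java method `size`, name unchanged
};

// Usage check: C++ code reads the field under its renamed identifier.
inline void rename_demo() {
  Buffer b;
  b.size__ = 16;
  assert(b.size() == 16);
}
```

CNI code that touches such a field has to spell the renamed identifier, while method calls keep the original Java name.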
-- cgit v1.1