.ds lang COBOL
.ds gcobol GCC\ \*[lang]\ Front-end
.ds isostd ISO/IEC 1989:2023
.Dd \& February 2025
.Dt GCOBOL 1\& "GCC \*[lang] Compiler"
.Os Linux
.Sh NAME
.Nm gcobol
.Nd \*[gcobol]
.Sh SYNOPSIS
.Nm
.Op Fl D Ns Ar name Ns Oo Li = Ns Ar value Oc
.Op Fl E
.Op Fl fdefaultbyte Ns Li = Ns Ar value
.Op Fl fsyntax-only
.Op Fl I Ns Ar copybook-path
.Op Fl fmax-errors Ns Li = Ns Ar nerror
.Oo
.Fl nomain |
.Fl main Ar filename |
.Fl main Ns Li = Ns Ar filename |
.Fl main Ns Li = Ns Ar filename:program-id
.Oc
.Op Fl fcobol-exceptions Ar exception Ns Op Ns \/, Ns Ar exception Ns ...
.Op Fl copyext Ar ext
.Op Fl ffixed-form | Fl ffree-form
.Op Fl findicator-column
.Op Fl finternal-ebcdic
.Op Fl dialect Ar dialect-name
.Op Fl include Ar filename
.Op Fl preprocess Ar preprocess-filter
.Op Fl fflex-debug
.Op Fl fyacc-debug
.Ar filename Op ...
.
.Sh DESCRIPTION
.Nm
compiles \*[lang] source code to object code, and optionally produces an
executable binary or shared object. As a GCC component, it accepts all
options that affect code-generation and linking. Options specific to
\*[lang] are listed below.
.Bl -tag -width \0\0debug
.It Fl main Ar filename
.Nm
will generate a
.Fn main
function as an entry point calling the first PROGRAM-ID in
.Ar filename .
.Pp
.Fl main
is the default. When none of
.Fl nomain ,
.Fl c ,
or
.Fl shared ,
is present, an implicit
.Fl main
is inserted into the command line ahead of the first source file name.
.It Fl main Ns Li = Ns Ar filename
The .o object module for
.Ar filename
will include a
.Fn main
entry point calling the first PROGRAM-ID in
.Ar filename .
.It Fl main Ns Li = Ns Ar filename:program-id
The .o object module for
.Ar filename
will include a
.Fn main
entry point that calls the
.Ar program-id
entry point.
.It Fl nomain
No
.Fn main
entry point will be generated by this compilation. The
.Fl nomain
option is incompatible with
.Fl main ,
and is implied by
.Fl shared .
It is also implied by
.Fl c
when there is no
.Fl main
present.
.Pp
See below for examples showing the use of
.Fl main
and
.Fl nomain .
.It Fl D Ar name Ns Op Li = Ns Ar expr
Define a CDF name (for use with
.Sy >>IF )
to have the value of
.Ar expr .
.It Fl E
Write the CDF-processed \*[lang] input to standard output in free-form
reference format. Certain non-\*[lang] markers are included in the
output to indicate where copybook files were included. For
line-number consistency with the input, blank lines are retained.
.Pp
Unlike with the C compiler, this option does not prevent compilation.
To prevent compilation, use the option
.D1 Fl Sy fsyntax-only
also.
.It Fl fdefaultbyte Ns Li = Ns Ar value
Use
.Ar value ,
a number between 0 and 255, as the default value for all
WORKING-STORAGE data items that have no VALUE clause. By default,
alphanumeric data items are initialized with blanks, and numeric data
items are initialized to zero. This option overrides the default with
.Ar value .
.It Fl fsyntax-only
Invoke only the parser. Check the code for syntax errors, but don't
do anything beyond that.
.It Fl copyext Ar ext
For the CDF directive
.D1 COPY Ar name
if
.Ar name
is unquoted, several varieties of
.Ar name
are tried, as described below under
.Sx Copybooks .
The
.Fl copyext
option extends the names searched to include
.Ar ext .
If
.Ar ext
is all uppercase or all lowercase, both forms are tried, with
preference given to the one supplied. If
.Ar ext
is mixed-case, only that version is tried. For example, with
.D1 Fl copyext Ar .abc
given the CDF directive
.D1 COPY name
.Nm
will add to possible names searched
.Ql name.abc
and
.Ql name.ABC
in that order.
.It Fl ffixed-form
Use strict
.Em "Reference Format"
in reading the \*[lang] input: 72-character lines, with a 6-character
sequence area, and an indicator column. Data past column 72 are
ignored.
.It Fl ffree-form
Force the \*[lang] input to be interpreted as
.Em "free format" .
Line breaks are insignificant, except that
.Ql *
at the start of a line acts as a comment marker.
Equivalent to
.Fl findicator-column Ar 0 Ns Li .
.
.It Fl findicator-column
describes the location of the Indicator Area in a \*[lang] file in
.Em "Reference Format" ,
where the first 6 columns
\(em known as the
.Dq "Sequence Number Area"
\(em are ignored, and the 7th column \(em the Indicator Area \(em may
hold a character of significance to the compiler.
.Pp
Although
.Em "reference format" ,
strictly speaking, ignores data after column 72, with this option
.Nm
accepts long \*[lang] lines, sometimes known as
.Em "extended source format" .
Text past column 72 is treated as ordinary \*[lang] text. (Line
continuation remains in effect, however, provided no text appears
.Em past
column 72.)
.Pp
There is no maximum line length. Regardless of source code format,
the entire program could appear on one line.
.Pp
By default,
.Nm
auto-detects the source code format by examining the
.Em "sequence number area"
of the first line of the first file: if those characters are all
digits or blanks, the file is assumed to be in
.Em "reference format" ,
with the indicator area in column 7.
.Pp
.
.It Fl fcobol-exceptions Ar exception Op Ns , Ns Ar exception Ns ...
By default, no exception condition is enabled (including fatal ones),
and by the ISO standard exception conditions are enabled only via the CDF
.Sy "TURN"
directive. This option enables one or more exception conditions by
default, as though
.Sy TURN
had appeared at the top of the first source code file. This option may
also appear more than once on the command line.
.Pp
The value of
.Ar exception
is a Level 1, 2, or 3 exception condition name, as described by
\*[isostd].
.Ql EC-ALL
means enable all exceptions.
.Pp
The
.Fl fno-cobol-exceptions
form turns off
.Ar exception ,
just as though
.D1 >>TURN Ar exception Li CHECKING OFF
had appeared.
.Pp
Not all exception conditions are implemented. Any that are not
produce a warning message.
.
.It Fl fmax-errors Ns Li = Ns Ar nerror
.Ar nerror
is the maximum number of error messages produced.
Without this option,
.Nm
attempts to recover from a syntax error by resuming compilation at the
next statement, continuing until end-of-file. With it,
.Nm
counts the messages as they're produced, and stops when
.Ar nerror
is reached.
.It Fl fstatic-call Ns , Fl fno-static-call
With
.Fl fno-static-call ,
.Nm
never uses static linking for
.D1 Sy CALL Ar program
By default, or with
.Fl fstatic-call ,
if
.Ar program
is an alphanumeric literal,
.Nm
uses static linkage, meaning the compiler produces an external symbol
.Ar program
for the linker to resolve. (In the future, that will work with
.Sy CONSTANT
data items, too.) With static linkage, if
.Ar program
is not supplied by the source code module or another object file or
library at build time, the linker will produce an
.Dq "unresolved symbol"
error. With
.Fl fno-static-call ,
.Nm
always uses dynamic linking.
.Pp
This option affects the
.Sy CALL
statement for literals only. If
.Ar program
is a non-constant data item, it is always resolved using dynamic
linking, with
.Xr dlsym 3 Ns Li ,
because its value is determined at run time.
.It Fl dialect Ar dialect-name
By default,
.Nm
accepts \*[lang] syntax as defined by \*[isostd], with some
extensions for backward compatibility with COBOL-85. To make the
compiler more generally useful, some additional syntax is supported by
this option.
.Pp
The value of
.Ar dialect-name
may be
.Bl -tag -compact
.It ibm
to indicate IBM COBOL 6.3 syntax, specifically
.D1 STOP .
.It gnu
to indicate GnuCOBOL syntax
.It mf
to indicate MicroFocus syntax, specifically
.Sy LEVEL 78
constants.
.El
.Pp
Only a few such non-standard constructs are accepted, and
.Nm
makes no claim to emulate other compilers. But to the extent that a
feature is popular but nonstandard, this option provides a way to
support it, or add it.
.
.It Fl include Ar filename
Process
.Ar filename
as if
.D1 COPY Dq Ar filename
appeared as the first line of the primary source file.
If
.Ar filename
is not an absolute path, the directory searched is the current working
directory, not the directory containing the main source file. The name
is used verbatim. No permutations are applied, and no directories
searched.
.Pp
If multiple
.Fl include
options are given, the files are included in the order they appear on
the command line.
.
.It Fl preprocess Ar preprocess-filter
After all CDF text-manipulation has been applied, and before the
prepared \*[lang] is sent to the
.Sy cobol1
compiler, the input may be further altered by one or more filters. In
the tradition of
.Xr sed 1 ,
each
.Ar preprocess-filter
reads from standard input and writes to standard output.
.Pp
To supply options to
.Ar preprocess-filter ,
use a comma-separated string, similar to how linker options are
supplied to
.Fl Sy Wl .
(Do not put any spaces after the commas, because the shell treats a
space as an argument separator.)
.Nm
replaces each comma with a space when
.Ar preprocess-filter
is invoked. For example,
.D1 Fl preprocess Li tee,output.cbl
invokes
.Xr tee 1
with the output filename argument
.Pa output.cbl ,
causing a copy of the input to be written to the file.
.Pp
.Nm
searches the current working directory and the PATH environment
variable directories for an executable file whose name matches
.Ar preprocess-filter .
The first one found is used. If none is found, an error is reported
and the compiler is not invoked.
.Pp
The
.Fl preprocess
option may appear more than once on the command line. Each
.Ar preprocess-filter
is applied in turn, in order of appearance.
.Pp
The
.Ar preprocess-filter
should return a zero exit status, indicating success. If it returns a
nonzero exit status, an error is reported and the compiler is not
invoked.
.
.It Fl fflex-debug Ns Li , Fl fyacc-debug
produce messages useful for compiler development. The
.Fl fflex-debug
option prints the tokenized input stream. The
.Fl fyacc-debug
option shows the shift and reduce actions taken by the parser.
.El
.
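.Pp
As a rough illustration of the comma handling described above for
.Fl preprocess ,
the following Python sketch turns an option value into an argument
vector. It is not taken from the
.Nm
sources: the function name
.Li filter_argv
is hypothetical, and the sketch assumes a plain comma-to-space
replacement followed by splitting on whitespace.
.Bd -literal -offset indent
```python
def filter_argv(preprocess_filter):
    """Expand a -preprocess value into an argument vector.

    Assumption: each comma becomes a space and the result is split
    on whitespace; the real driver may differ in detail.
    """
    return preprocess_filter.replace(",", " ").split()


if __name__ == "__main__":
    # The example from the text: tee(1) with one filename argument.
    print(filter_argv("tee,output.cbl"))
```
.Ed
.Pp
Under that assumption, a filter argument cannot itself contain a
comma or a space.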
.Sh COMPILATION SCENARIOS
.D1 gcobol Ar xyz.cob
.D1 gcobol -main Ar xyz.cob
.D1 gcobol -main= Ns Ar xyz.cob Ar xyz.cob
These are equivalent. The
.Ar xyz.cob
code is compiled and a
.Fn main
function is inserted that calls the first PROGRAM-ID in the
.Ar xyz.cob
source file.
.Pp
.D1 gcobol -nomain Ar xyz.cob Ar elsewhere.o
The
.Fl nomain
option prevents a
.Fn main
function from being generated by the gcobol compiler. A
.Fn main
entry point must be present in the
.Ar elsewhere.o
module; without it the linker will report a
.Dq "missing main"
error.
.Pp
.D1 gcobol Ar aaa.cob Ar bbb.cob Ar ccc.cob
.D1 gcobol -main Ar aaa.cob Ar bbb.cob Ar ccc.cob
The two commands are equivalent. The three source code modules are
compiled and linked together along with a generated
.Fn main
function that calls the first PROGRAM-ID in the
.Ar aaa.cob
module.
.Pp
.D1 gcobol Ar aaa.cob Ar bbb.cob Fl main Ar ccc.cob
.D1 gcobol -main Ns = Ns Ar ccc.cob Ar aaa.cob Ar bbb.cob Ar ccc.cob
These two commands have the same result: an
.Ar a.out
executable is created that starts executing at the first PROGRAM-ID in
.Ar ccc.cob .
.Pp
.D1 gcobol -main Ns = Ns Ar bbb.cob:b-entry Ar aaa.cob Ar bbb.cob Ar ccc.cob
An
.Ar a.out
executable is created that starts executing at the PROGRAM-ID
.Ar "b-entry" .
.Pp
.D1 gcobol -c Ar aaa.cob
.D1 gcobol -c -main Ar bbb.cob
.D1 gcobol -c Ar ccc.cob
.D1 gcobol Ar aaa.o Ar bbb.o Ar ccc.o
The first three commands each create a .o file. The
.Ar bbb.o
file will contain a
.Fn main
entry point that calls the first PROGRAM-ID in
.Ar bbb.cob .
The fourth links the three .o files into an
.Ar a.out .
.
.Sh EBCDIC
The
.Fl finternal-ebcdic
option is useful when working with mainframe \*[lang] programs intended
for EBCDIC-encoded files. With this option, while the \*[lang] text
remains in ASCII, the character literals and field initial values
produce EBCDIC strings in the compiled binary, and any character data
read from a file are interpreted as EBCDIC data.
The file data are not
.Em converted ;
rather, the file is assumed to use EBCDIC representation. String
literals in the \*[lang] text
.Em are
converted, so that they can be compared meaningfully with data in the
file.
.Pp
Only file data and character literals are affected. Data read from
and written to the environment, or taken from the command line, are
interpreted according to the
.Xr locale 7
in force during execution. The same is true of
.Sy ACCEPT
and
.Sy DISPLAY .
Names known to the operating system, such as file names and the names
of environment variables, are processed verbatim.
.Pp
At the present time, this is an all-or-nothing setting. Support for
.Sy USAGE
and
.Sy CODESET ,
which would allow conversion between encodings, remains a future goal.
.Pp
See also
.Sx "Feature-set Variables" ,
below.
.
.Sh REDEFINES ... USAGE POINTER
Per ISO, an item that
.Sy REDEFINES
another may not be larger than the item it redefines, unless that item
has LEVEL 01 and is not EXTERNAL. In
.Nm ,
using
.Fl dialect Ar ibm ,
this rule is relaxed for
.Sy REDEFINES
with
.Sy USAGE POINTER
whose redefined member is a 4-byte
.Sy USAGE COMP-5
(usually
.Sy PIC S9(8) Ns ),
or vice-versa. In that case, the redefined member is re-sized to be 8
bytes, to accommodate the pointer. This feature allows pointer
arithmetic on a 64-bit system with source code targeted at a 32-bit
system.
.Pp
See also
.Sx "Feature-set Variables" ,
below.
.
.Sh IMPLEMENTATION NOTES
.Nm
is a gcc compiler, and follows gcc conventions where applicable.
Sometimes those conventions (and user expectations) conflict with
common Mainframe practice. Unless required of the compiler by the ISO
specification, any such conflicts are resolved in favor of gcc.
.Ss Linking
Unlike C, the \*[lang]
.Sy CALL
statement implies dynamic linking, because for
.D1 Sy CALL Ar program
.Ar program
can be a variable whose value is determined at runtime.
However, the parameter may also be a compile-time constant, either an
alphanumeric literal, or a
.Sy CONSTANT
data item.
.Pp
.Nm
supports static linking where possible, unless defeated by
.Fl fno-static-call .
If the parameter value is known at compile time, the compiler produces
an external reference to be resolved by the linker. The referenced
program is normally supplied via an object module, a static library,
or a shared object. If it is not supplied, the linker will report an
.Dq "unresolved symbol"
error, either at build time or, if using a shared object, when the
program is executed. This feature informs the programmer of the error
at the earliest opportunity.
.Pp
Programs that are expected to execute correctly in the presence of an
unresolved symbol (perhaps because the program logic won't require
that particular
.Sy CALL )
can use the
.Fl fno-static-call
option. That forces all
.Sy CALL
statements to be resolved dynamically, at runtime.
.ig
Programs that are expected to execute correctly in the presence of an
unresolved symbol (perhaps because the program logic won't require
that particular
.Sy CALL )
can use linker options to produce an executable anyway.
.Pp
One corner case yet remains. The
.Sy CALL
statement includes an
.Sy "ON ERROR"
clause whose purpose is to handle errors arising when the called
program is not found. Control is transferred to the
.Sy "ON ERROR"
clause when the
.Sy EC-PROGRAM-NOT-FOUND
exception condition is raised. That exception condition is not raised in
.Nm
when:
.Bl -bullet -compact
.It
the
.Sy CALL
parameter is known at compile time, i.e., is an alphanumeric literal or
.Sy CONSTANT
data item, and
.It
the executable was generated with the linker option to ignore
unresolved symbols.
.El
In that case, the program is terminated with a signal. No recovery with
.Sy "ON ERROR"
is possible.
.Pp
Should your program meet those particular conditions, all is not lost.
There are workarounds, and an option could be added to use dynamic
linking for all
.Sy CALL
statements, regardless of compile-time constants.
..
.
.Ss Implemented Exception Conditions
Not all Exception Conditions are implemented. Any attempt to enable
an EC that is not implemented produces a warning message. The
following are implemented:
.Pp
.Bl -tag -offset 5n -compact
.It EC-FUNCTION-ARGUMENT
for the following functions:
.Bl -item -compact
.It
ACOS
.It
ANNUITY
.It
ASIN
.It
LOG
.It
LOG10
.It
PRESENT-VALUE
.It
SQRT
.El
.It EC-SORT-MERGE-FILE-OPEN
.It EC-BOUND-SUBSCRIPT
subscript not an integer, less than 1, or greater than occurs
.It EC-BOUND-REF-MOD
refmod start not an integer, start less than 1, start greater than
variable size, length not an integer, length less than 1, and
start+length exceeds variable size
.It EC-BOUND-ODO
DEPENDING not an integer, greater than occurs upper limit, less than
occurs lower limit, and subscript greater than DEPENDING for sending
item
.It EC-SIZE-ZERO-DIVIDE
for both fixed-point and floating-point division
.It EC-SIZE-TRUNCATION
.It EC-SIZE-EXPONENTIATION
.El
.Pp
As of this writing, no \*[lang] compiler documents a complete
implementation of \*[isostd] Exception Conditions.
.Nm
will give priority to those ECs that the user community deems most
valuable.
.
.Sh EXTENSIONS TO ISO \*[lang]
Standard \*[lang] has no provision for environment variables as
defined by Unix and Windows, or command-line arguments.
.Nm
supports them using syntax similar to that of GnuCOBOL. ISO and IBM
also define incompatible ways to return the program's exit status to
the operating system.
.Nm
supports both, as described under
.Sx "Exit Status" .
.
.Ss Environment Variables
To read an environment variable:
.Pp
.D1 ACCEPT Ar target Li FROM ENVIRONMENT Ar envar
.Pp
where
.Ar target
is a data item defined in
.Sy "DATA DIVISION" ,
and
.Ar envar
names an environment variable.
.Ar envar
may be a string literal or alphanumeric data item whose value is the
name of an environment variable.
The value of the named environment variable is moved to
.Ar target .
The rules are the same as for
.Sy MOVE .
.Pp
To write an environment variable:
.Pp
.D1 SET ENVIRONMENT Ar envar Li TO Ar source
.Pp
where
.Ar source
is a data item defined in
.Sy "DATA DIVISION" ,
and
.Ar envar
names an environment variable.
.Ar envar
again may be a string literal or alphanumeric data item whose value is
the name of an environment variable. The value of the named
environment variable is set to the value of
.Ar source .
.
.Ss Command-line Arguments
To read command-line arguments, use the registers
.Sy COMMAND-LINE
and
.Sy COMMAND-LINE-COUNT
in an
.Sy ACCEPT
statement (only). Used without a subscript,
.Sy COMMAND-LINE
returns the whole command line as a single string. With a subscript,
.Sy COMMAND-LINE
is a table of command-line arguments. For example, if the program is
invoked as
.sp
.D1 Sy ./program Fl i Ar input Ar output
.sp
then
.sp
.D1 ACCEPT target FROM COMMAND-LINE(3)
.sp
moves
.Ar input
into
.Ar target .
The program name is the first thing in the whole command line and is
found in COMMAND-LINE(1), the first element of the
.Sy COMMAND-LINE
table.
.Pp
To discover how many arguments were provided on the command line, use
.sp
.D1 ACCEPT Ar target Li FROM COMMAND-LINE-COUNT
.sp
If
.Sy ACCEPT
refers to a nonexistent environment variable or command-line argument,
the target is set to
.Sy LOW-VALUES .
.Pp
The system command line parameters can also be accessed through the
LINKAGE SECTION in the program where execution starts. The data
structure looks like this:
.Bd -literal
linkage         section.
    01   argc pic 999.
    01   argv.
     02  argv-table occurs 1 to 100 times depending on argc.
      03 argv-element pointer.
    01   argv-string pic x(100) .
.Ed
and the code to access the third parameter looks like this:
.Bd -literal
procedure division using by value argc by reference argv.
    set address of argv-string to argv-element(3)
    display argv-string
.Ed
.
.Ss #line directive
The parser accepts lines in the form
.D1 #line Ar lineno Dq Ar filename Ns .
The effect is to set the current line number to
.Ar lineno
and the current input filename to
.Ar filename .
Preprocessors may use this directive to control the filename and line
numbers reported in error messages and in the debugger.
.
.Ss SELECT ... ASSIGN TO
In the phrase
.sp
.D1 ASSIGN TO Ar filename
.sp
.Ar filename
may appear in quotes or not. If quoted, it represents a filename as
known to the operating system. If unquoted, it names either a data
element or an environment variable containing the name of a file.
If
.Ar filename
matches the name of a data element, that element is used. If not,
resolution of
.Ar filename
is deferred until runtime, when the name must appear in the program's
environment.
.
.Sh ISO \*[lang] Implementation Status
.Ss USAGE Data Types
.Nm
supports the following
.Sy USAGE IS
clauses:
.Bl -tag -compact -width POINTER\0
.It Sy INDEX
for use as an index in a table.
.It Sy POINTER
for variables whose value is the address of an external function,
.Sy PROGRAM-ID ,
or data item. Assignment is via the
.Sy SET
statement.
.It Sy BINARY , Sy COMP , Sy COMPUTATIONAL , Sy COMP-4 , Sy COMPUTATIONAL-4
big-endian integer, 1 to 16 bytes, per
.Sy PICTURE .
.It Sy COMP-1 , Sy COMPUTATIONAL-1 , Sy FLOAT-BINARY-32
IEEE 754 single-precision (4-byte) floating point, as provided by the
hardware.
.It Sy COMP-2 , Sy COMPUTATIONAL-2 , Sy FLOAT-BINARY-64
IEEE 754 double-precision (8-byte) floating point, as provided by the
hardware.
.It Sy COMP-3 , Sy COMPUTATIONAL-3 , Sy PACKED-DECIMAL
currently unimplemented.
.It Sy COMP-5 , Sy COMPUTATIONAL-5
little-endian integer, 1 to 16 bytes, per
.Sy PICTURE .
.It Sy FLOAT-BINARY-128 , Sy FLOAT-EXTENDED
implements 128-bit floating point, per IEEE 754.
.El
.Pp
.Nm
supports ISO integer
.Sy BINARY-
types, most of which alias
.Sy COMP-5 .
.
.hw unsigned
.sp
.TS
tab(|);
LB  LB  LB  LB
LB  LB  LB  LB
L   L   L   L .
COMP-5
Compatible Picture|BINARY Type|Bytes|Value
|T{
BINARY-CHAR [UNSIGNED]
T}|1|0 \(em 255
S9(1...4)|T{
BINARY-CHAR SIGNED
T}|1|-128 \(em +127
\09(1...4)|T{
BINARY-SHORT [UNSIGNED]
T}|2|0 \(em 65535
S9(1...4)|T{
BINARY-SHORT SIGNED
T}|2|-32768 \(em +32767
\09(5...9)|T{
BINARY-LONG [UNSIGNED]
T}|4|0 \(em 4,294,967,295
S9(5...9)|T{
BINARY-LONG SIGNED
T}|4|T{
-2,147,483,648 \(em +2,147,483,647
T}
\09(10...18)|T{
BINARY-LONG-LONG [UNSIGNED]
T}|8|T{
0 \(em 18,446,744,073,709,551,615
T}
S9(10...18)|T{
BINARY-LONG-LONG SIGNED
T}|8|T{
-9,223,372,036,854,775,808 \(em +9,223,372,036,854,775,807
T}
.TE
.Pp
These define a size (in bytes) and cannot be used with a
.Sy PICTURE
clause. Per the ISO standard,
.Sy SIGNED
is the default for the
.Sy "BINARY-" Ns Ar type
aliases.
.Pp
All computation \(em both integer and floating point \(em is done
using 128-bit intermediate forms.
.
.Ss Environment Names
In
.Nm
.sp
.Dl DISPLAY UPON
.sp
maps
.Sy SYSOUT
and
.Sy STDOUT
to standard output, and
.Sy SYSPUNCH ,
.Sy SYSPCH
and
.Sy STDERR
to standard error.
.
.Ss Exit Status
.Nm
supports the ISO syntax for returning an exit status to the operating
system,
.Pp
.D1 STOP RUN Oo WITH Oc Bro NORMAL | ERROR Brc Oo STATUS Oc Ar status
.Pp
In addition,
.Nm
supports the IBM syntax for returning an exit status to the
operating system. Use the
.Sy RETURN-CODE
register:
.Bd -literal -offset indent
        MOVE ZERO TO RETURN-CODE.
        GOBACK.
.Ed
.Pp
The
.Sy RETURN-CODE
register is defined as a 4-byte binary integer.
.ig
.Pp
The ISO standard supports an extended form of
.Sy GOBACK :
.Pp
.D1 GOBACK {ERROR | NORMAL} WITH Ar status
.Pp
where
.Ar status
is a numeric data item or literal. This syntax has the same effect as:
.Bd -literal -offset indent
        MOVE status TO RETURN-CODE.
        GOBACK.
.Ed
The use of
.Sy ERROR
or
.Sy NORMAL
has no effect; the two are interchangeable.
..
.
.Ss Compiler-Directing Facility (CDF)
The CDF should be used with caution because no comprehensive test
suite has been identified.
.
.Ss Conditional Compilation
.Bl -tag -width >>DEFINE
.It >> Ns Sy DEFINE Ar name Sy AS Bro Ar expression Li | Sy PARAMETER Brc Op Sy OVERRIDE
Define
.Ar name
as a compilation variable to have the value
.Ar expression .
If
.Ar name
was previously defined,
.Sy OVERRIDE
is required, else the directive is invalid.
.Sy AS PARAMETER
is accepted, but has no effect in
.Nm .
.
.It >> Ns Sy DEFINE Ar name Sy AS OFF
releases the definition
.Ar name ,
making it subsequently invalid for use.
.\" ISO requires AS; cdf.y does not.
.
.It >> Ns Sy IF Ar cce Ar text Oo >> Ns Sy ELSE Ar alt-text Oc Li >> Ns Sy END-IF
evaluates
.Ar cce ,
a
.Em "constant conditional expression\/" ,
for conditional compilation. If a name,
.Ar cce
may be defined with the
.Fl D
command-line parameter. If true, the \*[lang] text
.Ar text
is compiled. If false,
.Ar alt-text ,
if present, is compiled.
.Bo Sy IS Bo Sy NOT Bc Bc Sy DEFINED
is supported. Boolean literals are not supported.
.
.It >> Ns Sy EVALUATE
Not implemented.
.El
.
.Ss Other CDF Directives
.Bl -tag -width >>PROPAGATE
.It >> Ns Sy CALL-CONVENTION Ar convention
.Ar convention
may be one of:
.Bl -tag -compact
.It Sy \*[lang]
Use standard \*[lang] case-insensitive symbol-name matching. For
.Sy CALL Dq Ar name ,
.Ar name
is rendered by the compiler in lowercase.
.It Sy C
Use case-sensitive symbol-name matching. The
.Sy CALL
target is not changed in any way; it is used verbatim.
.It Sy VERBATIM
An alias for >>\c
.Sy "CALL-CONVENTION C" .
.El
.It >> Ns Sy COBOL-WORDS EQUATE Ar keyword Sy WITH Ar alias
makes
.Ar alias
a synonym for
.Ar keyword .
.It >> Ns Sy COBOL-WORDS UNDEFINE Ar keyword
.Ar keyword
is removed from the \*[lang] grammar. Use of it in a program will
provoke a syntax error from the compiler.
.It >> Ns Sy COBOL-WORDS SUBSTITUTE Ar keyword Sy BY Ar new-word
.Ar keyword
is deleted as a keyword from the grammar, replaced by
.Ar new-word .
.Ar keyword
may thereafter be used as a user-defined word.
.It >> Ns Sy COBOL-WORDS RESERVE Ar new-word
Treat
.Ar new-word
as a \*[lang] keyword. It cannot be used by the program, either as a
keyword or as a user-defined word.
.
.It >> Ns Sy DISPLAY Ar string ...
Write
.Ar string
to standard error as a warning message.
.It >> Ns Sy SOURCE Ar format
.Ar format
may be one of:
.Bl -tag -compact
.It Sy FIXED
Source conforms to \*[lang] Reference Format with unlimited line length.
.It Sy FREE
Line endings and indentation are ignored by the compiler, except that a
.Ql "*"
at the beginning of a line is recognized as a comment.
.El
.El
.Pp
.Bl -tag -width >>PROPAGATE -compact
.It >> Ns Sy FLAG-02
Not implemented.
.It >> Ns Sy FLAG-85
Not implemented.
.It >> Ns Sy FLAG-NATIVE-ARITHMETIC
Not implemented.
.It >> Ns Sy LEAP-SECOND
Not implemented.
.It >> Ns Sy LISTING
Not implemented.
.It >> Ns Sy PAGE
Not implemented.
.It >> Ns Sy PROPAGATE
Not implemented.
.It >> Ns Sy TURN Oo
.Ar ec Oo Ar file Li ... Oc ...
.Oc Sy CHECKING Bro Oo Sy ON Oc Oo Oo Sy WITH Oc Sy LOCATION Oc | Sy OFF Brc
Enable (or, with
.Sy OFF ,
disable) exception condition
.Ar ec
optionally associated with the file connectors
.Ar file .
If
.Sy LOCATION
is specified,
.Nm
reports at runtime the source filename and line number of the
statement that triggered the exception condition.
.El
.
.Ss Feature-set Variables
Some command-line options affect CDF
.Em "feature-set"
variables that are special to
.Nm .
They can be set and tested using
.Sy >>DEFINE
and
.Sy >>IF ,
and are distinguished by a leading
.Ql \&%
in the name, which is otherwise invalid in a \*[lang] identifier:
.Pp
.Bl -tag -compact
.It Sy %EBCDIC-MODE
is set by
.Fl finternal-ebcdic .
.It Sy %64-BIT-POINTER
is implied by
.Fl "dialect ibm" .
.El
.Pp
To set a feature-set variable, use
.Dl >>SET Ar feature Li [AS] {ON | OFF}
If
.Ar feature
is
.Sy %EBCDIC-MODE ,
the directive must appear before
.Sy PROGRAM-ID .
.Pp
To test a feature-set variable, use
.Dl >>IF Ar feature Li DEFINED
..
.Ss Copybooks
.Nm
supports the CDF
.Sy COPY
statement, with or without its
.Sy REPLACING
component. For any statement
.sp
.D1 COPY Ar copybook
.sp
.Nm
looks first for an environment variable named
.Va copybook
and, if found, uses the contents of that variable as the name of the
copybook file. If that file does not exist, it continues looking for
a file named one of:
.sp
.Bl -bullet -compact -offset 5n
.It
.Pa copybook
(literally)
.It
.Pa copybook.cpy
.It
.Pa copybook.CPY
.It
.Pa copybook.cbl
.It
.Pa copybook.CBL
.It
.Pa copybook.cob
.It
.Pa copybook.COB
.El
.sp
in that order. It looks first in the same directory as the source
code file, and then in any
.Ar copybook-path
named with the
.Fl I
option.
.
.\" FIXME: need escape mechanism for directories with ':' in the name.
.Ar copybook-path
may (like the shell's
.Ev PATH
variable) be a colon-separated list.
.
The
.Fl I
option may occur multiple times on the command line. Each successive
.Ar copybook-path
is concatenated to previous ones. Relative paths (having no leading
.Ql / Ns \&)
are searched relative to the compiler's current working directory.
.Pp
For example,
.D1 \&
.D1 Fl I Li /usr/local/include:include
.D1 \&
searches first the directory where the \*[lang] program is found, next in
.Pa /usr/local/include ,
and finally in an
.Pa include
subdirectory of the directory from which
.Nm
was invoked.
.
.Ss Intrinsic functions
.Nm
implements all intrinsic functions defined by \*[isostd], plus a few
others. They are listed alphabetically below.
.Bl -item -compact
.It
ABS ACOS ANNUITY ASIN ATAN
.It
BASECONVERT BIT_OF BIT_TO_CHAR BOOLEAN_OF_INTEGER BYTE_LENGTH
.It
CHAR CHAR_NATIONAL COMBINED_DATETIME CONCAT CONVERT COS CURRENT_DATE
.It
DATE_OF_INTEGER DATE_TO_YYYYMMDD DAY_OF_INTEGER DAY_TO_YYYYDDD DISPLAY_OF
.It
E EXCEPTION_FILE EXCEPTION_FILE_N EXCEPTION_LOCATION
EXCEPTION_LOCATION_N EXCEPTION_STATEMENT EXCEPTION_STATUS EXP EXP10
.It
FACTORIAL FIND_STRING FORMATTED_CURRENT_DATE FORMATTED_DATE
FORMATTED_DATETIME FORMATTED_TIME FRACTION_PART
.It
HEX_OF HEX_TO_CHAR HIGHEST_ALGEBRAIC
.It
INTEGER INTEGER_OF_BOOLEAN INTEGER_OF_DATE INTEGER_OF_DAY
INTEGER_OF_FORMATTED_DATE INTEGER_PART
.It
LENGTH LOCALE_COMPARE LOCALE_DATE LOCALE_TIME LOCALE_TIME_FROM_SECONDS
LOG LOG10 LOWER_CASE LOWEST_ALGEBRAIC
.It
MAX MEAN MEDIAN MIDRANGE MIN MOD MODULE_NAME
.It
NATIONAL_OF NUMVAL NUMVAL_C NUMVAL_F
.It
ORD ORD_MAX ORD_MIN
.It
PI PRESENT_VALUE
.It
RANDOM RANGE REM REVERSE
.It
SECONDS_FROM_FORMATTED_TIME SECONDS_PAST_MIDNIGHT SIGN SIN
SMALLEST_ALGEBRAIC SQRT STANDARD_COMPARE STANDARD_DEVIATION SUBSTITUTE SUM
.It
TAN TEST_DATE_YYYYMMDD TEST_DAY_YYYYDDD TEST_FORMATTED_DATETIME
TEST_NUMVAL TEST_NUMVAL_C TEST_NUMVAL_F TRIM
.It
ULENGTH UPOS UPPER_CASE USUBSTR USUPPLEMENTARY UUID4 UVALID UWIDTH
.It
VARIANCE
.It
WHEN_COMPILED
.It
YEAR_TO_YYYY
.El
.
.Ss Binary floating point DISPLAY
How
.Sy DISPLAY
presents binary floating point numbers depends on the value.
.Pp
When a value has six or fewer decimal digits to the left of the
decimal point, it is expressed as
.Em 123456.789... .
.Pp
When a value is less than 1 and has no more than three zeroes to the
right of the decimal point, it is expressed as
.Em 0.0001234... .
.Pp
Otherwise, exponential notation is used:
.Em 1.23456E+7 .
.Pp
In all cases, trailing zeroes on the right of the number are removed
from the displayed value.
.Pp
.Bl -tag -compact -width FLOAT-EXTENDED
.It COMP-1
displayed with 9 decimal digits.
.It COMP-2
displayed with 17 decimal digits.
.It FLOAT-EXTENDED
displayed with 36 decimal digits.
.El
.Pp
Those digit counts are consistent with the IEEE 754 requirements for
information interchange. As one example, consider the description of
binary64 (COMP-2) values, per Wikipedia:
.Pp
If an IEEE 754 double-precision number is converted to a decimal
string with at least 17 significant digits, and then converted back to
double-precision representation, the final result must match the
original number.
.Pp
17 digits was chosen so that the
.Sy DISPLAY
statement shows the contents of a COMP-2 variable without hiding any
information.
.
.Ss Binary floating point MOVE
During a
.Sy MOVE
statement, a floating-point value may be truncated. It will not be
unusual for Numeric Display values to be altered when moved through a
floating-point value.
.Pp
This program:
.Bd -literal
    01 PICV999 PIC 9999V999.
    01 COMP2 COMP-2.
    PROCEDURE DIVISION.
        MOVE 1.001 to PICV999
        MOVE PICV999 TO COMP2
        DISPLAY "The result of MOVE " PICV999 " TO COMP2 is " COMP2
        MOVE COMP2 to PICV999
        DISPLAY "The result of MOVE COMP2 TO PICV999 is " PICV999
.Ed
.Pp
generates this result:
.Bd -literal
    The result of MOVE 0001.001 TO COMP2 is 1.00099999999999989
    The result of MOVE COMP2 TO PICV999 is 0001.000
.Ed
.Pp
However, the internal implementation can produce results that might
seem surprising:
.Bd -literal
    The result of MOVE 0055.110 TO COMP2 is 55.1099999999999994
    The result of MOVE COMP2 TO PICV999 is 0055.110
.Ed
.Pp
The source of this inconsistency is the way
.Nm
stores and converts numbers. Converting the floating-point value to
the numeric display value 0055110 is done by multiplying 55.109999...\&
by 1,000 and then truncating the result to an integer. And it turns
out that even though 55.11 can’t be represented in floating-point as
an exact value, the product of the multiplication, 55110, is an exact
value.
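.Pp
The two outcomes above can be reproduced outside of \*[lang]. The
following Python sketch is only a model, not the
.Nm
implementation: it assumes IEEE 754 binary64 arithmetic and treats the
conversion as a multiplication by 1,000 followed by truncation, as
described above. The function name
.Li to_pic_v999
is hypothetical.
.Bd -literal -offset indent
```python
def to_pic_v999(value, rounded=False):
    """Scale a binary64 value to thousandths, as for PIC 9999V999.

    Assumption: a plain double-precision multiply.  Truncation
    models MOVE; rounded=True models rounding to the nearest
    thousandth (ties do not arise for the values used here).
    """
    scaled = value * 1000
    return round(scaled) if rounded else int(scaled)


if __name__ == "__main__":
    for v in (1.001, 55.11):
        # Prints the value, the truncated result, the rounded result.
        print(v, to_pic_v999(v), to_pic_v999(v, rounded=True))
```
.Ed
.Pp
In this model 1.001 truncates to 1000 thousandths because the product
falls just below 1001, while the product for 55.11 rounds to exactly
55110 during the multiplication itself, so truncation loses nothing.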
.Pp In cases where it is important for conversions to have predictable results, rounding must be applied, which can be done with an arithmetic statement: .Bd -literal MOVE 1.001 to PICV999 MOVE PICV999 TO COMP2 DISPLAY "The result of MOVE " PICV999 " TO COMP2 is " COMP2 MOVE COMP2 to PICV999 DISPLAY "The result of MOVE COMP2 TO PICV999 is " PICV999 ADD COMP2 to ZERO GIVING PICV999 ROUNDED DISPLAY "The result of ADD COMP2 to ZERO GIVING PICV999 ROUNDED is " PICV999 .sp The result of MOVE 0001.001 TO COMP2 is 1.00099999999999989 The result of MOVE COMP2 TO PICV999 is 0001.000 The result of ADD COMP2 to ZERO GIVING PICV999 ROUNDED is 0001.001 .Ed .Ss Binary floating point computation .Nm attempts to do internal computations using binary integers when possible. Thus, simple arithmetic between binary values and numeric display values produces binary intermediate results. .Pp If a floating-point value gets included in the mix of variables specified for a calculation, then the intermediate result becomes a 128-bit floating-point value. . .Ss A warning about binary floating point comparison The cardinal rule when doing comparisons involving floating-point values: Never, ever, test for equality. It’s just not worth the hassle. .Pp For example: .Bd -literal WORKING-STORAGE SECTION. 01 COMP1 COMP-1 VALUE 555.11. 01 COMP2 COMP-2 VALUE 555.11. PROCEDURE DIVISION. DISPLAY "COMPARE " COMP1 " with " COMP2 IF COMP1 EQUAL COMP2 DISPLAY "Equal" ELSE DISPLAY "Not equal" END-IF .sp MOVE COMP1 to COMP2 DISPLAY "COMPARE " COMP1 " with " COMP2 IF COMP1 EQUAL COMP2 DISPLAY "Equal" ELSE DISPLAY "Not equal" END-IF .Ed .Pp produces these results: .Bd -literal COMPARE 555.1099854 with 555.110000000000014 Not equal COMPARE 555.1099854 with 555.1099853515625 Equal .Ed .Pp Why? Again, it has to do with the internals of .Nm . When differently sized floating-point values need to be compared, they are first converted to 128-bit floats.
And it turns out that when a COMP1 is moved to a COMP2, and they are both converted to FLOAT-EXTENDED, the two resulting values are (probably) equal. .Pp Avoid testing for equality unless you really know what you are doing and you really test the code. And then avoid it anyway. .Pp Finally, it is observably the case that the .Nm implementations of floating-point conversions and comparisons don’t precisely match the behavior of other \*[lang] compilers. .Pp You have been warned. . .Sh ENVIRONMENT .Bl -tag -width COBPATH .It Ev COBPATH If defined, specifies the directory paths to be used by the .Nm runtime library, .Pa libgcobol.so , to locate shared objects. Like .Ev LD_LIBRARY_PATH , it may contain several directory names separated by colons .Pq Ql \&: . .Ev COBPATH is searched first, followed by .Ev LD_LIBRARY_PATH . .Pp Each directory is searched for files whose names end in .Ql ".so" . For each such file, .Xr dlopen 3 is attempted, and, if successful, .Xr dlsym 3 . No relationship is defined between the symbol's name and the filename. .Pp Without .Ev COBPATH , binaries produced by .Nm behave as one might expect of any program compiled with gcc. Any shared objects needed by the program are mentioned on the command line with a .Fl l Ns Ar library option, and are found by following the executable's .Pa RPATH or otherwise per the configuration of the runtime linker, .Xr ld.so 8 . . .It Ev UPSI \*[lang] defines a User Programmable Status Indicator (UPSI) switch. In .Nm , the settings are denoted .Sy UPSI-0 through .Sy UPSI-7 , where 0-7 indicates a bit position. The value of the UPSI switches is taken from the .Ev UPSI environment variable, whose value is a string of up to eight 1's and 0's. The first character represents the value of .Sy UPSI-0 , and missing values are assigned 0. For example, .Sy UPSI=1000011 in the environment sets bits 0, 5, and 6 on, which means that .Sy UPSI-0 , .Sy UPSI-5 , and .Sy UPSI-6 are on.
.It Ev GCOBOL_TEMPDIR causes any temporary files created during CDF processing to be written to a file whose name is specified in the value of .Ev GCOBOL_TEMPDIR . If the value is just .Dq / , the effect is different: each copybook read is reported on standard error. This feature is meant to help diagnose mysterious copybook errors. .El . .Sh FILES Executables produced by .Nm require the runtime support library .Pa libgcobol , which is provided both as a static library and as a shared object. . .\" .Sh DIAGNOSTICS . .Sh COMPATIBILITY The ISO standard leaves the default file organization up to the implementation; in .Nm , the default is .Sy "SEQUENTIAL" . . .Ss On-Disk Format Any ability to use files produced by other \*[lang] compilers, or for those compilers to use files produced by .Nm , is the product of luck and intuition. Various compilers interpret the ISO standard differently, and the standard's text is not always definitive. .Pp For .Sy "ORGANIZATION IS LINE SEQUENTIAL" files (explicitly or by default), .Nm , absent specific direction, produces an ordinary Linux text file: for each WRITE, the data are written, followed by an ASCII NL (hex 0A) character. On READ, the record is read up to the size of the specified record or NL, whichever comes first. The NL is not included in the data brought into the record buffer; it serves only as an on-disk record-termination marker. Consequently, .Sy SEQUENTIAL and .Sy "LINE SEQUENTIAL" files work the same way: the \*[lang] program never sees the record terminator. .Pp When .Sy READ and .Sy WRITE are used with .Sy ADVANCING , however, the game changes. If .Sy ADVANCING is used with .Sy "LINE SEQUENTIAL" files, it is honored by .Nm . .Pp Other compilers may not do likewise. According to ISO, in .Sy WRITE (14.9.47.3 General rules) .Sy ADVANCING is .Em ignored for files for which .Dq "the physical file does not support vertical positioning" . 
It further states that, in the absence of .Sy ADVANCING , .Sy WRITE proceeds .Dq "as if the user has specified AFTER ADVANCING 1 LINE" . Some other implementations interpret that to mean that the first .Sy WRITE to a .Sy "LINE SEQUENTIAL" file results in a leading NL on the first line, and no trailing NL on the last line. Some furthermore .Em prohibit the use of .Sy ADVANCING with .Sy "LINE SEQUENTIAL" files. . .\" .Sh SEE ALSO . .Sh STANDARDS The reference standard for .Nm is \*[isostd]. .Bl -bullet -compact .It If .Nm compiles code consistent with that standard, the resulting program should execute correctly; any other result is a bug. .It If .Nm compiles code that does not comply with that standard, but runs correctly according to some other specification, that represents a non-standard extension. One day, the .Fl pedantic option will produce diagnostic messages for such code. .It If .Nm rejects code consistent with that standard, that represents an aspect of \*[lang] that is (or is not) on the To Do list. If you would like to see it compile, please get in touch with the developers. .El . .Ss Status of NIST \*[lang] Compiler Verification Suite .Bl -tag -compact -width "\0\0100% NC" .It NC 100% Nucleus .It SQ 100% Sequential I/O .It RL 100% Relative I/O .It IX 100% Indexed I/O .It IC 100% Inter-Program Communication .It ST 100% Sort-Merge .It SM 100% Source Text Manipulation .It RW \en Report Writer .It CM Communication .It DB to do? Debug .It SG Segmentation .It IF 100% Intrinsic Function .El .Pp Where .Nm passes 100% of the tests in a module, we exclude the (few) tests for obsolete features. The authors regard features that were obsolete in 1985 as well and truly obsolete today, and did not implement them. . .Ss Notable deferred features CCVS-85 modules not marked above with any status (CM and SG) are on the .Dq "hard maybe" list, meaning they await an interested party with real code using the feature.
.Pp .Nm does not implement Report Writer or Screen Section. . .Ss Beyond COBOL/85 .Nm increasingly implements \*[isostd]. For example, .Sy DECLARATIVES is not tested by CCVS-85, but is implemented by .Nm Ns . Similarly, Exception Conditions were not defined in 1985, and .Nm contains a growing number of them. .Pp The authors are well aware that a complete, pure \*[lang]-85 compiler won't compile most existing \*[lang] code. Every vendor offered (and offers) extensions, and most environments rely on a variety of preprocessors and ancillary systems defined outside the standard. The express goal of adding an ISO \*[lang] front-end to GCC is to establish a foundation on which any needed extensions can be built. . .Sh HISTORY \*[lang], the language, may well be older than the reader. To the authors' knowledge, free \*[lang] compilers began to appear in 2000. Around that time an earlier \*[lang] for GCC project .br .Lk https://cobolforgcc.sourceforge.net/ cobolforgcc met with some success, but was never officially merged into GCC. .Pp This compiler, .Nm , was begun by .Lk https://www.cobolworx.com/ COBOLworx in the fall of 2021. The project announced a complete implementation of the core language features in December 2022. . .Sh AUTHORS .Bl -tag -compact .It "James K. Lowden" (jklowden@cobolworx.com) is responsible for the parser. .It "Robert Dubner" (rdubner@cobolworx.com) is responsible for producing the GIMPLE tree, which is input to the GCC back-end. .El . .Sh CAVEATS .Bl -bullet -compact .It .Nm has been tested only on x64 and Apple M1 processors running Linux in 64-bit mode. .It The I/O support has not been extensively tested, and does not implement or emulate many features related to VSAM and other mainframe subsystems. While LINE-SEQUENTIAL files are ordinary text files that can be manipulated with standard utilities, INDEXED and RELATIVE files produced by .Nm are not compatible with those of any other \*[lang] compiler.
Enhancements to the I/O support will be readily available to the paying customer. .El . .\" .Sh BUGS