Chapter 11 Native-code compilation (ocamlopt)

This chapter describes the OCaml high-performance native-code compiler ocamlopt, which compiles Caml source files to native code object files and links these object files to produce standalone executables.

The native-code compiler is only available on certain platforms. It produces code that runs faster than the bytecode produced by ocamlc, at the cost of increased compilation time and executable code size. Compatibility with the bytecode compiler is extremely high: the same source code should run identically when compiled with ocamlc and ocamlopt.

It is not possible to mix native-code object files produced by ocamlopt with bytecode object files produced by ocamlc: a program must be compiled entirely with ocamlopt or entirely with ocamlc. Native-code object files produced by ocamlopt cannot be loaded in the toplevel system ocaml.

11.1 Overview of the compiler

The ocamlopt command has a command-line interface very close to that of ocamlc. It accepts the same types of arguments, and processes them sequentially:

Arguments ending in .mli are taken to be source files for compilation unit interfaces. Interfaces specify the names exported by compilation units: they declare value names with their types, define public data types, declare abstract data types, and so on. From the file x.mli, the ocamlopt compiler produces a compiled interface in the file x.cmi. The interface produced is identical to that produced by the bytecode compiler ocamlc.
Arguments ending in .ml are taken to be source files for compilation unit implementations. Implementations provide definitions for the names exported by the unit, and also contain expressions to be evaluated for their side-effects. From the file x.ml, the ocamlopt compiler produces two files: x.o, containing native object code, and x.cmx, containing extra information for linking and optimization of the clients of the unit. The compiled implementation should always be referred to under the name x.cmx (when given a .o or .obj file, ocamlopt assumes that it contains code compiled from C, not from Caml).
The implementation is checked against the interface file x.mli (if it exists) as described in the manual for ocamlc (chapter 8).
Arguments ending in .cmx are taken to be compiled object code. These files are linked together, along with the object files obtained by compiling .ml arguments (if any), and the Caml standard library, to produce a native-code executable program. The order in which .cmx and .ml arguments are presented on the command line is relevant: compilation units are initialized in that order at run-time, and it is a link-time error to use a component of a unit before having initialized it. Hence, a given x.cmx file must come before all .cmx files that refer to the unit x.
Arguments ending in .cmxa are taken to be libraries of object code. Such a library packs in two files (lib.cmxa and lib.a/.lib) a set of object files (.cmx and .o/.obj files). Libraries are built with ocamlopt -a (see the description of the -a option below). The object files contained in the library are linked as regular .cmx files (see above), in the order specified when the library was built. The only difference is that if an object file contained in a library is not referenced anywhere in the program, then it is not linked in.
Arguments ending in .c are passed to the C compiler, which generates a .o/.obj object file. This object file is linked with the program.
Arguments ending in .o, .a or .so (.obj, .lib and .dll under Windows) are assumed to be C object files and libraries. They are linked with the program.

The output of the linking phase is a regular Unix or Windows executable file. It does not need ocamlrun to run.
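
As a minimal sketch of the workflow described above, the following commands compile an interface and two implementations separately, then link them in dependency order. The file names util.mli, util.ml and main.ml and the executable name prog are hypothetical, chosen only for illustration.

```shell
# Work in a scratch directory; all file names are hypothetical.
cd "$(mktemp -d)"

cat > util.mli <<'EOF'
val greet : string -> string
EOF
cat > util.ml <<'EOF'
let greet name = "Hello, " ^ name
EOF
cat > main.ml <<'EOF'
let () = print_endline (Util.greet "world")
EOF

ocamlopt -c util.mli                 # produces util.cmi
ocamlopt -c util.ml                  # produces util.cmx and util.o
ocamlopt -c main.ml                  # produces main.cmx and main.o
ocamlopt -o prog util.cmx main.cmx   # util.cmx comes before main.cmx
./prog                               # prints "Hello, world"
```

Swapping the last two arguments of the link command (main.cmx before util.cmx) should make the link step fail, since Main refers to Util.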

11.2 Options

The following command-line options are recognized by ocamlopt. The options -pack, -a, -shared, -c and -output-obj are mutually exclusive.

-a

Build a library (.cmxa and .a/.lib files) with the object files (.cmx and .o/.obj files) given on the command line, instead of linking them into an executable file. The name of the library must be set with the -o option.

If -cclib or -ccopt options are passed on the command line, these options are stored in the resulting .cmxa library. Then, linking with this library automatically adds back the -cclib and -ccopt options as if they had been provided on the command line, unless the -noautolink option is given.
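
A short sketch of building and using a library with -a, assuming a Unix system (so the companion archive has the .a extension). The names mathx.ml, mylib.cmxa, main.ml and prog are hypothetical.

```shell
# Work in a scratch directory; all file names are hypothetical.
cd "$(mktemp -d)"

cat > mathx.ml <<'EOF'
let double x = 2 * x
EOF
cat > main.ml <<'EOF'
let () = print_int (Mathx.double 21); print_newline ()
EOF

ocamlopt -c mathx.ml
ocamlopt -a -o mylib.cmxa mathx.cmx   # writes mylib.cmxa and mylib.a
ocamlopt -o prog mylib.cmxa main.ml   # link the program against the library
./prog                                # prints 42
```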

-annot

Dump detailed information about the compilation (types, bindings, tail-calls, etc.). The information for file src.ml is put into file src.annot. In case of a type error, dump all the information inferred by the type-checker before the error. The src.annot file can be used with the Emacs commands given in emacs/caml-types.el to display types and other annotations interactively.

-c

Compile only. Suppress the linking phase of the compilation. Source code files are turned into compiled files, but no executable file is produced. This option is useful to compile modules separately.

-cc ccomp

Use ccomp as the C linker called to build the final executable and as the C compiler for compiling .c source files.

-cclib -llibname

Pass the -llibname option to the linker. This causes the given C library to be linked with the program.

-ccopt option

Pass the given option to the C compiler and linker. For instance, -ccopt -Ldir causes the C linker to search for C libraries in directory dir.

-compact

Optimize the produced code for space rather than for time. This results in slightly smaller but slightly slower programs. The default is to optimize for speed.

-config

Print the version number of ocamlopt and a detailed summary of its configuration, then exit.
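
For illustration, the configuration summary can be filtered with standard Unix tools. The field names used below (version, architecture, system) are the ones printed by the releases I am aware of; the exact set of fields may differ between versions.

```shell
# Show only a few lines of the configuration summary.
ocamlopt -config | grep -E '^(version|architecture|system):'
```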

-for-pack module-path

Generate an object file (.cmx and .o/.obj files) that can later be included as a sub-module (with the given access path) of a compilation unit constructed with -pack. For instance, ocamlopt -for-pack P -c A.ml will generate A.cmx and A.o files that can later be used with ocamlopt -pack -o P.cmx A.cmx.

-g

Add debugging information while compiling and linking. This option is required in order to produce stack backtraces when the program terminates on an uncaught exception (see section 10.2).

-i

Cause the compiler to print all defined names (with their inferred types or their definitions) when compiling an implementation (.ml file). No compiled files (.cmx, .o/.obj and .cmi files) are produced. This can be useful to check the types inferred by the compiler. Also, since the output follows the syntax of interfaces, it can help in writing an explicit interface (.mli file) for a file: just redirect the standard output of the compiler to a .mli file, and edit that file to remove all declarations of unexported names.
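
The redirection described above can be sketched as follows. The file name shapes.ml and its contents are hypothetical.

```shell
# Work in a scratch directory; shapes.ml is a hypothetical example file.
cd "$(mktemp -d)"

cat > shapes.ml <<'EOF'
let pi = 3.14159
let area r = pi *. r *. r
EOF

ocamlopt -i shapes.ml > shapes.mli   # write the inferred interface
cat shapes.mli
# val pi : float
# val area : float -> float
```

The generated shapes.mli can then be edited by hand, for example to remove pi if it should not be exported.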

-I directory

Add the given directory to the list of directories searched for compiled interface files (.cmi), compiled object code files (.cmx), and libraries (.cmxa). By default, the current directory is searched first, then the standard library directory. Directories added with -I are searched after the current directory, in the order in which they were given on the command line, but before the standard library directory.

If the given directory starts with +, it is taken relative to the standard library directory. For instance, -I +labltk adds the subdirectory labltk of the standard library to the search path.
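
A small sketch of -I with a project-local directory. The directory name lib and the files lib/util.ml and main.ml are hypothetical.

```shell
# Work in a scratch directory; directory and file names are hypothetical.
cd "$(mktemp -d)"
mkdir lib

cat > lib/util.ml <<'EOF'
let answer = 42
EOF
cat > main.ml <<'EOF'
let () = print_int Util.answer; print_newline ()
EOF

ocamlopt -c lib/util.ml                        # compiled files are written into lib/
ocamlopt -I lib -o prog lib/util.cmx main.ml   # -I lib lets main.ml find util.cmi and util.cmx
./prog                                         # prints 42
```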

-inline n

Set aggressiveness of inlining to n, where n is a non-negative integer. Specifying -inline 0 prevents all functions from being inlined, except those whose body is smaller than the call site. With this setting, inlining causes no expansion in code size. The default aggressiveness, -inline 1, allows slightly larger functions to be inlined, resulting in a slight expansion in code size. Higher values for the -inline option cause larger and larger functions to become candidates for inlining, but can result in a serious increase in code size.

-intf filename

Compile the file filename as an interface file, even if its extension is not .mli.

-intf-suffix string

Recognize file names ending with string as interface files (instead of the default .mli).

-labels

Labels are not ignored in types, labels may be used in applications, and labelled parameters can be given in any order. This is the default.

-linkall

Force all modules contained in libraries to be linked in. If this flag is not given, unreferenced modules are not linked in. When building a library (-a flag), setting the -linkall flag forces all subsequent links of programs involving that library to link all the modules contained in the library.

-noassert

Do not compile assertion checks. Note that the special form assert false is always compiled because it is typed specially. This flag has no effect when linking already-compiled files.
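
To illustrate the effect, the sketch below compiles the same hypothetical file check.ml twice, with and without -noassert. The executable names checked and unchecked are also hypothetical.

```shell
# Work in a scratch directory; check.ml is a hypothetical example file.
cd "$(mktemp -d)"

cat > check.ml <<'EOF'
let () =
  assert (1 + 1 = 3);
  print_endline "done"
EOF

ocamlopt -o checked check.ml
ocamlopt -noassert -o unchecked check.ml

./checked     # stops with an uncaught Assert_failure exception
./unchecked   # the assertion is not compiled; prints "done"
```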

-noautolink

When linking .cmxa libraries, ignore -cclib and -ccopt options potentially contained in the libraries (if these options were given when building the libraries). This can be useful if a library contains incorrect specifications of C libraries or C options; in this case, during linking, set -noautolink and pass the correct C libraries and options on the command line.

-nodynlink

Allow the compiler to use some optimizations that are valid only for code that is never dynlinked.

-nolabels

Ignore non-optional labels in types. Labels cannot be used in applications, and parameter order becomes strict.

-o exec-file

Specify the name of the output file produced by the linker. The default output name is a.out under Unix and camlprog.exe under Windows. If the -a option is given, specify the name of the library produced. If the -pack option is given, specify the name of the packed object file produced. If the -output-obj option is given, specify the name of the output file produced. If the -shared option is given, specify the name of the plugin file produced.

-output-obj

Cause the linker to produce a C object file instead of an executable file. This is useful to wrap Caml code as a C library, callable from any C program. See chapter 18, section 18.7.5. The name of the output object file is camlprog.o by default; it can be set with the -o option. This option can also be used to produce a compiled shared/dynamic library (.so extension, .dll under Windows).

-p

Generate extra code to write profile information when the program is executed. The profile information can then be examined with the analysis program gprof. (See chapter 17 for more information on profiling.) The -p option must be given both at compile-time and at link-time. Linking object files not compiled with -p is possible, but results in less precise profiling.

Unix: See the Unix manual page for gprof(1) for more information about the profiles.
Full support for gprof is only available for certain platforms (currently: Intel x86/Linux and Alpha/Digital Unix). On other platforms, the -p option will result in a less precise profile (no call graph information, only a time profile).

Windows: The -p option does not work under Windows.

-pack

Build an object file (.cmx and .o/.obj files) and its associated compiled interface (.cmi) that combines the .cmx object files given on the command line, making them appear as sub-modules of the output .cmx file. The name of the output .cmx file must be given with the -o option. For instance,

        ocamlopt -pack -o P.cmx A.cmx B.cmx C.cmx

generates compiled files P.cmx, P.o and P.cmi describing a compilation unit having three sub-modules A, B and C, corresponding to the contents of the object files A.cmx, B.cmx and C.cmx. These contents can be referenced as P.A, P.B and P.C in the remainder of the program.

The .cmx object files being combined must have been compiled with the appropriate -for-pack option. In the example above, A.cmx, B.cmx and C.cmx must have been compiled with ocamlopt -for-pack P.

Multiple levels of packing can be achieved by combining -pack with -for-pack. Consider the following example:

        ocamlopt -for-pack P.Q -c A.ml
        ocamlopt -pack -o Q.cmx -for-pack P A.cmx
        ocamlopt -for-pack P -c B.ml
        ocamlopt -pack -o P.cmx Q.cmx B.cmx

The resulting P.cmx object file has sub-modules P.Q, P.Q.A and P.B.
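
A self-contained sketch of single-level packing, using lowercase file names (a.ml, b.ml, p.cmx) that correspond to the modules A, B and P. All names and file contents are hypothetical.

```shell
# Work in a scratch directory; all file names are hypothetical.
cd "$(mktemp -d)"

cat > a.ml <<'EOF'
let x = 1
EOF
cat > b.ml <<'EOF'
let y = 2
EOF
cat > main.ml <<'EOF'
let () = print_int (P.A.x + P.B.y); print_newline ()
EOF

ocamlopt -for-pack P -c a.ml b.ml     # compile the future sub-modules of P
ocamlopt -pack -o p.cmx a.cmx b.cmx   # writes p.cmx, p.o and p.cmi
ocamlopt -o prog p.cmx main.ml        # main.ml accesses P.A and P.B
./prog                                # prints 3
```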

-pp command

Cause the compiler to call the given command as a preprocessor for each source file. The output of command is redirected to an intermediate file, which is compiled. If there are no compilation errors, the intermediate file is deleted afterwards.
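
As a hedged illustration, any command that takes the source file name as its last argument and writes the transformed source on standard output can serve as a preprocessor. The sketch below uses sed to substitute a placeholder; the file t.ml and the placeholder ANSWER are hypothetical.

```shell
# Work in a scratch directory; t.ml and ANSWER are hypothetical.
cd "$(mktemp -d)"

cat > t.ml <<'EOF'
let () = print_int ANSWER; print_newline ()
EOF

# sed rewrites ANSWER to 42 before the compiler sees the source.
ocamlopt -pp "sed -e s/ANSWER/42/" -o prog t.ml
./prog   # prints 42
```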

-principal

Check information path during type-checking, to make sure that all types are derived in a principal way. All programs accepted in -principal mode are also accepted in default mode with equivalent types, but different binary signatures.

-rectypes

Allow arbitrary recursive types during type-checking. By default, only recursive types where the recursion goes through an object type are supported. Note that once you have created an interface using this flag, you must use it again for all dependencies.

-S

Keep the assembly code produced during the compilation. The assembly code for the source file x.ml is saved in the file x.s.
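
For example, assuming a Unix system and a hypothetical source file x.ml:

```shell
# Work in a scratch directory; x.ml is a hypothetical example file.
cd "$(mktemp -d)"

cat > x.ml <<'EOF'
let add x y = x + y
EOF

ocamlopt -S -c x.ml   # compile only, keeping the generated assembly
ls x.s                # the assembly file sits next to x.cmx and x.o
```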

-shared

Build a plugin (usually .cmxs) that can be dynamically loaded with the Dynlink module. The name of the plugin must be set with the -o option. A plugin can include a number of Caml modules and libraries, and extra native objects (.o, .obj, .a, .lib files). Building native plugins is only supported for some operating systems. Under some systems (currently, only Linux AMD 64), all the Caml code linked in a plugin must have been compiled without the -nodynlink flag. Some constraints might also apply to the way the extra native objects have been compiled (under Linux AMD 64, they must contain only position-independent code).
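
A minimal sketch of building a plugin, assuming a platform where native plugins are supported. The file names plugin.ml and plugin.cmxs are hypothetical. Loading the plugin requires a host program linked with the Dynlink library, which is not shown here.

```shell
# Work in a scratch directory; plugin.ml is a hypothetical example file.
cd "$(mktemp -d)"

cat > plugin.ml <<'EOF'
let () = print_endline "plugin loaded"
EOF

ocamlopt -shared -o plugin.cmxs plugin.ml   # compile and build the plugin in one step
ls plugin.cmxs
```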

-thread

Compile or link multithreaded programs, in combination with the system threads library described in chapter 24.

-unsafe

Turn bound checking off for array and string accesses (the v.(i) and s.[i] constructs). Programs compiled with -unsafe are therefore faster, but unsafe: anything can happen if the program accesses an array or string outside of its bounds. Additionally, turn off the check for zero divisor in integer division and modulus operations. With -unsafe, an integer division (or modulus) by zero can halt the program or continue with an unspecified result instead of raising a Division_by_zero exception.

-v

Print the version number of the compiler and the location of the standard library directory, then exit.

-verbose

Print all external commands before they are executed, in particular invocations of the assembler, C compiler, and linker.

-vnum or -version

Print the version number of the compiler in short form (e.g. 3.11.0), then exit.

-w warning-list

Enable, disable, or mark as errors the warnings specified by the argument warning-list. Each warning can be enabled or disabled, and each warning can be marked or unmarked. If a warning is disabled, it isn’t displayed and doesn’t affect compilation in any way (even if it is marked). If a warning is enabled, it is displayed normally by the compiler whenever the source code triggers it. If it is enabled and marked, the compiler will stop with an error after displaying that warning if the source code triggers it.

The warning-list argument is a sequence of warning specifiers, with no separators between them. A warning specifier is one of the following:

+num: Enable warning number num.
-num: Disable warning number num.
@num: Enable and mark warning number num.
+num1..num2: Enable warnings in the given range.
-num1..num2: Disable warnings in the given range.
@num1..num2: Enable and mark warnings in the given range.
+letter: Enable the set of warnings corresponding to letter. The letter may be uppercase or lowercase.
-letter: Disable the set of warnings corresponding to letter. The letter may be uppercase or lowercase.
@letter: Enable and mark the set of warnings corresponding to letter. The letter may be uppercase or lowercase.
uppercase-letter: Enable the set of warnings corresponding to uppercase-letter.
lowercase-letter: Disable the set of warnings corresponding to lowercase-letter.

Warning numbers which are out of the range of warnings that are currently defined are ignored. The warning numbers are as follows.

1: Suspicious-looking start-of-comment mark.
2: Suspicious-looking end-of-comment mark.
3: Deprecated syntax.
4: Fragile pattern matching: matching that will remain complete even if additional constructors are added to one of the variant types matched.
5: Partially applied function: expression whose result has function type and is ignored.
6: Label omitted in function application.
7: Some methods are overridden in the class where they are defined.
8: Partial match: missing cases in pattern-matching.
9: Missing fields in a record pattern.
10: Expression on the left-hand side of a sequence that doesn’t have type "unit" (and that is not a function, see warning number 5).
11: Redundant case in a pattern matching (unused match case).
12: Redundant sub-pattern in a pattern-matching.
13: Override of an instance variable.
14: Illegal backslash escape in a string constant.
15: Private method made public implicitly.
16: Unerasable optional argument.
17: Undeclared virtual method.
18: Non-principal type.
19: Type without principality.
20: Unused function argument.
21: Non-returning statement.
22: Camlp4 warning.
23: Useless record "with" clause.
24: Bad module name: the source file name is not a valid OCaml module name.
25: Pattern-matching with all clauses guarded. Exhaustiveness cannot be checked.
26: Suspicious unused variable: unused variable that is bound with "let" or "as", and doesn’t start with an underscore ("_") character.
27: Innocuous unused variable: unused variable that is not bound with "let" nor "as", and doesn’t start with an underscore ("_") character.
28: Wildcard pattern given as argument to a constant constructor.
29: Unescaped end-of-line in a string constant (non-portable code).
30: Two labels or constructors of the same name are defined in two mutually recursive types.

The letters stand for the following sets of warnings. Any letter not mentioned here corresponds to the empty set.

A: all warnings
C: 1, 2
D: 3
E: 4
F: 5
L: 6
M: 7
P: 8
R: 9
S: 10
U: 11, 12
V: 13
X: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
Y: 26
Z: 27

The default setting is -w +a-4-6-7-9-27..29. Note that warnings 5 and 10 are not always triggered, depending on the internals of the type checker.
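
As an illustration of warning specifiers, the sketch below compiles a hypothetical file w.ml that binds an unused variable with let (warning 26), first with that warning enabled and then with it disabled. The exact wording of the message printed by the compiler may vary between versions.

```shell
# Work in a scratch directory; w.ml is a hypothetical example file.
cd "$(mktemp -d)"

cat > w.ml <<'EOF'
let f x =
  let y = x + 1 in
  x
EOF

ocamlopt -w +26 -c w.ml   # reports that variable y is unused
ocamlopt -w -26 -c w.ml   # same file, warning 26 disabled
```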

-warn-error warning-list

Mark as errors the warnings specified in the argument warning-list. The compiler will stop with an error when one of these warnings is emitted. The warning-list has the same meaning as for the -w option: a + sign (or an uppercase letter) turns the corresponding warnings into errors, a - sign (or a lowercase letter) turns them back into warnings, and a @ sign both enables and marks the corresponding warnings.

Note: it is not recommended to use warning sets (i.e. letters) as arguments to -warn-error in production code, because this can break your build when future versions of OCaml add some new warnings.

The default setting is -warn-error -a (none of the warnings is treated as an error).
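
A short sketch of the difference between a plain warning and a warning marked as an error, using a single warning number rather than a letter, in line with the note above. The file partial.ml is hypothetical; it contains a non-exhaustive match, which triggers warning 8.

```shell
# Work in a scratch directory; partial.ml is a hypothetical example file.
cd "$(mktemp -d)"

cat > partial.ml <<'EOF'
let get = function Some x -> x
EOF

ocamlopt -w +8 -c partial.ml                  # warning 8 is displayed; compilation succeeds
ocamlopt -w +8 -warn-error +8 -c partial.ml   # warning 8 is now an error; compilation stops
```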

-where

Print the location of the standard library, then exit.

- file

Process file as a file name, even if it starts with a dash (-) character.

-help or --help

Display a short usage summary and exit.

Options for the IA32 architecture

The IA32 code generator (Intel Pentium, AMD Athlon) supports the following additional option:

-ffast-math: Use the IA32 instructions to compute trigonometric and exponential functions, instead of calling the corresponding library routines. The functions affected are: atan, atan2, cos, log, log10, sin, sqrt and tan. The resulting code runs faster, but the range of supported arguments and the precision of the result can be reduced. In particular, trigonometric operations cos, sin, tan have their range reduced to [−2⁶⁴, 2⁶⁴].

Options for the AMD64 architecture

The AMD64 code generator (64-bit versions of Intel Pentium and AMD Athlon) supports the following additional options:

-fPIC: Generate position-independent machine code. This is the default.
-fno-PIC: Generate position-dependent machine code.

Options for the Sparc architecture

The Sparc code generator supports the following additional options:

-march=v8: Generate SPARC version 8 code.
-march=v9: Generate SPARC version 9 code.

The default is to generate code for SPARC version 7, which runs on all SPARC processors.

11.3 Common errors

The error messages are almost identical to those of ocamlc. See section 8.4.

11.4 Running executables produced by ocamlopt

Executables generated by ocamlopt are native, stand-alone executable files that can be invoked directly. They do not depend on the ocamlrun bytecode runtime system nor on dynamically-loaded C/Caml stub libraries.

During execution of an ocamlopt-generated executable, the following environment variables are also consulted:

OCAMLRUNPARAM: Same usage as in ocamlrun (see section 10.2), except that option l is ignored (the operating system’s stack size limit is used instead).
CAMLRUNPARAM: If OCAMLRUNPARAM is not found in the environment, then CAMLRUNPARAM will be used instead. If CAMLRUNPARAM is not found, then the default values will be used.
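
For example, combining the -g option with the b flag of OCAMLRUNPARAM should print a stack backtrace when a native executable dies on an uncaught exception. This is a sketch under the assumption that the platform supports native-code backtraces; the file name fail.ml and the executable name fail are hypothetical.

```shell
# Work in a scratch directory; fail.ml is a hypothetical example file.
cd "$(mktemp -d)"

cat > fail.ml <<'EOF'
let () = failwith "boom"
EOF

ocamlopt -g -o fail fail.ml   # -g records the debugging information
OCAMLRUNPARAM=b ./fail        # uncaught Failure "boom"; backtrace on standard error
```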

11.5 Compatibility with the bytecode compiler

This section lists the known incompatibilities between the bytecode compiler and the native-code compiler. Except on those points, the two compilers should generate code that behaves identically.

Signals are detected only when the program performs an allocation in the heap. That is, if a signal is delivered while in a piece of code that does not allocate, its handler will not be called until the next heap allocation.
Stack overflow, typically caused by excessively deep recursion, is handled in one of the following ways, depending on the platform used:
- By raising a Stack_overflow exception, like the bytecode compiler does. (IA32/Linux, AMD64/Linux, PowerPC/MacOSX, MS Windows 32-bit ports).
- By aborting the program on a “segmentation fault” signal. (All other Unix systems.)
- By terminating the program silently. (MS Windows 64 bits).
On IA32 processors only (Intel Pentium, AMD Athlon, etc., in 32-bit mode), some intermediate results in floating-point computations are kept in extended precision rather than being rounded to double precision like the bytecode compiler always does. Floating-point results can therefore differ between bytecode and native code; in general, the results obtained with native code are “more exact” (less affected by rounding errors and loss of precision).
On the Alpha processor only, floating-point operations involving infinite or denormalized numbers can abort the program on a “floating-point exception” signal.