Chapter 8 Batch compilation (ocamlc)
This chapter describes the OCaml batch compiler ocamlc,
which compiles Caml source files to bytecode object files and links
these object files to produce standalone bytecode executable files.
These executable files are then run by the bytecode interpreter
ocamlrun.
8.1 Overview of the compiler
The ocamlc command has a command-line interface similar to that of
most C compilers. It accepts several types of arguments and processes them
sequentially:
- Arguments ending in .mli are taken to be source files for
compilation unit interfaces. Interfaces specify the names exported by
compilation units: they declare value names with their types, define
public data types, declare abstract data types, and so on. From the
file x.mli, the ocamlc compiler produces a compiled interface
in the file x.cmi.
- Arguments ending in .ml are taken to be source files for compilation
unit implementations. Implementations provide definitions for the
names exported by the unit, and also contain expressions to be
evaluated for their side-effects. From the file x.ml, the ocamlc
compiler produces compiled object bytecode in the file x.cmo.
If the interface file x.mli exists, the implementation
x.ml is checked against the corresponding compiled interface
x.cmi, which is assumed to exist. If no interface
x.mli is provided, the compilation of x.ml produces a
compiled interface file x.cmi in addition to the compiled
object code file x.cmo. The file x.cmi produced
corresponds to an interface that exports everything that is defined in
the implementation x.ml.
- Arguments ending in .cmo are taken to be compiled object bytecode. These
files are linked together, along with the object files obtained
by compiling .ml arguments (if any), and the OCaml standard
library, to produce a standalone executable program. The order in
which .cmo and .ml arguments are presented on the command line is
relevant: compilation units are initialized in that order at
run-time, and it is a link-time error to use a component of a unit
before having initialized it. Hence, a given x.cmo file must come
before all .cmo files that refer to the unit x.
- Arguments ending in .cma are taken to be libraries of object bytecode.
A library of object bytecode packs in a single file a set of object
bytecode files (.cmo files). Libraries are built with ocamlc -a
(see the description of the -a option below). The object files
contained in the library are linked as regular .cmo files (see
above), in the order specified when the .cma file was built. The
only difference is that if an object file contained in a library is
not referenced anywhere in the program, then it is not linked in.
- Arguments ending in .c are passed to the C compiler, which generates
a .o object file (.obj under Windows). This object file is linked
with the program if the -custom flag is set (see the description of
-custom below).
- Arguments ending in .o or .a (.obj or .lib under Windows)
are assumed to be C object files and libraries. They are passed to the
C linker when linking in -custom mode (see the description of
-custom below).
- Arguments ending in .so (.dll under Windows)
are assumed to be C shared libraries (DLLs). During linking, they are
searched for external C functions referenced from the Caml code,
and their names are written in the generated bytecode executable.
The run-time system ocamlrun then loads them dynamically at program
start-up time.
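The remarks above about side-effecting expressions and initialization order can be illustrated with a small single-file sketch. The submodules A and B are hypothetical stand-ins for two compilation units a.ml and b.ml, and init_log is a name invented for this illustration; the sketch only mimics, inside one file, the order in which linked units would be initialized.

```ocaml
(* Hypothetical stand-ins for two compilation units, a.ml and b.ml,
   written as submodules so that the sketch fits in one file. *)
let init_log : string list ref = ref []

module A = struct
  (* Top-level expression evaluated for its side effect at initialization. *)
  let () = init_log := "A" :: !init_log
  let base = 10
end

module B = struct
  let () = init_log := "B" :: !init_log
  (* B refers to A, so A must already be initialized at this point. *)
  let derived = A.base + 1
end

(* Units were initialized in the order in which they appear. *)
let initialization_order = List.rev !init_log
```

With real separate files, the same dependency would mean that a.cmo has to appear before b.cmo on the ocamlc command line.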
The output of the linking phase is a file containing compiled bytecode
that can be executed by the OCaml bytecode interpreter:
the command named ocamlrun. If caml.out is the name of the file
produced by the linking phase, the command
ocamlrun caml.out arg1 arg2 … argn
executes the compiled code contained in caml.out, passing it as
arguments the character strings arg1 to argn.
(See chapter 10 for more details.)
On most systems, the file produced by the linking
phase can be run directly, as in:
./caml.out arg1 arg2 … argn
The produced file has the executable bit set, and it manages to launch
the bytecode interpreter by itself.
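As a minimal sketch of how a program might read the strings arg1 to argn, the following uses Sys.argv from the standard library. The function name user_args is an assumption made for this example; element 0 of Sys.argv holds the command name, so it is skipped.

```ocaml
(* Drop element 0 of the argument vector (the command name) and keep
   the user-supplied arguments. *)
let user_args (argv : string array) : string list =
  match Array.to_list argv with
  | [] -> []
  | _ :: rest -> rest

(* Print each argument on its own line. *)
let () = List.iter print_endline (user_args Sys.argv)
```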
8.2 Options
The following command-line options are recognized by ocamlc.
The options -pack, -a, -c and -output-obj are mutually exclusive.
- -a
-
Build a library (.cma file) with the object files (.cmo files)
given on the command line, instead of linking them into an executable
file. The name of the library must be set with the -o option.
If -custom, -cclib or -ccopt options are passed on the command
line, these options are stored in the resulting .cma library. Then,
linking with this library automatically adds back the -custom,
-cclib and -ccopt options as if they had been provided on the
command line, unless the -noautolink option is given.
- -annot
-
Dump detailed information about the compilation (types, bindings,
tail-calls, etc). The information for file src.ml
is put into file src.annot. In case of a type error, dump
all the information inferred by the type-checker before the error.
The src.annot file can be used with the Emacs commands given in
emacs/caml-types.el to display types and other annotations
interactively.
- -c
-
Compile only. Suppress the linking phase of the
compilation. Source code files are turned into compiled files, but no
executable file is produced. This option is useful to
compile modules separately.
- -cc ccomp
-
Use ccomp as the C linker when linking in “custom runtime”
mode (see the -custom option)
and as the C compiler for compiling .c source files.
- -cclib -llibname
-
Pass the -llibname option to the C linker when linking in
“custom runtime” mode (see the -custom option). This causes the
given C library to be linked with the program.
- -ccopt option
-
Pass the given option to the C compiler and linker. When linking in
“custom runtime” mode, for instance,
-ccopt -Ldir causes the C linker to search for C libraries in
directory dir. (See the -custom option.)
- -config
-
Print the version number of ocamlc and a detailed summary of its
configuration, then exit.
- -custom
-
Link in “custom runtime” mode. In the default linking mode, the
linker produces bytecode that is intended to be executed with the
shared runtime system, ocamlrun. In the custom runtime mode, the
linker produces an output file that contains both the runtime system
and the bytecode for the program. The resulting file is larger, but it
can be executed directly, even if the ocamlrun command is not
installed. Moreover, the “custom runtime” mode enables static
linking of Caml code with user-defined C functions, as described in
chapter 18.
Unix:
Never use the strip command on executables produced by ocamlc -custom:
this would remove the bytecode part of the executable.
- -dllib -llibname
-
Arrange for the C shared library dlllibname.so
(dlllibname.dll under Windows) to be loaded dynamically
by the run-time system ocamlrun at program start-up time.
- -dllpath dir
-
Add the directory dir to the run-time search path for shared
C libraries. At link-time, shared libraries are searched in the
standard search path (the one corresponding to the -I option).
The -dllpath option simply stores dir in the produced
executable file, where ocamlrun can find it and use it as
described in section 10.3.
- -g
-
Add debugging information while compiling and linking. This option is
required in order to be able to debug the program with ocamldebug
(see chapter 16), and to produce stack backtraces when
the program terminates on an uncaught exception (see
section 10.2).
- -help-warnings
-
Show the description of all available warning numbers.
- -i
-
Cause the compiler to print all defined names (with their inferred
types or their definitions) when compiling an implementation (.ml
file). No compiled files (.cmo and .cmi files) are produced.
This can be useful to check the types inferred by the
compiler. Also, since the output follows the syntax of interfaces, it
can help in writing an explicit interface (.mli file) for a file:
just redirect the standard output of the compiler to a .mli file,
and edit that file to remove all declarations of unexported names.
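To make the -i workflow concrete, here is a small implementation that could live in a hypothetical file shapes.ml (the file name and all identifiers are invented for this sketch). The trailing comment shows what ocamlc -i should print for it, in interface syntax.

```ocaml
(* Hypothetical contents of a file shapes.ml. *)
type shape = Circle of float | Rect of float * float

(* A helper that we might not want to export. *)
let square x = x *. x

let area = function
  | Circle r -> 3.14159 *. square r
  | Rect (w, h) -> w *. h

(* Running "ocamlc -i shapes.ml" should print:

     type shape = Circle of float | Rect of float * float
     val square : float -> float
     val area : shape -> float

   Redirecting this output to shapes.mli and deleting the line for
   square would give an interface that hides the helper. *)
```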
- -I directory
-
Add the given directory to the list of directories searched for
compiled interface files (.cmi), compiled object code files
(.cmo), libraries (.cma), and C libraries specified with
-cclib -lxxx. By default, the current directory is
searched first, then the standard library directory. Directories added
with -I are searched after the current directory, in the order in
which they were given on the command line, but before the standard
library directory.
If the given directory starts with +, it is taken relative to the
standard library directory. For instance, -I +labltk adds the
subdirectory labltk of the standard library to the search path.
- -impl filename
-
Compile the file filename as an implementation file, even if its
extension is not .ml.
- -intf filename
-
Compile the file filename as an interface file, even if its
extension is not .mli.
- -intf-suffix string
-
Recognize file names ending with string as interface files
(instead of the default .mli).
- -labels
-
Labels are not ignored in types, labels may be used in applications,
and labelled parameters can be given in any order. This is the default.
- -linkall
-
Force all modules contained in libraries to be linked in. If this
flag is not given, unreferenced modules are not linked in. When
building a library (option -a), setting the -linkall option forces all
subsequent links of programs involving that library to link all the
modules contained in the library.
- -make-runtime
-
Build a custom runtime system (in the file specified by option -o)
incorporating the C object files and libraries given on the command
line. This custom runtime system can be used later to execute
bytecode executables produced with the
ocamlc -use-runtime runtime-name option.
See section 18.1.6 for more information.
- -noassert
-
Do not compile assertion checks. Note that the special form
assert false is always compiled because it is typed specially.
This flag has no effect when linking already-compiled files.
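The difference between ordinary assertions and assert false can be sketched as follows. The function names are assumptions made for this example; the behaviour described in the comments is the default one, without -noassert.

```ocaml
(* An ordinary assertion: checked by default, not compiled under -noassert. *)
let checked_div a b =
  assert (b <> 0);
  a / b

(* assert false is typed specially (it can take any type) and is
   always compiled, so it raises Assert_failure even under -noassert. *)
let unreachable () : 'a = assert false

(* Evaluates to true: the exception raised by assert false is caught. *)
let assert_false_raises =
  try unreachable () with Assert_failure _ -> true
```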
- -noautolink
-
When linking .cma libraries, ignore -custom, -cclib and -ccopt
options potentially contained in the libraries (if these options were
given when building the libraries). This can be useful if a library
contains incorrect specifications of C libraries or C options; in this
case, during linking, set -noautolink and pass the correct C
libraries and options on the command line.
- -nolabels
-
Ignore non-optional labels in types. Labels cannot be used in
applications, and parameter order becomes strict.
- -o exec-file
-
Specify the name of the output file produced by the compiler. The
default output name is a.out under Unix and camlprog.exe under
Windows. If the -a option is given, specify the name of the library
produced. If the -pack option is given, specify the name of the
packed object file produced. If the -output-obj option is given,
specify the name of the output file produced. If the -c option is
given, specify the name of the object file produced for the next
source file that appears on the command line.
- -output-obj
-
Cause the linker to produce a C object file instead of a bytecode
executable file. This is useful to wrap Caml code as a C library,
callable from any C program. See chapter 18,
section 18.7.5. The name of the output object file is
camlprog.o by default; it can be set with the -o option. This
option can also be used to produce a C source file (.c extension) or
a compiled shared/dynamic library (.so extension, .dll under Windows).
- -pack
-
Build a bytecode object file (.cmo file) and its associated compiled
interface (.cmi) that combines the object
files given on the command line, making them appear as sub-modules of
the output .cmo file. The name of the output .cmo file must be
given with the -o option. For instance,
ocamlc -pack -o p.cmo a.cmo b.cmo c.cmo
generates compiled files p.cmo and p.cmi describing a compilation
unit having three sub-modules A, B and C, corresponding to the
contents of the object files a.cmo, b.cmo and c.cmo. These
contents can be referenced as P.A, P.B and P.C in the remainder
of the program.
- -pp command
-
Cause the compiler to call the given command as a preprocessor
for each source file. The output of command is redirected to
an intermediate file, which is compiled. If there are no compilation
errors, the intermediate file is deleted afterwards.
- -principal
-
Check information path during type-checking, to make sure that all
types are derived in a principal way. When using labelled arguments
and/or polymorphic methods, this flag is required to ensure future
versions of the compiler will be able to infer types correctly, even
if internal algorithms change.
All programs accepted in -principal mode are also accepted in the
default mode with equivalent types, but different binary signatures.
Using this flag may slow down type checking; it is nonetheless a good
idea to use it once before publishing source code.
- -rectypes
-
Allow arbitrary recursive types during type-checking. By default,
only recursive types where the recursion goes through an object type
are supported. Note that once you have created an interface using this
flag, you must use it again for all dependencies.
- -runtime-variant suffix
-
Add the suffix string to the name of the runtime library used by
the program. Currently, only one such suffix is supported: d, and
only if the OCaml compiler was configured with option
-with-debug-runtime. This suffix gives the debug version of the
runtime, which is useful for debugging pointer problems in low-level
code such as C stubs.
- -thread
-
Compile or link multithreaded programs, in combination with the
system threads library described in chapter 24.
- -unsafe
-
Turn bounds checking off for array and string accesses (the v.(i) and
s.[i] constructs). Programs compiled with -unsafe are therefore
slightly faster, but unsafe: anything can happen if the program
accesses an array or string outside of its bounds.
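The following sketch contrasts code that relies on the compiler's bounds checks with code that tests the index itself. Both function names are hypothetical. The first function behaves as described only when bounds checking is on (the default); the second does not depend on the compiler's checks.

```ocaml
(* With bounds checking on (the default), an out-of-range access raises
   Invalid_argument, which can be caught. Under -unsafe this handler
   can no longer be relied upon. *)
let get_opt (v : int array) (i : int) : int option =
  try Some v.(i) with Invalid_argument _ -> None

(* An explicit index test does not rely on the compiler's checks. *)
let get_checked (v : int array) (i : int) : int option =
  if i >= 0 && i < Array.length v then Some v.(i) else None
```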
- -use-runtime runtime-name
-
Generate a bytecode executable file that can be executed on the custom
runtime system runtime-name, built earlier with
ocamlc -make-runtime runtime-name.
See section 18.1.6 for more information.
- -v
-
Print the version number of the compiler and the location of the
standard library directory, then exit.
- -verbose
-
Print all external commands before they are executed, in particular
invocations of the C compiler and linker in -custom mode. Useful to
debug C library problems.
- -vnum or -version
-
Print the version number of the compiler in short form (e.g. 3.11.0),
then exit.
- -vmthread
-
Compile or link multithreaded programs, in combination with the
VM-level threads library described in chapter 24.
- -w warning-list
-
Enable, disable, or mark as errors the warnings specified by the argument
warning-list.
Each warning can be enabled or disabled, and each warning
can be marked or unmarked.
If a warning is disabled, it isn’t displayed and doesn’t affect
compilation in any way (even if it is marked). If a warning is
enabled, it is displayed normally by the compiler whenever the source
code triggers it. If it is enabled and marked, the compiler will stop
with an error after displaying that warning if the source code
triggers it.
The warning-list argument is a sequence of warning specifiers,
with no separators between them. A warning specifier is one of the
following:
- +num
- Enable warning number num.
- -num
- Disable warning number num.
- @num
- Enable and mark warning number num.
- +num1..num2
- Enable warnings in the given range.
- -num1..num2
- Disable warnings in the given range.
- @num1..num2
- Enable and mark warnings in the given range.
- +letter
- Enable the set of warnings corresponding to
letter. The letter may be uppercase or lowercase.
- -letter
- Disable the set of warnings corresponding to
letter. The letter may be uppercase or lowercase.
- @letter
- Enable and mark the set of warnings
corresponding to letter. The letter may be uppercase or
lowercase.
- uppercase-letter
- Enable the set of warnings corresponding
to uppercase-letter.
- lowercase-letter
- Disable the set of warnings corresponding
to lowercase-letter.
Warning numbers which are out of the range of warnings that are currently
defined are ignored. The warning numbers are as follows.
- 1
- Suspicious-looking start-of-comment mark.
- 2
- Suspicious-looking end-of-comment mark.
- 3
- Deprecated syntax.
- 4
- Fragile pattern matching: matching that will remain complete even
if additional constructors are added to one of the variant types
matched.
- 5
- Partially applied function: expression whose result has function
type and is ignored.
- 6
- Label omitted in function application.
- 7
- Some methods are overridden in the class where they are defined.
- 8
- Partial match: missing cases in pattern-matching.
- 9
- Missing fields in a record pattern.
- 10
- Expression on the left-hand side of a sequence that doesn’t have type
"unit" (and that is not a function, see warning number 5).
- 11
- Redundant case in a pattern matching (unused match case).
- 12
- Redundant sub-pattern in a pattern-matching.
- 13
- Override of an instance variable.
- 14
- Illegal backslash escape in a string constant.
- 15
- Private method made public implicitly.
- 16
- Unerasable optional argument.
- 17
- Undeclared virtual method.
- 18
- Non-principal type.
- 19
- Type without principality.
- 20
- Unused function argument.
- 21
- Non-returning statement.
- 22
- Camlp4 warning.
- 23
- Useless record "with" clause.
- 24
- Bad module name: the source file name is not a valid OCaml module name.
- 25
- Pattern-matching with all clauses guarded. Exhaustiveness cannot be
checked.
- 26
- Suspicious unused variable: unused variable that is bound with "let"
or "as", and doesn’t start with an underscore ("_") character.
- 27
- Innocuous unused variable: unused variable that is not bound with
"let" nor "as", and doesn’t start with an underscore ("_")
character.
- 28
- Wildcard pattern given as argument to a constant constructor.
- 29
- Unescaped end-of-line in a string constant (non-portable code).
- 30
- Two labels or constructors of the same name are defined in two
mutually recursive types.
The letters stand for the following sets of warnings. Any letter not
mentioned here corresponds to the empty set.
- A
- all warnings
- C
- 1, 2
- D
- 3
- E
- 4
- F
- 5
- L
- 6
- M
- 7
- P
- 8
- R
- 9
- S
- 10
- U
- 11, 12
- V
- 13
- X
- 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
- Y
- 26
- Z
- 27
The default setting is -w +a-4-6-7-9-27..29.
Note that warnings 5 and 10 are not always triggered, depending on
the internals of the type checker.
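To relate a few of the warning numbers above to source code, here is a short sketch. The type and function names are invented for the example. The code is written so that it compiles without warnings under the default setting; the comments indicate which small changes would trigger which warning.

```ocaml
type color = Red | Green | Blue

let to_string = function
  | Red -> "red"
  | Green -> "green"
  | Blue -> "blue"
  (* Deleting the Blue case would trigger warning 8 (partial match).
     Adding a second "| Red -> ..." case here would trigger warning 11
     (redundant case). *)

let is_red = function
  | Red -> true
  | _ -> false   (* wildcard over a variant: reported only if warning 4 is enabled *)

let increment x =
  let _unused = x * 2 in   (* leading underscore: exempt from warning 26 *)
  x + 1
```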
- -warn-error warning-list
-
Mark as errors the warnings specified in the argument warning-list.
The compiler will stop with an error when one of these warnings is
emitted. The warning-list has the same meaning as for
the -w option: a + sign (or an uppercase letter) turns the
corresponding warnings into errors, a -
sign (or a lowercase letter) turns them back into warnings, and a
@ sign both enables and marks the corresponding warnings.
Note: it is not recommended to use warning sets (i.e. letters) as
arguments to -warn-error
in production code, because this can break your build when future versions
of OCaml add some new warnings.
The default setting is -warn-error -a
(none of the warnings is treated as an error).
- -where
-
Print the location of the standard library, then exit.
- - file
-
Process file as a file name, even if it starts with a dash (-)
character.
- -help or --help
-
Display a short usage summary and exit.
8.3 Modules and the file system
This short section is intended to clarify the relationship between the
names of the modules corresponding to compilation units and the names
of the files that contain their compiled interface and compiled
implementation.
The compiler always derives the module name by taking the capitalized
base name of the source file (.ml or .mli file). That is, it
strips the leading directory name, if any, as well as the .ml or
.mli suffix; then, it sets the first letter to uppercase, in order to
comply with the requirement that module names must be capitalized.
For instance, compiling the file mylib/misc.ml provides an
implementation for the module named Misc. Other compilation units
may refer to components defined in mylib/misc.ml under the names
Misc.name; they can also do open Misc, then use unqualified
names name.
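The naming rule can be mimicked with standard library functions. This is only an illustrative sketch, not the compiler's own code: the function name module_name_of_source is hypothetical, Filename.remove_extension strips any extension (not just .ml or .mli), and both it and String.capitalize_ascii assume a reasonably recent standard library.

```ocaml
(* Approximate the module name the compiler derives from a source path. *)
let module_name_of_source (path : string) : string =
  path
  |> Filename.basename          (* strip the leading directory name *)
  |> Filename.remove_extension  (* strip the suffix, e.g. .ml or .mli *)
  |> String.capitalize_ascii    (* set the first letter to uppercase *)
```

For the path mylib/misc.ml used above, this yields the string Misc.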
The .cmi and .cmo files produced by the compiler have the same
base name as the source file. Hence, the compiled files always have
their base name equal (modulo capitalization of the first letter) to
the name of the module they describe (for .cmi files) or implement
(for .cmo files).
When the compiler encounters a reference to a free module identifier
Mod, it looks in the search path for a file named Mod.cmi or mod.cmi
and loads the compiled interface
contained in that file. As a consequence, renaming .cmi files is not
advised: the name of a .cmi file must always correspond to the name
of the compilation unit it implements. It is admissible to move them
to another directory, if their base name is preserved, and the correct
-I options are given to the compiler. The compiler will flag an
error if it loads a .cmi file that has been renamed.
Compiled bytecode files (.cmo files), on the other hand, can be
freely renamed once created. That’s because the linker never attempts
to find by itself the .cmo file that implements a module with a
given name: it relies instead on the user providing the list of .cmo
files by hand.
8.4 Common errors
This section describes and explains the most frequently encountered
error messages.
- Cannot find file filename
-
The named file could not be found in the current directory, nor in the
directories of the search path. The filename is either a
compiled interface file (.cmi file), or a compiled bytecode file
(.cmo file). If filename has the format mod.cmi, this
means you are trying to compile a file that references identifiers
from module mod, but you have not yet compiled an interface for
module mod. Fix: compile mod.mli or mod.ml
first, to create the compiled interface mod.cmi.
If filename has the format mod.cmo, this
means you are trying to link a bytecode object file that does not
exist yet. Fix: compile mod.ml first.
If your program spans several directories, this error can also appear
because you haven’t specified the directories to look into. Fix: add
the correct -I options to the command line.
- Corrupted compiled interface filename
-
The compiler produces this error when it tries to read a compiled
interface file (.cmi file) that has the wrong structure. This means
something went wrong when this .cmi file was written: the disk was
full, the compiler was interrupted in the middle of the file creation,
and so on. This error can also appear if a .cmi file is modified after
its creation by the compiler. Fix: remove the corrupted .cmi file,
and rebuild it.
- This expression has type t1, but is used with type t2
-
This is by far the most common type error in programs. Type t1 is
the type inferred for the expression (the part of the program that is
displayed in the error message), by looking at the expression itself.
Type t2 is the type expected by the context of the expression; it
is deduced by looking at how the value of this expression is used in
the rest of the program. If the two types t1 and t2 are not
compatible, then the error above is produced.
In some cases, it is hard to understand why the two types t1 and
t2 are incompatible. For instance, the compiler can report that
“expression of type foo cannot be used with type foo”, and it
really seems that the two types foo are compatible. This is not
always true. Two type constructors can have the same name, but
actually represent different types. This can happen if a type
constructor is redefined. Example:
type foo = A | B
let f = function A -> 0 | B -> 1
type foo = C | D
f C
This results in the error message “expression C of type foo cannot
be used with type foo”.
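One possible way to keep two same-named types apart is to place each definition in its own module, so that the qualified names make the distinction visible. The module names First and Second and the function g are assumptions made for this sketch.

```ocaml
module First = struct
  type foo = A | B
  let f = function A -> 0 | B -> 1
end

module Second = struct
  type foo = C | D
  let g = function C -> 2 | D -> 3
end

(* First.f Second.C would be rejected: Second.foo and First.foo are
   different types even though both are called foo. *)
let results = (First.f First.A, Second.g Second.C)
```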
- The type of this expression, t, contains type variables
that cannot be generalized
-
Type variables (’a, ’b, …) in a type t can be in either
of two states: generalized (which means that the type t is valid
for all possible instantiations of the variables) and not generalized
(which means that the type t is valid only for one instantiation
of the variables). In a let binding let name = expr,
the type-checker normally generalizes as many type variables as
possible in the type of expr. However, this leads to unsoundness
(a well-typed program can crash) in conjunction with polymorphic
mutable data structures. To avoid this, generalization is performed at
let bindings only if the bound expression expr belongs to the
class of “syntactic values”, which includes constants, identifiers,
functions, tuples of syntactic values, etc. In all other cases (for
instance, expr is a function application), a polymorphic mutable
data structure could have been created and generalization is therefore
turned off for
all variables occurring in contravariant or non-variant branches of the
type. For instance, if the type of a non-value is ’a list the
variable is generalizable (list is a covariant type constructor),
but not in ’a list -> ’a list (the left branch of -> is
contravariant) or ’a ref (ref is non-variant).
Non-generalized type variables in a type cause no difficulties inside
a given structure or compilation unit (the contents of a .ml file,
or an interactive session), but they cannot be allowed inside
signatures nor in compiled interfaces (.cmi file), because they
could be used inconsistently later. Therefore, the compiler
flags an error when a structure or compilation unit defines a value
name whose type contains non-generalized type variables. There
are two ways to fix this error:
- Add a type constraint or a .mli file to give a monomorphic
type (without type variables) to name. For instance, instead of
writing
let sort_int_list = Sort.list (<)
(* inferred type 'a list -> 'a list, with 'a not generalized *)
write
let sort_int_list = (Sort.list (<) : int list -> int list);;
- If you really need name to have a polymorphic type, turn
its defining expression into a function by adding an extra parameter.
For instance, instead of writing
let map_length = List.map Array.length
(* inferred type 'a array list -> int list, with 'a not generalized *)
write
let map_length lv = List.map Array.length lv
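Both fixes can be checked in a small self-contained sketch. Note one substitution: the monomorphic example here uses List.sort compare from the current standard library rather than Sort.list, which is an assumption about what is available, not the manual's original call.

```ocaml
(* Fix 2, eta-expansion: the right-hand side is now a function, hence a
   syntactic value, so 'a is generalized. *)
let map_length lv = List.map Array.length lv

(* Because map_length is polymorphic, it can be used at two different
   element types. *)
let int_lengths = map_length [ [| 1; 2; 3 |]; [||] ]
let string_lengths = map_length [ [| "a" |]; [| "b"; "c" |] ]

(* Fix 1, a type constraint: the partial application is given a
   monomorphic type, so no type variable is left ungeneralized. *)
let sort_int_list : int list -> int list = List.sort compare
```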
- Reference to undefined global mod
-
This error appears when trying to link an incomplete or incorrectly
ordered set of files. Either you have forgotten to provide an
implementation for the compilation unit named mod on the command line
(typically, the file named mod.cmo, or a library containing
that file). Fix: add the missing .ml or .cmo file to the command
line. Or, you have provided an implementation for the module named
mod, but it comes too late on the command line: the
implementation of mod must come before all bytecode object files
that reference mod. Fix: change the order of .ml and .cmo
files on the command line.
Of course, you will always encounter this error if you have mutually
recursive functions across modules. That is, function Mod1.f calls
function Mod2.g, and function Mod2.g calls function Mod1.f.
In this case, no matter what permutations you perform on the command
line, the program will be rejected at link-time. Fixes:
- Put f and g in the same module.
- Parameterize one function by the other.
That is, instead of having
mod1.ml: let f x = ... Mod2.g ...
mod2.ml: let g y = ... Mod1.f ...
define
mod1.ml: let f g x = ... g ...
mod2.ml: let rec g y = ... Mod1.f g ...
and link mod1.cmo before mod2.cmo.
- Use a reference to hold one of the two functions, as in:
mod1.ml: let forward_g =
           ref((fun x -> failwith "forward_g") : <type>)
         let f x = ... !forward_g ...
mod2.ml: let g y = ... Mod1.f ...
         let _ = Mod1.forward_g := g
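The last two fixes can be tried out in a single file. In this sketch the submodules P1/P2 and R1/R2 are hypothetical stand-ins for mod1.ml and mod2.ml, and the even/odd functions are a toy example chosen only because they are mutually recursive; in a real program each submodule body would be a separate file, with the first linked before the second.

```ocaml
(* Parameterizing one function by the other. *)
module P1 = struct
  (* "is even", receiving "is odd" as the parameter g *)
  let f g x = if x = 0 then true else g (x - 1)
end

module P2 = struct
  (* "is odd", passing itself to P1.f *)
  let rec g y = if y = 0 then false else P1.f g (y - 1)
end

(* Using a reference to hold one of the two functions. *)
module R1 = struct
  (* Placeholder that fails if called before R2 has been initialized. *)
  let forward_g : (int -> bool) ref =
    ref (fun _ -> failwith "forward_g")

  let f x = if x = 0 then true else !forward_g (x - 1)
end

module R2 = struct
  let g y = if y = 0 then false else R1.f (y - 1)

  (* Install the real function once this unit is initialized. *)
  let () = R1.forward_g := g
end
```

The reference-based version keeps the original signatures of f and g, at the cost of a run-time failure if f is called before the second unit has run its initialization.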
- The external function f is not available
-
This error appears when trying to link code that calls external
functions written in C. As explained in
chapter 18, such code must be linked with C libraries that
implement the required f C function. If the C libraries in
question are not shared libraries (DLLs), the code must be linked in
“custom runtime” mode. Fix: add the required C libraries to the
command line, and possibly the -custom option.