Benchmark Author: Jan C. Vorbrüggen (Jan.Vorbrueggen@neuroinformatik.ruhr-uni-bochum.de)
Benchmark Program General Category: Image Processing
Benchmark Description:
This is an implementation of the face recognition system described in M. Lades et al. (1993), IEEE Trans. Comp. 42(3):300-311.
In this application, objects (here, frontally photographed faces) are represented as labeled graphs. In the simplest case, used here, the graph is a regular grid. A set of features is attached to each vertex of the grid graph; the features are computed from the Gabor wavelet transform of the image and represent the image in the surroundings of that vertex. Each edge of the graph is labeled with the vector connecting its two vertices and represents the topographical relationship of those vertices.
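The labeled grid graph described above can be sketched as a small data structure. This is an illustrative sketch only, not the benchmark's Fortran code: the names `LabeledGraph`, `build_grid_graph`, and `edge_label` are hypothetical, and `feature_at` is a placeholder for whatever routine derives a feature vector from the Gabor wavelet transform at an image position.

```python
from dataclasses import dataclass


@dataclass
class LabeledGraph:
    positions: list  # (x, y) image position of each vertex
    features: list   # feature vector attached to each vertex
    edges: dict      # (i, j) -> vector from vertex i to vertex j


def build_grid_graph(origin, rows, cols, spacing, feature_at):
    """Build a regular rows x cols grid graph.

    feature_at(x, y) is a hypothetical callable standing in for the
    Gabor-derived features at image position (x, y).
    """
    ox, oy = origin
    positions = [(ox + c * spacing, oy + r * spacing)
                 for r in range(rows) for c in range(cols)]
    features = [feature_at(x, y) for x, y in positions]
    edges = {}
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:   # edge to the right-hand neighbour
                edges[(i, i + 1)] = edge_label(positions, i, i + 1)
            if r + 1 < rows:   # edge to the neighbour below
                edges[(i, i + cols)] = edge_label(positions, i, i + cols)
    return LabeledGraph(positions, features, edges)


def edge_label(positions, i, j):
    """Label an edge with the vector connecting its two vertices."""
    (xi, yi), (xj, yj) = positions[i], positions[j]
    return (xj - xi, yj - yi)
```

Storing the edge labels explicitly, rather than recomputing them from the vertex positions, keeps the undistorted geometry available when the vertices are later moved during matching.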
An object represented in this way can be compared to a new image in a process called elastic graph matching. First, the Gabor wavelet transform of the new image is determined. Then, for a given correspondence between the graph's vertices and a set of image points, a graph similarity function is computed. It takes into account both the similarity of the feature vector at each vertex to that at its corresponding image point and the distortion of the graph generated by the set of image points, measured as the change in the edge labels. This graph similarity function is the objective function of an optimization process that varies the set of corresponding points in the image.
The optimization process is implemented in two steps. The global move step keeps the graph rigid and moves it systematically over the whole image, yielding the placement that has the highest similarity to the graph. This step can be considered as finding the object (face) in the image.
The local move step then takes this placement as the starting position and visits every vertex in random order. At each vertex, the similarity function is evaluated on a small subgrid surrounding the current position. (This is a small change from the algorithm as originally published, where the trial moves at each node were random as well.) If the similarity function's value is improved at one of those positions, the change is made permanent; such a move is called a hop. One round visiting each vertex position is called a sweep. The local move step terminates when a sweep is completed without a hop having been performed.
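The two-step optimization can be sketched as follows. This is a minimal sketch under stated assumptions, not the benchmark's implementation: the vertex similarity is assumed to be a normalized dot product, the distortion is assumed to be the squared change of each edge vector weighted by a hypothetical parameter `lam`, and `feature_at` is a placeholder for a lookup into the Gabor wavelet transform of the new image. Image bounds are not checked, and `max_sweeps` is a safety limit added for the sketch.

```python
import math
import random


def cosine(a, b):
    """Normalized dot product of two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def graph_similarity(feats, edges, points, feature_at, lam):
    """Feature similarity summed over vertices minus weighted distortion."""
    vertex_term = sum(cosine(f, feature_at(x, y))
                      for f, (x, y) in zip(feats, points))
    distortion = 0.0
    for (i, j), (dx, dy) in edges.items():
        ex = points[j][0] - points[i][0]
        ey = points[j][1] - points[i][1]
        distortion += (ex - dx) ** 2 + (ey - dy) ** 2
    return vertex_term - lam * distortion


def global_move(feats, rel_pos, feature_at, offsets):
    """Shift the rigid graph over candidate offsets and keep the best one.

    A rigid shift leaves every edge vector unchanged, so only the
    vertex term needs to be compared.
    """
    def score(off):
        return sum(cosine(f, feature_at(x + off[0], y + off[1]))
                   for f, (x, y) in zip(feats, rel_pos))
    ox, oy = max(offsets, key=score)
    return [(x + ox, y + oy) for x, y in rel_pos]


def local_move(feats, edges, points, feature_at, lam,
               radius=1, seed=0, max_sweeps=1000):
    """Visit vertices in random order; stop after a sweep without a hop."""
    rng = random.Random(seed)
    points = list(points)
    best = graph_similarity(feats, edges, points, feature_at, lam)
    hops = sweeps = 0
    while sweeps < max_sweeps:
        sweeps += 1
        hopped = False
        order = list(range(len(points)))
        rng.shuffle(order)
        for v in order:
            x0, y0 = points[v]
            best_pos = (x0, y0)
            # Evaluate the objective on a small subgrid around the vertex.
            for dx in range(-radius, radius + 1):
                for dy in range(-radius, radius + 1):
                    points[v] = (x0 + dx, y0 + dy)
                    s = graph_similarity(feats, edges, points,
                                         feature_at, lam)
                    if s > best:
                        best, best_pos = s, points[v]
            points[v] = best_pos
            if best_pos != (x0, y0):  # an improving move is a hop
                hops += 1
                hopped = True
        if not hopped:
            break
    return points, best, hops, sweeps
```

The sketch recomputes the full objective for every trial position for clarity; since moving one vertex changes only its own feature term and the edges incident to it, an incremental update would be the natural optimization. The returned `hops` and `sweeps` counters correspond to the quantities named in the text.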
The benchmark consists of the following main phases:
Input Description:
The program first reads a set of text files that contain the parameter
settings governing the current run. It then reads a number of grey-level
images (256 by 256 pixels in size), each of a different person's face,
which constitute the album gallery against which comparisons are
performed, and computes the labeled graphs representing these faces. It then
reads another set of images, the probe gallery, of the same
persons as are represented in the album gallery, but which differ in
head pose, hair style, addition or removal of glasses, and so on. For
each of these, it computes the Gabor wavelet transform, localizes the
face using the global move step, and performs the local move step for
each graph from the album gallery. This sequence of events is similar
to what would be required when a set of persons, for each of whom a
reference image is given, is searched for in a larger database, e.g., one
extracted from a set of TV clips. This form of image database search
is one typical use of the original application.
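The gallery search described above amounts to scoring every probe against every album graph and keeping the best match. The following is a minimal sketch; `identify` and `match` are hypothetical names, and `match(graph, probe)` stands in for the whole per-pair procedure (the local move step starting from the face position found by the global move step) and is assumed to return a similarity value.

```python
def identify(probes, album, match):
    """Return, for each probe, the best-matching album entry and its score.

    probes: dict mapping probe name -> probe data
    album:  dict mapping album name -> labeled graph
    match:  hypothetical callable (graph, probe) -> similarity
    """
    results = {}
    for probe_name, probe in probes.items():
        scores = {name: match(graph, probe) for name, graph in album.items()}
        best = max(scores, key=scores.get)
        results[probe_name] = (best, scores[best])
    return results
```

In the program as described, the Gabor wavelet transform and the global move step are performed once per probe image, so in a fuller version they would sit in the outer loop rather than inside `match`.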
For the test data set, the album and probe gallery each contain two images. One of these images is the same as the canonic image mentioned above; the result of this comparison is known a priori and is used as a consistency check for the implementation.
For the training data set, the album gallery contains 13 images and the probe gallery contains 26 images (two per person in the album gallery).
For the reference data set, the album gallery contains 42 images and the probe gallery contains 84 images (one to three per person in the album gallery); these are all distinct from the images in the training data set.
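To get a feel for how the workload grows across the three data sets, the number of probe/album pairings can be computed from the gallery sizes given above. This is a sketch that assumes every probe image is matched against every album graph, as the input description states; the helper name `pairings` is hypothetical, and how the program's own "total number of comparisons" is counted is not specified here.

```python
# Gallery sizes (album, probe) taken from the text above.
DATASETS = {
    "test": (2, 2),
    "training": (13, 26),
    "reference": (42, 84),
}


def pairings(album_size, probe_size):
    """Number of probe/album pairs when each probe meets each album graph."""
    return album_size * probe_size


for name, (album_size, probe_size) in DATASETS.items():
    print(name, pairings(album_size, probe_size))
```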
Output Description:
On startup, the program first prints all parameters governing the run
to standard output. It then prints a line for each entry in the album
gallery with the name of the image file, the position of the labeled
graph as determined by the global move step, and its similarity with the
canonic graph. For each of the entries in the probe gallery, the same
information is output; this is followed by a line giving the name of the
best match from the album gallery and the similarity computed with the
corresponding graph. Finally, after all comparisons have been performed, a
summary is printed reporting the total number of comparisons, the number
of correct and incorrect comparisons, and the total number of hops and
of sweeps performed.
During the comparison process, the number of hops and of sweeps is reported for each entry in the probe gallery, separately for the best-matching entry and accumulated over all entries in the album gallery, to a secondary output file named hops.dat. This file must satisfy somewhat less stringent requirements for comparison to the reference output than the primary report to standard output described above.
Of the 84 entries in the probe gallery of the reference data set, 80 entries (95%) result in the correct entry from the album gallery being the best match.
Programming Language: Fortran 90
Known Portability Issues: None
References:
M. Lades et al. (1993), Distortion Invariant Object Recognition in the
Dynamic Link Architecture, IEEE Trans. Comp. 42(3):300-311.
An extensive set of pages with many figures that illustrate the components
of the system described above.