multimin: an interface for GSL optimization routines
Table of Contents
Introduction
multimin
is an interface to the GNU Scientific Library minimization
routines. It is a simple C function. When invoked, all the information
necessary to perform the minimization, that is the function, possibly
its derivatives, the initial conditions and the minimization options,
are passed as formal parameters.
One possible drawback of multimin
is that one can end up calling the
function with a pretty long, FORTRANlike, list of parameters. On the
other hand, a first advantage is that one can safely forget about the
details of the interior functioning of the GSL minimization
routines. A more fundamental advantage is however that the interface
automatically handles different kinds of boundary conditions (open,
closed, semiclosed). The way in which this is performed is by
transforming the problem using a change of coordinates. For instance,
assume one is interested in the minimum \(x_0\) of the function
\(f(x)\) on the closed interval \([a,b]\). Considering the change of
variable
\[ x=h(y)=\frac{a+b}{2} + \frac{ab}{2} \, \sin(y) \]
one can restate the problem above as finding the minimum \(y_0\) of the function \(f(h(y))\) on the open real line. Once the minimum (or better, one minimum) has been found, the solution of the original problem is obtained as \(x_0=h(y_0)\). The list of the implemented transformations can be found below.
Pointer to the objective function to be minimized and, when required, its derivatives, are passed as parameters to multimin. So the only programming effort for the user is to properly code these functions. The C function returning the value of the objective function at the point \(x\) should be coded as
void (*f) (const size_t n, const double *x,void *fparams,double *fval)
where n
is the dimension of the problem (number of variables),
fval
is the return value, x
is an array of type double of
dimension n
containing the coordinates where the objective function
has to be computed and fparams
is a pointer to a list of parameters,
different from the value of the variables, on which the function
depends. The function returning derivatives of the objective is coded
similarly (see below).
Download and install
Download the file multimin.c and multimin.h and copy it in the
directory of your choice. In order to work, the GNU scientific library
must be properly installed on your system. To compile a program using
multimin, remember to include the appropriate header file
multimin.h
. To test the function, download this file and compile it
using
gcc Wall multimin_test.c multimin.c lgsl lgslcblas o multimin_test
The example is discussed in the section below. For a list of recent
additions to multimin
you can check the ChangeLog
file distributed
with the software.
Calling convention
Let's now analyze the calling convention in details
multimin(size_t n,double *x,double *fmin, unsigned *type,double *xmin,double *xmax, void (*f) (const size_t,const double *,void *,double *), void (* df) (const size_t,const double *, void *,double *), void (* fdf) (const size_t,const double *, void *,double *,double *), void *fparams, const struct multimin_params oparams)
the list of the parameters is as follows (where INPUT stands for the input values of the parameter, while OUTPUT indicates their values when the function return):

n
 INPUT: dimension of the problem, number of independent variables of the function.

x
 INPUT: pointer to an array of
n
valuesx[0],...x[n1]
containing the initial estimate of the minimum point. OUTPUT: contains the final estimation of the minimum position 
fmin
 OUPUT: return the value of the function at the minimum

type
 INPUT a pointer to an array of
n
integerstype[1],...,type[n1]
describing the boundary conditions for the different variables. The problem is solved as an unconstrained one on a suitably transformed variable y. Possible values are:Table 1: Available transformations value interval function 0 unconstrained \(x=y\) 1 \([x_m,+\infty )\) \(x=x_m+y^2\) 2 \(( \infty,x_M ]\) \(x=x_My^2\) 3 \([ x_m,x_M ]\) \(x=SS+SD \sin(y)\) 4 \(( x_m,+\infty )\) \(x=x_m+\exp(y)\) 5 \(( \infty,x_M )\) \(x=x_M\exp(y)\) 6 \(( x_m,x_M )\) \(x=SS+SD \tanh(y)\) 7* \(( x_m,x_M )\) \(x=SS+SD (1+y/\sqrt{1+y^2})\) 8* \(( x_m,+\infty )\) \(x=x_m+.5 (y+\sqrt{1+y^2})\) 9* \(( \infty,x_M )\) \(x=x_M+.5 (y\sqrt{1+y^2})\) where \(SS=(x_m+x_M)/2\) and \(SD=(x_Mx_m)/2\).
(*) These are UNSUPPORTED transformations: they have been used for testing purposes but are not recommended.

xmin xmax
 INPUT: pointers to arrays of double containing
respectively the lower and upper boundaries of the
different variables. For a given variable, only the
values that are implied by the type of constraints,
defined as in
*type
, are actually inspected.

f
 function that calculates the objective function at a specified
point x. Its specification is
void (*f) (const size_t n, const double *x,void *fparams,double *fval)
where

n
 INPUT: the number of variables

x
 INPUT:the point at which the function is required

fparams
 INPUT: pointer to a structure containing parameters required by the function. If no external parameter are required it can be set to NULL.

fval
 OUTPUT: the value of the objective function at the current point x.


df
 function that calculates the gradient of the objective
function at a specified point x. Its specification is
void (*df) (const size_t n, const double *x,void *fparams,double *grad)
where

n
 INPUT: the number of variables

x
 INPUT:the point at which the function is required

fparams
 INPUT: pointer to a structure containing parameters required by the function. If no external parameter are required it can be set to NULL.

grad
 OUTPUT: the values of the gradient of the
objective function at the current point x are
stored in
grad[0],...,grad[n1]
.


fdf
 fdf calculates the value and the gradient of the objective
function at a specified point x. Its specification is
void (*fdf) (const size_t n, const double *x,void *fparams,double *fval,double *grad)
where

n
 INPUT: the number of variables

x
 INPUT:the point at which the function is required

fparams
 INPUT: pointer to a structure containing parameters required by the function. If no external parameter are required it can be set to NULL.

fval
 OUTPUT: the value of the objective function at the current point x.

grad
 OUTPUT: the values of the gradient of the
objective function at the current point x are
stored in
grad[0],...,grad[n1]
.


fparams
 pointer to a structure containing parameters required by the function. If no external parameter are required it can be set to NULL.

oparams
 structure of the type "multimin_{params}" containing the
optimization parameters. The members are

double step_size
 size of the first trial step

double tol
 accuracy of the line minimization

unsigned maxiter
 maximum number of iterations

double epsabs
 accuracy of the minimization

double maxsize
 final size of the simplex

unsigned method
 method to use. Possible values are:
 FletcherReeves conjugate gradient
 PolakRibiere conjugate gradient
 Vector BroydenFletcherGoldfarbShanno method
 Steepest descent algorithm
 NelderMead simplex
 Vector BroydenFletcherGoldfarbShanno ver. 2
 Simplex algorithm of Nelder and Mead ver. 2
 Simplex algorithm of Nelder and Mead: random initialization

unsigned verbosity
 if greater then 0 print info on intermediate steps

Example
Consider the same example described in the GSL maual: a twodimensional paraboloid function centerd in \((a,b)\) with scale factor \((A,B)\) and minimum \(C\):
\[ f(x,y) = A (xa)^2 + B (yb)^2 + C \;. \]
Assume the parameters \((a,b,A,B,C)\) are stored in a five dimensional
array named p
. The function returning the value of the objective can
be coded as follows
void f(const size_t n, const double *x,void *fparams,double *fval){ double *p = (double *)params; *fval=p[2]*(x[0]p[0])*(x[0]p[0]) + p[3]*(x[1]p[1])*(x[1]p[1]) + p[4]; }
while the function returning the derivatives reads
void df(const size_t n, const double *x,void *fparams,double *grad) { double *p = (double *)params; grad[0]=2*p[2]*(x[0]p[0]); grad[1]=2*p[3]*(x[1]p[1]); }
and, in addition, one needs a function returning both the value of the objective and of its derivative at the point
void fdf(const size_t n, const double *x,void *fparams,double *fval,double *grad) { double *p = (double *)params; *fval=p[2]*(x[0]p[0])*(x[0]p[0]) + p[3]*(x[1]p[1])*(x[1]p[1]) + p[4]; grad[0]=2*p[2]*(x[0]p[0]); grad[1]=2*p[3]*(x[1]p[1]); }
Once these functions are defined, the program to minimize the objective can simply be
int main(){ double par[5] = { 1.0, 2.0, 10.0, 20.0, 30.0 }; double x[2]={0,0}; double minimum; double xmin[2], xmax[2]; unsigned type[2]; struct multimin_params optim_par = {.1,1e2,100,1e3,1e5,2,0}; /* unconstrained minimization */ multimin(2,x,&minimum,NULL,NULL,NULL,&f,&df,&fdf,(void *) par,optim_par); printf("unconstrained minimum %e at x=%e y=%e\n",minimum,x[0],x[1]); /* minimum constrained in [1,2]x(5,5] */ type[0]=3; xmin[0]=1; xmax[0]=2; x[0]=0; type[1]=2; xmin[1]=5; xmax[1]=5; x[1]=0; /* increase verbosity */ optim_par.verbosity=1; multimin(2,x,&minimum,type,xmin,xmax,&f,&df,&fdf,par,optim_par); printf("constrained minimum %e at x=%e y=%e\n",minimum,x[0],x[1]); return 0; }
Notice that in the unconstrained minimization NULL
arrays have been
passed to the function. To search for a constrained minimum, the
arrays type
, xmin
and xmax
have been set accordingly. Of course,
in this case the two minima coincide.