e04xa computes an approximation to the gradient vector and/or the Hessian matrix for use in conjunction with, or following the use of, an optimization method (such as e04uf).

Syntax

C#
public static void e04xa(
	int msglvl,
	int n,
	double epsrf,
	double[] x,
	ref int mode,
	E04.E04XA_OBJFUN objfun,
	double[] hforw,
	out double objf,
	double[] objgrd,
	double[] hcntrl,
	double[,] h,
	out int iwarn,
	int[] info,
	out int ifail
)
Visual Basic
Public Shared Sub e04xa ( _
	msglvl As Integer, _
	n As Integer, _
	epsrf As Double, _
	x As Double(), _
	ByRef mode As Integer, _
	objfun As E04.E04XA_OBJFUN, _
	hforw As Double(), _
	<OutAttribute> ByRef objf As Double, _
	objgrd As Double(), _
	hcntrl As Double(), _
	h As Double(,), _
	<OutAttribute> ByRef iwarn As Integer, _
	info As Integer(), _
	<OutAttribute> ByRef ifail As Integer _
)
Visual C++
public:
static void e04xa(
	int msglvl, 
	int n, 
	double epsrf, 
	array<double>^ x, 
	int% mode, 
	E04::E04XA_OBJFUN^ objfun, 
	array<double>^ hforw, 
	[OutAttribute] double% objf, 
	array<double>^ objgrd, 
	array<double>^ hcntrl, 
	array<double,2>^ h, 
	[OutAttribute] int% iwarn, 
	array<int>^ info, 
	[OutAttribute] int% ifail
)
F#
static member e04xa : 
        msglvl : int * 
        n : int * 
        epsrf : float * 
        x : float[] * 
        mode : int byref * 
        objfun : E04.E04XA_OBJFUN * 
        hforw : float[] * 
        objf : float byref * 
        objgrd : float[] * 
        hcntrl : float[] * 
        h : float[,] * 
        iwarn : int byref * 
        info : int[] * 
        ifail : int byref -> unit 

Parameters

msglvl
Type: System.Int32
On entry: must indicate the amount of intermediate output desired (see [Description of the Printed Output] for a description of the printed output). All output is written to the current advisory message unit (see X04ABF, not in this release).
Value	Definition
0	No printout.
1	A summary is printed out for each variable plus any warning messages.
Other	Values other than 0 and 1 should normally be used only at the direction of NAG.
n
Type: System.Int32
On entry: the number n of independent variables.
Constraint: n ≥ 1.
epsrf
Type: System.Double
On entry: must define $e_R$, which is intended to be a measure of the accuracy with which the problem function $F$ can be computed. The value of $e_R$ should reflect the relative precision of $1+|F(x)|$; i.e., $e_R$ acts as a relative precision when $|F|$ is large, and as an absolute precision when $|F|$ is small. For example, if $F(x)$ is typically of order 1000 and the first six significant digits are known to be correct, an appropriate value for $e_R$ would be 1.0E−6.
A discussion of epsrf is given in Chapter 8 of Gill et al. (1981). If epsrf is either too small or too large on entry, a warning will be printed if msglvl=1, the parameter iwarn will be set to the appropriate value on exit, and e04xa will use a default value of $e_M^{0.9}$, where $e_M$ is the machine precision.
If epsrf ≤ 0.0 on entry then e04xa will use the default value internally. The default value will be appropriate for most simple functions that are computed with full accuracy.
x
Type: System.Double[]
An array of size [n]
On entry: the point x at which the derivatives are to be computed.
mode
Type: System.Int32%
On entry: indicates which derivatives are required.
mode=0
The gradient and the Hessian diagonal values are required; only the objective function is supplied via objfun.
mode=1
The Hessian matrix is required; both the objective function and its gradients are supplied via objfun.
mode=2
The gradient values and the Hessian matrix are required; only the objective function is supplied via objfun.
On exit: mode is changed only if you set mode negative in objfun, i.e., you have requested termination of e04xa.
objfun
Type: NagLibrary.E04.E04XA_OBJFUN
If mode=0 or 2, objfun must calculate the objective function; otherwise if mode=1, objfun must calculate the objective function and the gradients.

A delegate of type E04XA_OBJFUN.

hforw
Type: System.Double[]
An array of size [n]
On entry: hforw[j-1] is the initial trial interval for computing the appropriate partial derivative with respect to the jth variable.
If hforw[j-1] ≤ 0.0, then the initial trial interval is computed by e04xa (see [Description]).
On exit: hforw[j-1] is the best interval found for computing a forward-difference approximation to the appropriate partial derivative for the jth variable.
objf
Type: System.Double%
On exit: the value of the objective function evaluated at the input vector in x.
objgrd
Type: System.Double[]
An array of size [n]
On exit: if mode=0 or 2, objgrd[j-1] contains the best estimate of the first partial derivative for the jth variable.
If mode=1, objgrd[j-1] contains the first partial derivative for the jth variable evaluated at the input vector in x.
hcntrl
Type: System.Double[]
An array of size [n]
On exit: hcntrl[j-1] is the best interval found for computing a central-difference approximation to the appropriate partial derivative for the jth variable.
h
Type: System.Double[,]
An array of size [dim1, dim2]
Note: dim1 must satisfy the constraint: dim1 ≥ n
Note: the second dimension of the array h must be at least 1 if mode=0 and at least n if mode=1 or 2.
On exit: if mode=0, the estimated Hessian diagonal elements are contained in the first column of this array.
If mode=1 or 2, the estimated Hessian matrix is contained in the leading n by n part of this array.
iwarn
Type: System.Int32%
On exit: iwarn=0 on successful exit.
If the value of epsrf on entry is too small or too large then iwarn is set to 1 or 2 respectively on exit and the default value for epsrf is used within e04xa.
If msglvl>0 then warnings will be printed if epsrf is too small or too large.
info
Type: System.Int32[]
An array of size [n]
On exit: info[j-1] represents diagnostic information on variable j. (See [Error Indicators and Warnings] for more details.)
ifail
Type: System.Int32%
On exit: ifail=0 unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).

Description

e04xa is similar to routine FDCALC described in Gill et al. (1983a). It should be noted that this method aims to compute sufficiently accurate estimates of the derivatives for use with an optimization algorithm. If you require more accurate estimates you should refer to the D04 chapter (not in this release).
e04xa computes finite difference approximations to the gradient vector and the Hessian matrix for a given function. The simplest approximation involves the forward-difference formula, in which the derivative $f'(x)$ of a univariate function $f(x)$ is approximated by the quantity
$$\rho_F(f,h) = \frac{f(x+h) - f(x)}{h}$$
for some interval $h>0$, where the subscript "F" denotes ‘forward-difference’ (see Gill et al. (1983b)).
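The forward-difference formula can be tried out with a minimal C# sketch. It is not part of the NAG Library: the class and method names are hypothetical, and the function exp(x) and the list of intervals are chosen only for illustration.

```csharp
using System;

static class ForwardDifferenceSketch
{
    // rho_F(f, h) = (f(x + h) - f(x)) / h, for some interval h > 0.
    public static double Forward(Func<double, double> f, double x, double h)
    {
        if (h <= 0.0) throw new ArgumentOutOfRangeException(nameof(h), "h must be positive");
        return (f(x + h) - f(x)) / h;
    }

    static void Main()
    {
        // Illustrative function only: f(x) = exp(x), whose derivative at 0 is 1.
        foreach (double h in new[] { 1e-2, 1e-5, 1e-8, 1e-12 })
            Console.WriteLine($"h = {h:E1}  estimate = {Forward(Math.Exp, 0.0, h):R}");
    }
}
```

Running it with intervals of very different sizes shows why the choice of h matters: the estimate is poor both when h is large and when h is very small.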
To summarise the procedure used by e04xa (for the case when the objective function is available and you require estimates of gradient values and Hessian matrix diagonal values, i.e., mode=0), consider a univariate function $f$ at the point $x$. (In order to obtain the gradient of a multivariate function $F(x)$, where $x$ is an $n$-vector, the procedure is applied to each component of $x$, keeping the other components fixed.) Roughly speaking, the method is based on the fact that the bound on the relative truncation error in the forward-difference approximation tends to be an increasing function of $h$, while the relative condition error bound is generally a decreasing function of $h$; hence changes in $h$ will tend to have opposite effects on these errors (see Gill et al. (1983b)).
The ‘best’ interval $h$ is given by
$$h_F = 2\sqrt{\frac{\left(1+|f(x)|\right)e_R}{|\Phi|}} \qquad (1)$$
where $\Phi$ is an estimate of $f''(x)$, and $e_R$ is an estimate of the relative error associated with computing the function (see Chapter 8 of Gill et al. (1981)). Given an interval $h$, $\Phi$ is defined by the second-order approximation
$$\Phi = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}.$$
The decision as to whether a given value of $\Phi$ is acceptable involves $\hat{c}(\Phi)$, the following bound on the relative condition error in $\Phi$:
$$\hat{c}(\Phi) = \frac{4e_R\left(1+|f|\right)}{h^2|\Phi|}$$
(When $\Phi$ is zero, $\hat{c}(\Phi)$ is taken as an arbitrary large number.)
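The three quantities above translate directly into code. The following C# sketch is illustrative only: the names are hypothetical, double.MaxValue stands in for the "arbitrary large number", and the values of x, h and the relative error are assumptions made for the demonstration.

```csharp
using System;

static class IntervalFormulasSketch
{
    // Second-order difference estimate Phi of f''(x) for a given interval h.
    public static double Phi(Func<double, double> f, double x, double h)
        => (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h);

    // Bound on the relative condition error in Phi.
    // Assumption: double.MaxValue plays the role of the "arbitrary large number".
    public static double CondErrorBound(double fx, double phi, double h, double epsR)
        => phi == 0.0
            ? double.MaxValue
            : 4.0 * epsR * (1.0 + Math.Abs(fx)) / (h * h * Math.Abs(phi));

    // Equation (1): the 'best' forward-difference interval (infinite if phi is zero).
    public static double BestForwardInterval(double fx, double phi, double epsR)
        => 2.0 * Math.Sqrt((1.0 + Math.Abs(fx)) * epsR / Math.Abs(phi));

    static void Main()
    {
        Func<double, double> f = Math.Exp;       // illustrative function only
        double x = 0.0, h = 1e-4, epsR = 1e-15;  // hypothetical values
        double fx = f(x);
        double phi = Phi(f, x, h);
        Console.WriteLine($"Phi    = {phi}");
        Console.WriteLine($"c(Phi) = {CondErrorBound(fx, phi, h, epsR):E2}");
        Console.WriteLine($"h_F    = {BestForwardInterval(fx, phi, epsR):E2}");
    }
}
```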
The procedure selects the interval $h_\phi$ (to be used in computing $\Phi$) from a sequence of trial intervals $h_k$. The initial trial interval is taken as $10\bar{h}$, where
$$\bar{h} = 2\left(1+|x|\right)\sqrt{e_R}$$
unless you specify the initial value to be used.
The value of $\hat{c}(\Phi)$ for a trial value $h_k$ is defined as ‘acceptable’ if it lies in the interval $[0.001, 0.1]$. In this case $h_\phi$ is taken as $h_k$, and the current value of $\Phi$ is used to compute $h_F$ from (1). If $\hat{c}(\Phi)$ is unacceptable, the next trial interval is chosen so that the relative condition error bound will either decrease or increase, as required. If the bound on the relative condition error is too large, a larger interval is used as the next trial value in an attempt to reduce the condition error bound. On the other hand, if the relative condition error bound is too small, $h_k$ is reduced.
The procedure will fail to produce an acceptable value of $\hat{c}(\Phi)$ in two situations. Firstly, if $f''(x)$ is extremely small, then $\hat{c}(\Phi)$ may never become small, even for a very large value of the interval. Alternatively, $\hat{c}(\Phi)$ may never exceed 0.001, even for a very small value of the interval. This usually implies that $f''(x)$ is extremely large, and occurs most often near a singularity.
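The search over trial intervals can be sketched in C# as follows. This is a simplified illustration and not the algorithm used inside e04xa: the text does not say how the next trial interval is chosen or how many trials are allowed, so the factor of 10 between trials and the cap of 8 trials are assumptions, as are all names.

```csharp
using System;

static class IntervalSearchSketch
{
    // Returns true, with the accepted interval hPhi and the matching Phi, if some
    // trial interval gives a condition error bound inside [0.001, 0.1].
    public static bool TryFindInterval(Func<double, double> f, double x, double epsR,
                                       out double hPhi, out double phi)
    {
        const int maxTrials = 8;     // assumption: cap chosen for this sketch
        const double factor = 10.0;  // assumption: step between trial intervals
        double fx = f(x);
        double h = 10.0 * 2.0 * (1.0 + Math.Abs(x)) * Math.Sqrt(epsR); // 10 * h-bar
        hPhi = h;
        phi = 0.0;
        for (int k = 0; k < maxTrials; k++)
        {
            phi = (f(x + h) - 2.0 * fx + f(x - h)) / (h * h);
            double c = phi == 0.0
                ? double.MaxValue
                : 4.0 * epsR * (1.0 + Math.Abs(fx)) / (h * h * Math.Abs(phi));
            hPhi = h;
            if (c >= 0.001 && c <= 0.1) return true;  // acceptable bound
            // Bound too large: try a larger interval. Bound too small: reduce it.
            h = c > 0.1 ? h * factor : h / factor;
        }
        return false; // no acceptable interval found within the trial cap
    }

    static void Main()
    {
        bool ok = TryFindInterval(Math.Exp, 0.0, 1e-15, out double hPhi, out double phi);
        Console.WriteLine($"accepted = {ok}, h_phi = {hPhi:E2}, Phi = {phi}");
    }
}
```

A function that is constant in the variable concerned never yields an acceptable bound in this sketch, which mirrors the first failure case described above.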
As a check on the validity of the estimated first derivative, the procedure provides a comparison of the forward-difference approximation computed with $h_F$ (as above) and the central-difference approximation computed with $h_\phi$. Using the central-difference formula the first derivative can be approximated by
$$\rho_c(f,h) = \frac{f(x+h) - f(x-h)}{2h}$$
where $h>0$. If the forward-difference estimate computed with $h_F$ and the central-difference estimate computed with $h_\phi$ do not display some agreement, neither can be considered reliable.
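A short C# sketch of this consistency check is given below. The names, the function sin(x) and the two interval values are hypothetical; the relative-difference measure is one possible way to quantify "agreement" and is not taken from the library.

```csharp
using System;

static class DerivativeCheckSketch
{
    // Forward-difference estimate of f'(x).
    public static double Forward(Func<double, double> f, double x, double h)
        => (f(x + h) - f(x)) / h;

    // Central-difference estimate of f'(x).
    public static double Central(Func<double, double> f, double x, double h)
        => (f(x + h) - f(x - h)) / (2.0 * h);

    // Relative difference between two estimates (0 when both are zero).
    public static double RelativeDifference(double a, double b)
    {
        double scale = Math.Max(Math.Abs(a), Math.Abs(b));
        return scale == 0.0 ? 0.0 : Math.Abs(a - b) / scale;
    }

    static void Main()
    {
        Func<double, double> f = Math.Sin;       // illustrative function only
        double x = 1.0, hF = 1e-7, hPhi = 1e-4;  // hypothetical interval values
        double fwd = Forward(f, x, hF);
        double cen = Central(f, x, hPhi);
        Console.WriteLine($"forward = {fwd}, central = {cen}, " +
                          $"relative difference = {RelativeDifference(fwd, cen):E2}");
    }
}
```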
When both function and gradients are available and you require the Hessian matrix (i.e., mode=1), e04xa follows a similar procedure to the case above, with the exception that the gradient function $g(x)$ is substituted for the objective function, and so the forward-difference interval for the first derivative of $g(x)$ with respect to variable $x_j$ is computed. The $j$th column of the approximate Hessian matrix is then defined as in Chapter 2 of Gill et al. (1981), by
$$\frac{g(x+h_je_j) - g(x)}{h_j}$$
where $h_j$ is the best forward-difference interval associated with the $j$th component of $g$ and $e_j$ is the vector with unity in the $j$th position and zeros elsewhere.
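The column-by-column construction can be sketched in C# as follows. This is an illustration of the formula only, with hypothetical names; the intervals h are passed in by the caller rather than estimated, and the quadratic test function is an assumption.

```csharp
using System;

static class GradientHessianSketch
{
    // Column j of the approximate Hessian is (g(x + h_j e_j) - g(x)) / h_j.
    public static double[,] HessianFromGradients(Func<double[], double[]> grad,
                                                 double[] x, double[] h)
    {
        int n = x.Length;
        double[] g0 = grad(x);
        var hess = new double[n, n];
        for (int j = 0; j < n; j++)
        {
            var shifted = (double[])x.Clone();
            shifted[j] += h[j];                 // x + h_j e_j
            double[] gj = grad(shifted);        // one extra gradient call per column
            for (int i = 0; i < n; i++)
                hess[i, j] = (gj[i] - g0[i]) / h[j];
        }
        return hess;
    }

    static void Main()
    {
        // Illustrative function: F = x0^2 + 3 x0 x1 + 2 x1^2, gradient given analytically.
        Func<double[], double[]> grad = x => new[] { 2 * x[0] + 3 * x[1], 3 * x[0] + 4 * x[1] };
        double[,] hess = HessianFromGradients(grad, new[] { 1.0, 2.0 }, new[] { 1e-3, 1e-3 });
        Console.WriteLine($"{hess[0, 0]} {hess[0, 1]}");
        Console.WriteLine($"{hess[1, 0]} {hess[1, 1]}");
    }
}
```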
When only the objective function is available and you require the gradients and Hessian matrix (i.e., mode=2), e04xa again follows the same procedure as the case for mode=0, except that this time the value of $\hat{c}(\Phi)$ for a trial value $h_k$ is defined as acceptable if it lies in the interval $[0.0001, 0.01]$ and the initial trial interval is taken as
$$\bar{h} = 2\left(1+|x|\right)\sqrt[4]{e_R}.$$
The approximate Hessian matrix $G$ is then defined as in Chapter 2 of Gill et al. (1981), by
$$G_{ij}(x) = \frac{1}{h_ih_j}\left(f(x+h_ie_i+h_je_j) - f(x+h_ie_i) - f(x+h_je_j) + f(x)\right).$$
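The Hessian formula for the function-values-only case can be illustrated with the C# sketch below. The names and the test function are hypothetical, the intervals are supplied by the caller, and the way function values are reused here is a choice made for the sketch, so its evaluation count need not match that of e04xa.

```csharp
using System;

static class FunctionHessianSketch
{
    // Returns a copy of x with h added to component i.
    static double[] Shift(double[] x, int i, double h)
    {
        var y = (double[])x.Clone();
        y[i] += h;
        return y;
    }

    // G_ij = (f(x + h_i e_i + h_j e_j) - f(x + h_i e_i) - f(x + h_j e_j) + f(x)) / (h_i h_j)
    public static double[,] HessianFromValues(Func<double[], double> f, double[] x, double[] h)
    {
        int n = x.Length;
        double f0 = f(x);
        var fi = new double[n];
        for (int i = 0; i < n; i++) fi[i] = f(Shift(x, i, h[i]));

        var g = new double[n, n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j <= i; j++)
            {
                double fij = f(Shift(Shift(x, i, h[i]), j, h[j]));
                g[i, j] = g[j, i] = (fij - fi[i] - fi[j] + f0) / (h[i] * h[j]);
            }
        return g;
    }

    static void Main()
    {
        // Illustrative function: F = x0^2 + 3 x0 x1 + 2 x1^2.
        Func<double[], double> f = x => x[0] * x[0] + 3 * x[0] * x[1] + 2 * x[1] * x[1];
        double[,] g = HessianFromValues(f, new[] { 1.0, 2.0 }, new[] { 1e-3, 1e-3 });
        Console.WriteLine($"{g[0, 0]} {g[0, 1]}");
        Console.WriteLine($"{g[1, 0]} {g[1, 1]}");
    }
}
```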

References

Gill P E, Murray W, Saunders M A and Wright M H (1983a) Documentation for FDCALC and FDCORE Technical Report SOL 83–6 Stanford University
Gill P E, Murray W, Saunders M A and Wright M H (1983b) Computing forward-difference intervals for numerical optimization SIAM J. Sci. Statist. Comput. 4 310–321
Gill P E, Murray W and Wright M H (1981) Practical Optimization Academic Press

Error Indicators and Warnings

On exit from e04xa both diagnostic parameters info and ifail should be tested. ifail represents an overall diagnostic indicator, whereas the integer array info represents diagnostic information on each variable.
Errors or warnings detected by the method:
Some error messages may refer to parameters that are dropped from this interface (LDH). In these cases, an error in another parameter has usually caused an incorrect value to be inferred.
ifail<0
A negative value of ifail indicates an exit from e04xa because you set mode negative in objfun. The value of ifail will be the same as your setting of mode.
ifail=1
On entry, one or more of the following conditions are satisfied: n<1, ldh<n or mode is invalid.
ifail=2
One or more variables have a nonzero info value. This may not necessarily represent an unsuccessful exit – see diagnostic information on info.
ifail=-9000
An error occurred, see message report.
ifail=-6000
Invalid Parameters value
ifail=-4000
Invalid dimension for array value
ifail=-8000
Negative dimension for array value
Diagnostic information returned via info is as follows:
info=1
The appropriate function appears to be constant. hforw[i-1] is set to the initial trial interval value (see [Description]) corresponding to a well-scaled problem and Error est. in the printed output is set to zero. This value occurs when the estimated relative condition error in the first derivative approximation is unacceptably large for every value of the finite difference interval. If this happens when the function is not constant the initial interval may be too small; in this case, it may be worthwhile to rerun e04xa with larger initial trial interval values supplied in hforw (see [Description]). This error may also occur if the function evaluation includes an inordinately large constant term or if epsrf is too large.
info=2
The appropriate function appears to be linear or odd. hforw[i-1] is set to the smallest interval with acceptable bounds on the relative condition error in the forward- and backward-difference estimates. In this case, the estimated relative condition error in the second derivative approximation remained large for every trial interval, but the estimated error in the first derivative approximation was acceptable for at least one interval. If the function is not linear or odd, the relative condition error in the second derivative may be decreasing very slowly; it may be worthwhile to rerun e04xa with larger initial trial interval values supplied in hforw (see [Description]).
info=3
The second derivative of the appropriate function appears to be so large that it cannot be reliably estimated (i.e., near a singularity). hforw[i-1] is set to the smallest trial interval.
This value occurs when the relative condition error estimate in the second derivative remained very small for every trial interval.
If the second derivative is not large the relative condition error in the second derivative may be increasing very slowly. It may be worthwhile to rerun e04xa with smaller initial trial interval values supplied in hforw (see [Description]). This error may also occur when the given value of epsrf is not a good estimate of a bound on the absolute error in the appropriate function (i.e., epsrf is too small).
info=4
The algorithm terminated with an apparently acceptable estimate of the second derivative. However the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval) and the central difference estimates (computed with the interval used to compute the final estimate of the second derivative) do not agree to half a decimal place. The usual reason that the forward- and central-difference estimates fail to agree is that the first derivative is small.
If the first derivative is not small, it may be helpful to execute the procedure at a different point.
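Since both ifail and info should be tested on exit, a small helper can make the per-variable diagnostics easier to read. The C# sketch below is hypothetical and not part of the library: the helper names are invented, the short descriptions paraphrase the text above, and treating info=0 as "no diagnostic raised" is an assumption based on the description of ifail=2.

```csharp
using System;
using System.Collections.Generic;

static class DiagnosticsSketch
{
    // Short paraphrase of each info value documented above.
    public static string DescribeInfo(int code)
    {
        switch (code)
        {
            case 0: return "no diagnostic raised"; // assumption, see lead-in
            case 1: return "function appears to be constant";
            case 2: return "function appears to be linear or odd";
            case 3: return "second derivative appears too large to estimate reliably";
            case 4: return "forward- and central-difference estimates do not agree";
            default: return "unrecognised value";
        }
    }

    // One line for ifail, then one line per variable with a nonzero info value.
    public static List<string> Summarise(int ifail, int[] info)
    {
        var lines = new List<string> { $"ifail = {ifail}" };
        for (int j = 0; j < info.Length; j++)
            if (info[j] != 0)
                lines.Add($"variable {j + 1}: info = {info[j]} ({DescribeInfo(info[j])})");
        return lines;
    }

    static void Main()
    {
        // Hypothetical exit values, as if returned by a call with n = 4.
        foreach (string line in Summarise(2, new[] { 0, 0, 4, 0 }))
            Console.WriteLine(line);
    }
}
```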

Accuracy

If ifail=0 on exit the algorithm terminated successfully, i.e., the forward-difference estimates of the appropriate first derivatives (computed with the final estimate of the ‘optimal’ forward-difference interval $h_F$) and the central-difference estimates (computed with the interval $h_\phi$ used to compute the final estimate of the second derivative) agree to at least half a decimal place.
In short word length implementations when computing the full Hessian matrix given function values only (i.e., mode=2) the elements of the computed Hessian will have at best 1 to 2 figures of accuracy.

Parallelism and Performance

None.

Further Comments

To evaluate an acceptable set of finite difference intervals for a well-scaled problem, the method will require around two function evaluations per variable; in a badly scaled problem however, as many as six function evaluations per variable may be needed.
If you request the full Hessian matrix supplying both function and gradients (i.e., mode=1) or function only (i.e., mode=2) then a further $n$ or $3n(n+1)/2$ function evaluations, respectively, are required.
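As a rough budgeting aid, the counts quoted above can be tabulated with a few lines of C#. This is only arithmetic on the figures in the text; the names are hypothetical, and returning zero for mode=0 reflects the fact that no further Hessian stage is described for that mode.

```csharp
using System;

static class EvaluationCountSketch
{
    // Further evaluations needed for the full Hessian, per the counts quoted above.
    public static long ExtraHessianEvaluations(int mode, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        switch (mode)
        {
            case 0: return 0;                     // no further stage described for mode=0
            case 1: return n;                     // function and gradients supplied
            case 2: return 3L * n * (n + 1) / 2;  // function values only
            default: throw new ArgumentOutOfRangeException(nameof(mode));
        }
    }

    static void Main()
    {
        int n = 4; // illustrative problem size
        Console.WriteLine($"interval stage: roughly {2 * n} to {6 * n} evaluations");
        Console.WriteLine($"mode=1 extra: {ExtraHessianEvaluations(1, n)}");
        Console.WriteLine($"mode=2 extra: {ExtraHessianEvaluations(2, n)}");
    }
}
```

For n = 4 this gives 4 further evaluations for mode=1 and 30 for mode=2.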

Description of the Printed Output

The following is a description of the printed output from e04xa as controlled by the parameter msglvl.
Output when msglvl=1 is as follows:
J	number of variable for which the difference interval has been computed.
X(j)	jth variable of x as set by you.
F. dif. int.	the best interval found for computing a forward-difference approximation to the appropriate partial derivative with respect to the jth variable.
C. dif. int.	the best interval found for computing a central-difference approximation to the appropriate partial derivative with respect to the jth variable.
Error est.	a bound on the estimated error in the final forward-difference approximation. When info[j-1]=1, Error est. is set to zero.
Grad. est.	best estimate of the first partial derivative with respect to the jth variable.
Hess diag est.	best estimate of the second partial derivative with respect to the jth variable.
fun evals.	the number of function evaluations used to compute the final difference intervals for the jth variable.
info(j)	the value of info for the jth variable.

Example

This example computes the gradient vector and the Hessian matrix of the following function:
$$F(x) = (x_1+10x_2)^2 + 5(x_3-x_4)^2 + (x_2-2x_3)^4 + 10(x_1-x_4)^4$$
at the point $(3,-1,0,1)$.
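For reference, the example function and its analytic gradient can be written in C# as below. This sketch is not the shipped example program and does not call e04xa; the class and method names are hypothetical. It is useful for checking finite-difference estimates against exact values.

```csharp
using System;

static class ExampleFunctionSketch
{
    // F(x) = (x1 + 10 x2)^2 + 5 (x3 - x4)^2 + (x2 - 2 x3)^4 + 10 (x1 - x4)^4
    public static double F(double[] x)
    {
        double a = x[0] + 10.0 * x[1];
        double b = x[2] - x[3];
        double c = x[1] - 2.0 * x[2];
        double d = x[0] - x[3];
        return a * a + 5.0 * b * b + c * c * c * c + 10.0 * d * d * d * d;
    }

    // Analytic gradient of F, obtained by differentiating term by term.
    public static double[] Gradient(double[] x)
    {
        double a = x[0] + 10.0 * x[1];
        double b = x[2] - x[3];
        double c = x[1] - 2.0 * x[2];
        double d = x[0] - x[3];
        return new[]
        {
            2.0 * a + 40.0 * d * d * d,
            20.0 * a + 4.0 * c * c * c,
            10.0 * b - 8.0 * c * c * c,
            -10.0 * b - 40.0 * d * d * d
        };
    }

    static void Main()
    {
        double[] x = { 3.0, -1.0, 0.0, 1.0 };
        Console.WriteLine($"F = {F(x)}");
        Console.WriteLine($"gradient = {string.Join(", ", Gradient(x))}");
    }
}
```

At the point $(3,-1,0,1)$ the function value is 215 and the analytic gradient is (306, -144, -2, -310).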

Example program (C#): e04xae.cs

Example program results: e04xae.r
