Integer type:  int32  int64  nag_int  show int32  show int32  show int64  show int64  show nag_int  show nag_int

Chapter Contents
Chapter Introduction
NAG Toolbox

# NAG Toolbox: nag_correg_robustm_corr_user_deriv (g02hl)

## Purpose

nag_correg_robustm_corr_user_deriv (g02hl) calculates a robust estimate of the covariance matrix for user-supplied weight functions and their derivatives.

## Syntax

[user, cov, a, wt, theta, nit, ifail] = g02hl(ucv, indm, x, a, theta, 'user', user, 'n', n, 'm', m, 'bl', bl, 'bd', bd, 'maxit', maxit, 'nitmon', nitmon, 'tol', tol)
[user, cov, a, wt, theta, nit, ifail] = nag_correg_robustm_corr_user_deriv(ucv, indm, x, a, theta, 'user', user, 'n', n, 'm', m, 'bl', bl, 'bd', bd, 'maxit', maxit, 'nitmon', nitmon, 'tol', tol)
Note: the interface to this routine has changed since earlier releases of the toolbox:
Mark 22: n has been made optional
Mark 23: nitmon, tol now optional
.

## Description

For a set of n$n$ observations on m$m$ variables in a matrix X$X$, a robust estimate of the covariance matrix, C$C$, and a robust estimate of location, θ$\theta$, are given by:
 C = τ2(ATA) − 1, $C=τ2(ATA)-1,$
where τ2${\tau }^{2}$ is a correction factor and A$A$ is a lower triangular matrix found as the solution to the following equations.
 zi = A(xi − θ) $zi=A(xi-θ)$
 n 1/n ∑ w(‖zi‖2)zi = 0 i = 1
$1n ∑i= 1nw(‖zi‖2)zi=0$
and
 n 1/n ∑ u(‖zi‖2)ziziT − v(‖zi‖2)I = 0, i = 1
$1n∑i=1nu(‖zi‖2)zi ziT -v(‖zi‖2)I=0,$
 where xi${x}_{i}$ is a vector of length m$m$ containing the elements of the i$i$th row of X$X$, zi${z}_{i}$ is a vector of length m$m$, I$I$ is the identity matrix and 0$0$ is the zero matrix, and w$w$ and u$u$ are suitable functions.
nag_correg_robustm_corr_user_deriv (g02hl) covers two situations:
 (i) v(t) = 1$v\left(t\right)=1$ for all t$t$, (ii) v(t) = u(t)$v\left(t\right)=u\left(t\right)$.
The robust covariance matrix may be calculated from a weighted sum of squares and cross-products matrix about θ$\theta$ using weights wti = u(zi)${\mathit{wt}}_{i}=u\left(‖{z}_{i}‖\right)$. In case (i) a divisor of n$n$ is used and in case (ii) a divisor of i = 1nwti$\sum _{i=1}^{n}{\mathit{wt}}_{i}$ is used. If w( . ) = sqrt(u( . ))$w\left(.\right)=\sqrt{u\left(.\right)}$, then the robust covariance matrix can be calculated by scaling each row of X$X$ by sqrt(wti)$\sqrt{{\mathit{wt}}_{i}}$ and calculating an unweighted covariance matrix about θ$\theta$.
In order to make the estimate asymptotically unbiased under a Normal model a correction factor, τ2${\tau }^{2}$, is needed. The value of the correction factor will depend on the functions employed (see Huber (1981) and Marazzi (1987)).
nag_correg_robustm_corr_user_deriv (g02hl) finds A$A$ using the iterative procedure as given by Huber.
 Ak = (Sk + I)Ak − 1 $Ak=(Sk+I)Ak-1$
and
 θjk = (bj)/(D1) + θjk − 1, $θjk=bjD1+θjk- 1,$
where Sk = (sjl)${S}_{k}=\left({s}_{jl}\right)$, for j = 1,2,,m$\mathit{j}=1,2,\dots ,m$ and l = 1,2,,m$\mathit{l}=1,2,\dots ,m$, is a lower triangular matrix such that:
sjl =
 { − min [max (hjl / D3, − BL),BL], j > l − min [max ((hjj / (2D3 − D4 / D2)), − BD),BD], j = l
,
$sjl={ -min[max(hjl/D3,-BL),BL], j>l -min[max((hjj/(2D3-D4/D2)),-BD),BD], j=l ,$
where
• D1 = i = 1n {w(zi2) + 1/mw(zi2)zi2} ${D}_{1}=\sum _{i=1}^{n}\left\{w\left({‖{z}_{i}‖}_{2}\right)+\frac{1}{m}{w}^{\prime }\left({‖{z}_{i}‖}_{2}\right){‖{z}_{i}‖}_{2}\right\}$
• D2 = i = 1n {1/p(u(zi2)zi2 + 2u(zi2))zi2v(zi2)} zi2${D}_{2}=\sum _{i=1}^{n}\left\{\frac{1}{p}\left({u}^{\prime }\left({‖{z}_{i}‖}_{2}\right){‖{z}_{i}‖}_{2}+2u\left({‖{z}_{i}‖}_{2}\right)\right){‖{z}_{i}‖}_{2}-{v}^{\prime }\left({‖{z}_{i}‖}_{2}\right)\right\}{‖{z}_{i}‖}_{2}$
• D3 = 1/(m + 2)i = 1n {1/m(u(zi2)zi2 + 2u(zi2)) + u(zi2)} zi22${D}_{3}=\frac{1}{m+2}\sum _{i=1}^{n}\left\{\frac{1}{m}\left({u}^{\prime }\left({‖{z}_{i}‖}_{2}\right){‖{z}_{i}‖}_{2}+2u\left({‖{z}_{i}‖}_{2}\right)\right)+u\left({‖{z}_{i}‖}_{2}\right)\right\}{{‖{z}_{i}‖}_{2}}^{2}$
• D4 = i = 1n {1/mu(zi2)zi22v(zi22)} ${D}_{4}=\sum _{i=1}^{n}\left\{\frac{1}{m}u\left({‖{z}_{i}‖}_{2}\right){{‖{z}_{i}‖}_{2}}^{2}-v\left({{‖{z}_{i}‖}_{2}}^{2}\right)\right\}$
• hjl = i = 1nu(zi2)zijzil${h}_{jl}=\sum _{i=1}^{n}u\left({‖{z}_{i}‖}_{2}\right){z}_{ij}{z}_{il}$, for j > l$j>l$
• hjj = i = 1nu(zi2)(zij2zij22 / m)${h}_{jj}=\sum _{i=1}^{n}u\left({‖{z}_{i}‖}_{2}\right)\left({z}_{ij}^{2}-{{‖{z}_{ij}‖}_{2}}^{2}/m\right)$
• bj = i = 1nw(zi2)(xijbj)${b}_{j}=\sum _{i=1}^{n}w\left({‖{z}_{i}‖}_{2}\right)\left({x}_{ij}-{b}_{j}\right)$
• and BD$\mathit{BD}$ and BL$\mathit{BL}$ are suitable bounds.
nag_correg_robustm_corr_user_deriv (g02hl) is based on routines in ROBETH; see Marazzi (1987)

## References

Huber P J (1981) Robust Statistics Wiley
Marazzi A (1987) Weights for bounded influence regression in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 3 Institut Universitaire de Médecine Sociale et Préventive, Lausanne

## Parameters

### Compulsory Input Parameters

1:     ucv – function handle or string containing name of m-file
ucv must return the values of the functions u$u$ and w$w$ and their derivatives for a given value of its argument.
[user, u, ud, w, wd] = ucv(t, user)

Input Parameters

1:     t – double scalar
The argument for which the functions u$u$ and w$w$ must be evaluated.
2:     user – Any MATLAB object
ucv is called from nag_correg_robustm_corr_user_deriv (g02hl) with the object supplied to nag_correg_robustm_corr_user_deriv (g02hl).

Output Parameters

1:     user – Any MATLAB object
2:     u – double scalar
The value of the u$u$ function at the point t.
3:     ud – double scalar
The value of the derivative of the u$u$ function at the point t.
4:     w – double scalar
The value of the w$w$ function at the point t.
5:     wd – double scalar
The value of the derivative of the w$w$ function at the point t.
2:     indm – int64int32nag_int scalar
Indicates which form of the function v$v$ will be used.
indm = 1${\mathbf{indm}}=1$
v = 1$v=1$.
indm1${\mathbf{indm}}\ne 1$
v = u$v=u$.
3:     x(ldx,m) – double array
ldx, the first dimension of the array, must satisfy the constraint ldxn$\mathit{ldx}\ge {\mathbf{n}}$.
x(i,j)${\mathbf{x}}\left(\mathit{i},\mathit{j}\right)$ must contain the i$\mathit{i}$th observation on the j$\mathit{j}$th variable, for i = 1,2,,n$\mathit{i}=1,2,\dots ,n$ and j = 1,2,,m$\mathit{j}=1,2,\dots ,m$.
4:     a(m × (m + 1) / 2${\mathbf{m}}×\left({\mathbf{m}}+1\right)/2$) – double array
An initial estimate of the lower triangular real matrix A$A$. Only the lower triangular elements must be given and these should be stored row-wise in the array.
The diagonal elements must be 0$\text{}\ne 0$, and in practice will usually be > 0$\text{}>0$. If the magnitudes of the columns of X$X$ are of the same order, the identity matrix will often provide a suitable initial value for A$A$. If the columns of X$X$ are of different magnitudes, the diagonal elements of the initial value of A$A$ should be approximately inversely proportional to the magnitude of the columns of X$X$.
Constraint: a(j × (j1) / 2 + j)0.0${\mathbf{a}}\left(\mathit{j}×\left(\mathit{j}-1\right)/2+\mathit{j}\right)\ne 0.0$, for j = 1,2,,m$\mathit{j}=1,2,\dots ,m$.
5:     theta(m) – double array
m, the dimension of the array, must satisfy the constraint 1mn$1\le {\mathbf{m}}\le {\mathbf{n}}$.
An initial estimate of the location parameter, θj${\theta }_{\mathit{j}}$, for j = 1,2,,m$\mathit{j}=1,2,\dots ,m$.
In many cases an initial estimate of θj = 0${\theta }_{\mathit{j}}=0$, for j = 1,2,,m$\mathit{j}=1,2,\dots ,m$, will be adequate. Alternatively medians may be used as given by nag_univar_robust_1var_median (g07da).

### Optional Input Parameters

1:     user – Any MATLAB object
user is not used by nag_correg_robustm_corr_user_deriv (g02hl), but is passed to ucv. Note that for large objects it may be more efficient to use a global variable which is accessible from the m-files than to use user.
2:     n – int64int32nag_int scalar
Default: The first dimension of the array x.
n$n$, the number of observations.
Constraint: n > 1${\mathbf{n}}>1$.
3:     m – int64int32nag_int scalar
Default: The dimension of the array theta and the second dimension of the array x. (An error is raised if these dimensions are not equal.)
m$m$, the number of columns of the matrix X$X$, i.e., number of independent variables.
Constraint: 1mn$1\le {\mathbf{m}}\le {\mathbf{n}}$.
4:     bl – double scalar
The magnitude of the bound for the off-diagonal elements of Sk${S}_{k}$, BL$BL$.
Default: 0.9$0.9$
Constraint: bl > 0.0${\mathbf{bl}}>0.0$.
5:     bd – double scalar
The magnitude of the bound for the diagonal elements of Sk${S}_{k}$, BD$BD$.
Default: 0.9$0.9$
Constraint: bd > 0.0${\mathbf{bd}}>0.0$.
6:     maxit – int64int32nag_int scalar
The maximum number of iterations that will be used during the calculation of A$A$.
Default: 150$150$
Constraint: maxit > 0${\mathbf{maxit}}>0$.
7:     nitmon – int64int32nag_int scalar
Indicates the amount of information on the iteration that is printed.
nitmon > 0${\mathbf{nitmon}}>0$
The value of A$A$, θ$\theta$ and δ$\delta$ (see Section [Accuracy]) will be printed at the first and every nitmon iterations.
nitmon0${\mathbf{nitmon}}\le 0$
No iteration monitoring is printed.
When printing occurs the output is directed to the current advisory message unit (see nag_file_set_unit_advisory (x04ab)).
Default: 0$0$
8:     tol – double scalar
The relative precision for the final estimates of the covariance matrix. Iteration will stop when maximum δ$\delta$ (see Section [Accuracy]) is less than tol.
Default: 5e-5$5e-5$
Constraint: tol > 0.0${\mathbf{tol}}>0.0$.

ruser ldx wk

### Output Parameters

1:     user – Any MATLAB object
2:     cov(m × (m + 1) / 2${\mathbf{m}}×\left({\mathbf{m}}+1\right)/2$) – double array
Contains a robust estimate of the covariance matrix, C$C$. The upper triangular part of the matrix C$C$ is stored packed by columns (lower triangular stored by rows), Cij${C}_{ij}$ is returned in cov((j × (j1) / 2 + i))${\mathbf{cov}}\left(\left(j×\left(j-1\right)/2+i\right)\right)$, ij$i\le j$.
3:     a(m × (m + 1) / 2${\mathbf{m}}×\left({\mathbf{m}}+1\right)/2$) – double array
The lower triangular elements of the inverse of the matrix A$A$, stored row-wise.
4:     wt(n) – double array
wt(i)${\mathbf{wt}}\left(\mathit{i}\right)$ contains the weights, wti = u(zi2)${\mathit{wt}}_{\mathit{i}}=u\left({‖{z}_{\mathit{i}}‖}_{2}\right)$, for i = 1,2,,n$\mathit{i}=1,2,\dots ,n$.
5:     theta(m) – double array
Contains the robust estimate of the location parameter, θj${\theta }_{\mathit{j}}$, for j = 1,2,,m$\mathit{j}=1,2,\dots ,m$.
6:     nit – int64int32nag_int scalar
The number of iterations performed.
7:     ifail – int64int32nag_int scalar
${\mathrm{ifail}}={\mathbf{0}}$ unless the function detects an error (see [Error Indicators and Warnings]).

## Error Indicators and Warnings

Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

ifail = 1${\mathbf{ifail}}=1$
 On entry, n ≤ 1${\mathbf{n}}\le 1$, or m < 1${\mathbf{m}}<1$, or n < m${\mathbf{n}}<{\mathbf{m}}$, or ldx < n$\mathit{ldx}<{\mathbf{n}}$.
ifail = 2${\mathbf{ifail}}=2$
 On entry, tol ≤ 0.0${\mathbf{tol}}\le 0.0$, or maxit ≤ 0${\mathbf{maxit}}\le 0$, or diagonal element of a = 0.0${\mathbf{a}}=0.0$, or bl ≤ 0.0${\mathbf{bl}}\le 0.0$, or bd ≤ 0.0${\mathbf{bd}}\le 0.0$.
ifail = 3${\mathbf{ifail}}=3$
A column of x has a constant value.
ifail = 4${\mathbf{ifail}}=4$
Value of u or w returned by ucv < 0${\mathbf{ucv}}<0$.
ifail = 5${\mathbf{ifail}}=5$
The function has failed to converge in maxit iterations.
W ifail = 6${\mathbf{ifail}}=6$
One of the following is zero: D1${D}_{1}$, D2${D}_{2}$ or D3${D}_{3}$.
This may be caused by the functions u$u$ or w$w$ being too strict for the current estimate of A$A$ (or C$C$). You should try either a larger initial estimate of A$A$ or make u$u$ and w$w$ less strict.

## Accuracy

On successful exit the accuracy of the results is related to the value of tol; see Section [Parameters]. At an iteration let
 (i) d1 = $d1=\text{}$ the maximum value of |sjl|$|{s}_{jl}|$ (ii) d2 = $d2=\text{}$ the maximum absolute change in wt(i)$wt\left(i\right)$ (iii) d3 = $d3=\text{}$ the maximum absolute relative change in θj${\theta }_{j}$
and let δ = max (d1,d2,d3)$\delta =\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(d1,d2,d3\right)$. Then the iterative procedure is assumed to have converged when δ < tol$\delta <{\mathbf{tol}}$

The existence of A$A$ will depend upon the function u$u$ (see Marazzi (1987)); also if X$X$ is not of full rank a value of A$A$ will not be found. If the columns of X$X$ are almost linearly related, then convergence will be slow.

## Example

```function nag_correg_robustm_corr_user_deriv_example
indm = int64(1);
x = [3.4, 6.9, 12.2;
6.4, 2.5, 15.1;
4.9, 5.5, 14.2;
7.3, 1.9, 18.2;
8.8, 3.6, 11.7;
8.4, 1.3, 17.9;
5.3, 3.1, 15;
2.7, 8.1, 7.7;
6.1, 3, 21.9;
5.3, 2.2, 13.9];
a = [1;
0;
1;
0;
0;
1];
theta = [0;
0;
0];
user = [4,2];
[user, covar, aOut, wt, thetaOut, nit, ifail] = ...
nag_correg_robustm_corr_user_deriv(@ucv, indm, x, a, theta, 'user', user)

function [user, u, ud, w, wd] = ucv(t, user)
cu = user(1);
u = 1.0;
ud = 0.0;
if (t ~= 0)
t2 = t*t;
if (t2 > cu)
u = cu/t2;
ud = -2.0*u/t;
end
end
% w function and derivative
cw = user(2);
if (t > cw)
w = cw/t;
wd = -w/t;
else
w = 1.0;
wd = 0.0;
end
```
```

user =

4     2

covar =

3.2778
-3.6918
5.2841
4.7391
-6.4086
11.8371

aOut =

0.5523
1.0614
0.9424
-0.1881
0.4776
0.5021

wt =

1.0000
1.0000
1.0000
1.0000
0.2339
1.0000
1.0000
0.9385
0.4012
0.7579

thetaOut =

5.6998
3.8636
14.7036

nit =

25

ifail =

0

```
```function g02hl_example
indm = int64(1);
x = [3.4, 6.9, 12.2;
6.4, 2.5, 15.1;
4.9, 5.5, 14.2;
7.3, 1.9, 18.2;
8.8, 3.6, 11.7;
8.4, 1.3, 17.9;
5.3, 3.1, 15;
2.7, 8.1, 7.7;
6.1, 3, 21.9;
5.3, 2.2, 13.9];
a = [1;
0;
1;
0;
0;
1];
theta = [0;
0;
0];
user = [4,2];
[user, covar, aOut, wt, thetaOut, nit, ifail] = ...
g02hl(@ucv, indm, x, a, theta, 'user', user)

function [user, u, ud, w, wd] = ucv(t, user)
cu = user(1);
u = 1.0;
ud = 0.0;
if (t ~= 0)
t2 = t*t;
if (t2 > cu)
u = cu/t2;
ud = -2.0*u/t;
end
end
% w function and derivative
cw = user(2);
if (t > cw)
w = cw/t;
wd = -w/t;
else
w = 1.0;
wd = 0.0;
end
```
```

user =

4     2

covar =

3.2778
-3.6918
5.2841
4.7391
-6.4086
11.8371

aOut =

0.5523
1.0614
0.9424
-0.1881
0.4776
0.5021

wt =

1.0000
1.0000
1.0000
1.0000
0.2339
1.0000
1.0000
0.9385
0.4012
0.7579

thetaOut =

5.6998
3.8636
14.7036

nit =

25

ifail =

0

```

Chapter Contents
Chapter Introduction
NAG Toolbox

© The Numerical Algorithms Group Ltd, Oxford, UK. 2009–2013