Integer type:  int32  int64  nag_int  show int32  show int32  show int64  show int64  show nag_int  show nag_int

Chapter Contents
Chapter Introduction
NAG Toolbox

# NAG Toolbox: nag_univar_robust_1var_mestim_wgt (g07dc)

## Purpose

nag_univar_robust_1var_mestim_wgt (g07dc) computes an M$M$-estimate of location with (optional) simultaneous estimation of scale, where you provide the weight functions.

## Syntax

[theta, sigma, rs, nit, wrk, ifail] = g07dc(chi, psi, isigma, x, beta, theta, sigma, tol, 'n', n, 'maxit', maxit)
[theta, sigma, rs, nit, wrk, ifail] = nag_univar_robust_1var_mestim_wgt(chi, psi, isigma, x, beta, theta, sigma, tol, 'n', n, 'maxit', maxit)

## Description

The data consists of a sample of size n$n$, denoted by x1,x2,,xn${x}_{1},{x}_{2},\dots ,{x}_{n}$, drawn from a random variable X$X$.
The xi${x}_{i}$ are assumed to be independent with an unknown distribution function of the form,
 F((xi − θ) / σ) $F((xi-θ)/σ)$
where θ$\theta$ is a location parameter, and σ$\sigma$ is a scale parameter. M$M$-estimators of θ$\theta$ and σ$\sigma$ are given by the solution to the following system of equations;
 n ∑ ψ((xi − θ̂) / σ̂) i = 1
= 0
 n ∑ χ((xi − θ̂) / σ̂) i = 1
= (n1)β
$∑i=1nψ((xi-θ^)/σ^) = 0 ∑i=1nχ((xi-θ^)/σ^) = (n-1)β$
where ψ$\psi$ and χ$\chi$ are user-supplied weight functions, and β$\beta$ is a constant. Optionally the second equation can be omitted and the first equation is solved for θ̂$\stackrel{^}{\theta }$ using an assigned value of σ = σc$\sigma ={\sigma }_{c}$.
The constant β$\beta$ should be chosen so that σ̂$\stackrel{^}{\sigma }$ is an unbiased estimator when xi${x}_{\mathit{i}}$, for i = 1,2,,n$\mathit{i}=1,2,\dots ,n$ has a Normal distribution. To achieve this the value of β$\beta$ is calculated as:
 ∞ β = E(χ) = ∫ χ(z)1/(sqrt(2π))exp{( − z2)/2}dz − ∞
$β=E(χ)=∫-∞∞χ(z)12πexp{-z22}dz$
The values of ψ ((xiθ̂)/(σ̂)) σ̂$\psi \left(\frac{{x}_{i}-\stackrel{^}{\theta }}{\stackrel{^}{\sigma }}\right)\stackrel{^}{\sigma }$ are known as the Winsorized residuals.
The equations are solved by a simple iterative procedure, suggested by Huber:
σ̂k =
 sqrt(1/(β(n − 1)) (n ) ∑ χ((xi − θ̂k − 1)/(σ̂k − 1))i = 1 σ̂k − 12)
$σ^k=1β(n-1) (∑i=1nχ (xi-θ^k-1σ^k-1) ) σ^k-12$
and
 n θ̂k = θ̂k − 1 + 1/n ∑ ψ((xi − θ̂k − 1)/(σ̂k))σ̂k i = 1
$θ^k=θ^k- 1+1n ∑i= 1nψ (xi-θ^k- 1σ^k) σ^k$
or
 σ̂k = σc $σ^k=σc$
if σ$\sigma$ is fixed.
The initial values for θ̂$\stackrel{^}{\theta }$ and σ̂$\stackrel{^}{\sigma }$ may be user-supplied or calculated within nag_univar_robust_1var_mestim (g07db) as the sample median and an estimate of σ$\sigma$ based on the median absolute deviation respectively.
nag_univar_robust_1var_mestim_wgt (g07dc) is based upon function LYHALG within the ROBETH library, see Marazzi (1987).

## References

Hampel F R, Ronchetti E M, Rousseeuw P J and Stahel W A (1986) Robust Statistics. The Approach Based on Influence Functions Wiley
Huber P J (1981) Robust Statistics Wiley
Marazzi A (1987) Subroutines for robust estimation of location and scale in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 1 Institut Universitaire de Médecine Sociale et Préventive, Lausanne

## Parameters

### Compulsory Input Parameters

1:     chi – function handle or string containing name of m-file
chi must return the value of the weight function χ$\chi$ for a given value of its argument. The value of χ$\chi$ must be non-negative.
[result] = chi(t)

Input Parameters

1:     t – double scalar
The argument for which chi must be evaluated.

Output Parameters

1:     result – double scalar
The result of the function.
2:     psi – function handle or string containing name of m-file
psi must return the value of the weight function ψ$\psi$ for a given value of its argument.
[result] = psi(t)

Input Parameters

1:     t – double scalar
The argument for which psi must be evaluated.

Output Parameters

1:     result – double scalar
The result of the function.
3:     isigma – int64int32nag_int scalar
The value assigned to isigma determines whether σ̂$\stackrel{^}{\sigma }$ is to be simultaneously estimated.
isigma = 0${\mathbf{isigma}}=0$
The estimation of σ̂$\stackrel{^}{\sigma }$ is bypassed and sigma is set equal to σc${\sigma }_{c}$.
isigma = 1${\mathbf{isigma}}=1$
σ̂$\stackrel{^}{\sigma }$ is estimated simultaneously.
4:     x(n) – double array
n, the dimension of the array, must satisfy the constraint n > 1${\mathbf{n}}>1$.
The vector of observations, x1,x2,,xn${x}_{1},{x}_{2},\dots ,{x}_{n}$.
5:     beta – double scalar
The value of the constant β$\beta$ of the chosen chi function.
Constraint: beta > 0.0${\mathbf{beta}}>0.0$.
6:     theta – double scalar
If sigma > 0${\mathbf{sigma}}>0$, then theta must be set to the required starting value of the estimate of the location parameter θ̂$\stackrel{^}{\theta }$. A reasonable initial value for θ̂$\stackrel{^}{\theta }$ will often be the sample mean or median.
7:     sigma – double scalar
The role of sigma depends on the value assigned to isigma as follows.
If isigma = 1${\mathbf{isigma}}=1$, sigma must be assigned a value which determines the values of the starting points for the calculation of θ̂$\stackrel{^}{\theta }$ and σ̂$\stackrel{^}{\sigma }$. If sigma0.0${\mathbf{sigma}}\le 0.0$, then nag_univar_robust_1var_mestim_wgt (g07dc) will determine the starting points of θ̂$\stackrel{^}{\theta }$ and σ̂$\stackrel{^}{\sigma }$. Otherwise, the value assigned to sigma will be taken as the starting point for σ̂$\stackrel{^}{\sigma }$, and theta must be assigned a relevant value before entry, see above.
If isigma = 0${\mathbf{isigma}}=0$, sigma must be assigned a value which determines the values of σc${\sigma }_{c}$, which is held fixed during the iterations, and the starting value for the calculation of θ̂$\stackrel{^}{\theta }$. If sigma0${\mathbf{sigma}}\le 0$, then nag_univar_robust_1var_mestim_wgt (g07dc) will determine the value of σc${\sigma }_{c}$ as the median absolute deviation adjusted to reduce bias (see nag_univar_robust_1var_median (g07da)) and the starting point for θ$\theta$. Otherwise, the value assigned to sigma will be taken as the value of σc${\sigma }_{c}$ and theta must be assigned a relevant value before entry, see above.
8:     tol – double scalar
The relative precision for the final estimates. Convergence is assumed when the increments for theta, and sigma are less than tol × max (1.0,σk1)${\mathbf{tol}}×\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(1.0,{\sigma }_{k-1}\right)$.
Constraint: tol > 0.0${\mathbf{tol}}>0.0$.

### Optional Input Parameters

1:     n – int64int32nag_int scalar
Default: The dimension of the array x.
n$n$, the number of observations.
Constraint: n > 1${\mathbf{n}}>1$.
2:     maxit – int64int32nag_int scalar
The maximum number of iterations that should be used during the estimation.
Default: 50$50$
Constraint: maxit > 0${\mathbf{maxit}}>0$.

None.

### Output Parameters

1:     theta – double scalar
The M$M$-estimate of the location parameter θ̂$\stackrel{^}{\theta }$.
2:     sigma – double scalar
The M$M$-estimate of the scale parameter σ̂$\stackrel{^}{\sigma }$, if isigma was assigned the value 1$1$ on entry, otherwise sigma will contain the initial fixed value σc${\sigma }_{c}$.
3:     rs(n) – double array
The Winsorized residuals.
4:     nit – int64int32nag_int scalar
The number of iterations that were used during the estimation.
5:     wrk(n) – double array
If sigma0.0${\mathbf{sigma}}\le 0.0$ on entry, wrk will contain the n$n$ observations in ascending order.
6:     ifail – int64int32nag_int scalar
${\mathrm{ifail}}={\mathbf{0}}$ unless the function detects an error (see [Error Indicators and Warnings]).

## Error Indicators and Warnings

Errors or warnings detected by the function:
ifail = 1${\mathbf{ifail}}=1$
 On entry, n ≤ 1${\mathbf{n}}\le 1$, or maxit ≤ 0${\mathbf{maxit}}\le 0$, or tol ≤ 0.0${\mathbf{tol}}\le 0.0$, or isigma ≠ 0${\mathbf{isigma}}\ne 0$ or 1$1$.
ifail = 2${\mathbf{ifail}}=2$
 On entry, beta ≤ 0.0${\mathbf{beta}}\le 0.0$.
ifail = 3${\mathbf{ifail}}=3$
 On entry, all elements of the input array x are equal.
ifail = 4${\mathbf{ifail}}=4$
sigma, the current estimate of σ$\sigma$, is zero or negative. This error exit is very unlikely, although it may be caused by too large an initial value of sigma.
ifail = 5${\mathbf{ifail}}=5$
The number of iterations required exceeds maxit.
ifail = 6${\mathbf{ifail}}=6$
On completion of the iterations, the Winsorized residuals were all zero. This may occur when using the isigma = 0${\mathbf{isigma}}=0$ option with a redescending ψ$\psi$ function, i.e., ψ = 0$\psi =0$ if |t| > τ$|t|>\tau$, for some positive constant τ$\tau$.
If the given value of σ$\sigma$ is too small, then the standardized residuals (xiθ̂k)/(σc) $\frac{{x}_{i}-{\stackrel{^}{\theta }}_{k}}{{\sigma }_{c}}$, will be large and all the residuals may fall into the region for which ψ(t) = 0$\psi \left(t\right)=0$. This may incorrectly terminate the iterations thus making theta and sigma invalid.
Re-enter the function with a larger value of σc${\sigma }_{c}$ or with isigma = 1${\mathbf{isigma}}=1$.
ifail = 7${\mathbf{ifail}}=7$
The value returned by the chi function is negative.

## Accuracy

On successful exit the accuracy of the results is related to the value of tol, see Section [Parameters].

Standard forms of the functions ψ$\psi$ and χ$\chi$ are given in Hampel et al. (1986), Huber (1981) and Marazzi (1987). nag_univar_robust_1var_mestim (g07db) calculates M$M$-estimates using some standard forms for ψ$\psi$ and χ$\chi$.
When you supply the initial values, care has to be taken over the choice of the initial value of σ$\sigma$. If too small a value is chosen then initial values of the standardized residuals (xiθ̂k)/σ $\frac{{x}_{i}-{\stackrel{^}{\theta }}_{k}}{\sigma }$ will be large. If the redescending ψ$\psi$ functions are used, i.e., ψ = 0$\psi =0$ if |t| > τ$|t|>\tau$, for some positive constant τ$\tau$, then these large values are Winsorized as zero. If a sufficient number of the residuals fall into this category then a false solution may be returned, see page 152 of Hampel et al. (1986).

## Example

```function nag_univar_robust_1var_mestim_wgt_example
isigma = int64(1);
x = [13;
11;
16;
5;
3;
18;
9;
8;
6;
27;
7];
beta = 0.3892326;
theta = 0;
sigma = -1;
tol = 0.0001;
[thetaOut, sigmaOut, rs, nit, wrk, ifail] = ...
nag_univar_robust_1var_mestim_wgt(@chi, @psi, isigma, x, beta, theta, sigma, tol)

function [result] = chi(t)
ps = min(1.5, abs(t));
result = ps*ps/2;
function [result] = psi(t)

if abs(t) < 4.5
if abs(t) < 3
result=min(1.5, abs(t));
else
result=1.5*(4.5-abs(t))/1.5;
end
if t < 0
result = -result;
end
else
result=0;
end
```
```

thetaOut =

10.5487

sigmaOut =

6.3247

rs =

2.4513
0.4513
5.4513
-5.5487
-7.5487
7.4513
-1.5487
-2.5487
-4.5487
16.4513
-3.5487

nit =

8

wrk =

3
5
6
7
8
9
11
13
16
18
27

ifail =

0

```
```function g07dc_example
isigma = int64(1);
x = [13;
11;
16;
5;
3;
18;
9;
8;
6;
27;
7];
beta = 0.3892326;
theta = 0;
sigma = -1;
tol = 0.0001;
[thetaOut, sigmaOut, rs, nit, wrk, ifail] = ...
g07dc(@chi, @psi, isigma, x, beta, theta, sigma, tol)

function [result] = chi(t)
ps = min(1.5, abs(t));
result = ps*ps/2;
function [result] = psi(t)

if abs(t) < 4.5
if abs(t) < 3
result=min(1.5, abs(t));
else
result=1.5*(4.5-abs(t))/1.5;
end
if t < 0
result = -result;
end
else
result=0;
end
```
```

thetaOut =

10.5487

sigmaOut =

6.3247

rs =

2.4513
0.4513
5.4513
-5.5487
-7.5487
7.4513
-1.5487
-2.5487
-4.5487
16.4513
-3.5487

nit =

8

wrk =

3
5
6
7
8
9
11
13
16
18
27

ifail =

0

```