# NAG Toolbox: nag_correg_pls_fit (g02lc)

## Purpose

nag_correg_pls_fit (g02lc) calculates parameter estimates for a given number of factors given the output from an orthogonal scores PLS regression (nag_correg_pls_svd (g02la) or nag_correg_pls_wold (g02lb)).

## Syntax

[b, ob, vip, ifail] = g02lc(nfact, p, c, w, rcond, orig, xbar, ybar, iscale, xstd, ystd, vipopt, ycv, 'ip', ip, 'my', my, 'maxfac', maxfac)
[b, ob, vip, ifail] = nag_correg_pls_fit(nfact, p, c, w, rcond, orig, xbar, ybar, iscale, xstd, ystd, vipopt, ycv, 'ip', ip, 'my', my, 'maxfac', maxfac)

## Description

The parameter estimates $B$ for an $l$-factor orthogonal scores PLS model with $m$ predictor variables and $r$ response variables are given by
 $B=W{\left({P}^{\mathrm{T}}W\right)}^{-1}{C}^{\mathrm{T}},\quad B\in {\mathbb{R}}^{m\times r},$
where $W$ is the $m$ by $k$ ($\ge l$) matrix of $x$-weights; $P$ is the $m$ by $k$ matrix of $x$-loadings; and $C$ is the $r$ by $k$ matrix of $y$-loadings for a fitted PLS model.
The parameter estimates $B$ are for centred, and possibly scaled, predictor data ${X}_{1}$ and response data ${Y}_{1}$. Parameter estimates may also be given for the predictor data $X$ and response data $Y$.
Optionally, nag_correg_pls_fit (g02lc) will calculate variable influence on projection (VIP) statistics, see Wold (1994).
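The parameter-estimate formula above can be sketched in plain MATLAB. This is an illustration only, not the routine's implementation: the small matrices are made-up data, the names `Wl`, `Pl`, `Cl`, `rc` and `tol` are hypothetical, and the sketch assumes that an $l$-factor fit uses the first $l$ columns of $W$, $P$ and $C$. `pinv` with an absolute tolerance stands in for the SVD-based inverse controlled by rcond.

```
% Hypothetical data: m = 3 predictors, k = 2 available factors, r = 1 response.
W = [1 0; 0 1; 0 0];          % m-by-k x-weights
P = [2 0; 0 4; 0 0];          % m-by-k x-loadings
C = [1 2];                    % r-by-k y-loadings
l  = 2;                       % number of factors to use (nfact)
rc = 0.005;                   % relative singular-value threshold (rcond)

% Assumption: only the first l factors enter the l-factor estimates.
Wl = W(:, 1:l);
Pl = P(:, 1:l);
Cl = C(:, 1:l);

A   = Pl' * Wl;               % l-by-l matrix P'W
tol = rc * max(svd(A));       % singular values below tol are treated as zero
B   = Wl * pinv(A, tol) * Cl';  % m-by-r parameter estimates
```

For this toy input `A` is diagonal, so the result is easy to check by hand.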

## References

Wold S (1994) PLS for multivariate linear modelling QSAR: chemometric methods in molecular design Methods and Principles in Medicinal Chemistry (ed van de Waterbeemd H) Verlag-Chemie

## Parameters

### Compulsory Input Parameters

1:     nfact – int64/int32/nag_int scalar
$l$, the number of factors to include in the calculation of parameter estimates.
Constraint: $1\le {\mathbf{nfact}}\le {\mathbf{maxfac}}$.
2:     p(ldp,maxfac) – double array
ldp, the first dimension of the array, must satisfy the constraint $\mathit{ldp}\ge {\mathbf{ip}}$.
$x$-loadings as returned from nag_correg_pls_svd (g02la) or nag_correg_pls_wold (g02lb).
3:     c(ldc,maxfac) – double array
ldc, the first dimension of the array, must satisfy the constraint $\mathit{ldc}\ge {\mathbf{my}}$.
$y$-loadings as returned from nag_correg_pls_svd (g02la) or nag_correg_pls_wold (g02lb).
4:     w(ldw,maxfac) – double array
ldw, the first dimension of the array, must satisfy the constraint $\mathit{ldw}\ge {\mathbf{ip}}$.
$x$-weights as returned from nag_correg_pls_svd (g02la) or nag_correg_pls_wold (g02lb).
5:     rcond – double scalar
Singular values of ${P}^{\mathrm{T}}W$ less than rcond times the maximum singular value are treated as zero when calculating parameter estimates. If rcond is negative, a value of $0.005$ is used.
6:     orig – int64/int32/nag_int scalar
Indicates how parameter estimates are calculated.
${\mathbf{orig}}=-1$
Parameter estimates for the centred, and possibly scaled, data.
${\mathbf{orig}}=1$
Parameter estimates for the original data.
Constraint: ${\mathbf{orig}}=-1$ or $1$.
7:     xbar(ip) – double array
ip, the dimension of the array, must satisfy the constraint ${\mathbf{ip}}>1$.
If ${\mathbf{orig}}=1$, mean values of predictor variables in the model; otherwise xbar is not referenced.
8:     ybar(my) – double array
my, the dimension of the array, must satisfy the constraint ${\mathbf{my}}\ge 1$.
If ${\mathbf{orig}}=1$, mean value of each response variable in the model; otherwise ybar is not referenced.
9:     iscale – int64/int32/nag_int scalar
If ${\mathbf{orig}}=1$, iscale must take the value supplied to either nag_correg_pls_svd (g02la) or nag_correg_pls_wold (g02lb); otherwise iscale is not referenced.
Constraint: if ${\mathbf{orig}}=1$, ${\mathbf{iscale}}=-1$, $1$ or $2$.
10:   xstd(ip) – double array
ip, the dimension of the array, must satisfy the constraint ${\mathbf{ip}}>1$.
If ${\mathbf{orig}}=1$ and ${\mathbf{iscale}}\ne -1$, the scalings of predictor variables in the model as returned from either nag_correg_pls_svd (g02la) or nag_correg_pls_wold (g02lb); otherwise xstd is not referenced.
11:   ystd(my) – double array
my, the dimension of the array, must satisfy the constraint ${\mathbf{my}}\ge 1$.
If ${\mathbf{orig}}=1$ and ${\mathbf{iscale}}\ne -1$, the scalings of response variables as returned from either nag_correg_pls_svd (g02la) or nag_correg_pls_wold (g02lb); otherwise ystd is not referenced.
12:   vipopt – int64/int32/nag_int scalar
A flag that determines variable influence on projections (VIP) options.
${\mathbf{vipopt}}=0$
VIP are not calculated.
${\mathbf{vipopt}}=1$
VIP are calculated for predictor variables using the mean explained variance in responses.
${\mathbf{vipopt}}={\mathbf{my}}$
VIP are calculated for predictor variables for each response variable in the model.
Note that setting ${\mathbf{vipopt}}={\mathbf{my}}$ when ${\mathbf{my}}=1$ gives the same result as setting ${\mathbf{vipopt}}=1$ directly.
Constraint: ${\mathbf{vipopt}}=0$, $1$ or ${\mathbf{my}}$.
13:   ycv(ldycv,my) – double array
ldycv, the first dimension of the array, must satisfy the constraint if ${\mathbf{vipopt}}\ne 0$, $\mathit{ldycv}\ge {\mathbf{nfact}}$.
If ${\mathbf{vipopt}}\ne 0$, ${\mathbf{ycv}}\left(\mathit{i},\mathit{j}\right)$ is the cumulative percentage of variance of the $\mathit{j}$th response variable explained by the first $\mathit{i}$ factors, for $\mathit{i}=1,2,\dots ,{\mathbf{nfact}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{my}}$; otherwise ycv is not referenced.
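To show how w and ycv could combine into VIP statistics, here is a minimal sketch for the single-response case. It assumes the commonly used VIP definition (squared normalised weights averaged over factors, weighted by the variance each factor explains); this page does not state the formula, so treat that as an assumption rather than a description of the routine. The data are the first two columns of w and the first two entries of ycv from the Example section, and the names `w2`, `ycv2`, `ssy`, `wn` and `vip_sketch` are hypothetical.

```
% First nfact = 2 columns of the x-weights from the example on this page.
w2 = [-0.15764,  -0.15935;
       0.08568,  -0.0001524;
      -0.16931,  -0.37431;
       0.12153,   0.20589;
       0.071133,  0.055884;
       0.065188,  0.2417;
      -0.42481,  -0.0018798;
       0.6537,    0.16725;
       0.28504,   0.36549;
      -0.29341,   0.50464;
       0.29829,  -0.36979;
      -0.20313,   0.41952;
       0.056905, -0.023197;
       0.056905, -0.023197;
      -0.056905,  0.023197];
ycv2 = [89.638060; 97.476270];   % cumulative % variance explained, factors 1..2

m   = size(w2, 1);               % number of predictor variables
ssy = diff([0; ycv2]);           % variance explained by each factor alone
wn  = w2 * diag(1 ./ sqrt(sum(w2.^2, 1)));   % unit-length weight columns

% Assumed VIP definition: weighted mean of squared weights, scaled by m.
vip_sketch = sqrt(m * (wn.^2 * ssy) / sum(ssy));
```

Under these assumptions the sketch agrees with the vip values printed in the Example section to the displayed precision for the entries checked by hand (predictors 1, 8 and 13).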

### Optional Input Parameters

1:     ip – int64/int32/nag_int scalar
Default: The dimension of the arrays xbar, xstd and the first dimension of the arrays p, w. (An error is raised if these dimensions are not equal.)
$m$, the number of predictor variables in the fitted model.
Constraint: ${\mathbf{ip}}>1$.
2:     my – int64/int32/nag_int scalar
Default: The dimension of the arrays ybar, ystd and the first dimension of the array c and the second dimension of the array ycv. (An error is raised if these dimensions are not equal.)
$r$, the number of response variables.
Constraint: ${\mathbf{my}}\ge 1$.
3:     maxfac – int64/int32/nag_int scalar
Default: The second dimension of the arrays p, c, w. (An error is raised if these dimensions are not equal.)
$k$, the number of factors available in the PLS model.
Constraint: $1\le {\mathbf{maxfac}}\le {\mathbf{ip}}$.

### Input Parameters Omitted from the MATLAB Interface

ldp ldc ldw ldb ldob ldycv ldvip

### Output Parameters

1:     b(ldb,my) – double array
$\mathit{ldb}\ge {\mathbf{ip}}$.
${\mathbf{b}}\left(\mathit{i},\mathit{j}\right)$ contains the parameter estimate for the $\mathit{i}$th predictor variable in the model for the $\mathit{j}$th response variable, for $\mathit{i}=1,2,\dots ,{\mathbf{ip}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{my}}$.
2:     ob(ldob,my) – double array
If ${\mathbf{orig}}=1$, ${\mathbf{ob}}\left(1,\mathit{j}\right)$ contains the intercept value for the $\mathit{j}$th response variable, and ${\mathbf{ob}}\left(\mathit{i}+1,\mathit{j}\right)$ contains the parameter estimate on the original scale for the $\mathit{i}$th predictor variable in the model, for $\mathit{i}=1,2,\dots ,{\mathbf{ip}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{my}}$. Otherwise ob is not referenced.
3:     vip(ldvip,vipopt) – double array
If ${\mathbf{vipopt}}=1$, ${\mathbf{vip}}\left(\mathit{i},1\right)$ contains the VIP statistic for the $\mathit{i}$th predictor variable in the model for all response variables, for $\mathit{i}=1,2,\dots ,{\mathbf{ip}}$.
If ${\mathbf{vipopt}}={\mathbf{my}}$, ${\mathbf{vip}}\left(\mathit{i},\mathit{j}\right)$ contains the VIP statistic for the $\mathit{i}$th predictor variable in the model for the $\mathit{j}$th response variable, for $\mathit{i}=1,2,\dots ,{\mathbf{ip}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{my}}$.
Otherwise vip is not referenced.
4:     ifail – int64/int32/nag_int scalar
${\mathrm{ifail}}={\mathbf{0}}$ unless the function detects an error (see [Error Indicators and Warnings]).
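As an illustration of how b and ob relate when ${\mathbf{orig}}=1$ and ${\mathbf{iscale}}=1$, the sketch below back-transforms the scaled estimates using the usual centring and scaling algebra: each slope is multiplied by ystd(j)/xstd(i), and the intercept is ybar(j) minus the slopes applied to xbar. This relationship is an assumption inferred from the example output on this page, not a documented statement of the routine's method. The data are the printed (rounded) values from the Example section, and the names `b_scaled`, `slopes`, `intercept` and `ob_sketch` are hypothetical.

```
% Rounded values copied from the example on this page (ip = 15, my = 1).
b_scaled = [-0.1383 0.0572 -0.1906 0.1238 0.0591 0.0936 -0.2842 0.4713 ...
             0.2661 -0.0914 0.1226 -0.0488 0.0332 0.0332 -0.0332]';
xbar = [-2.6137 -2.3614 -1.0449 2.8614 0.3156 -0.2641 -0.3146 -1.1221 ...
         0.2401 0.4694 -1.9619 0.1691 2.5664 1.3741 -2.7821]';
xstd = [1.4956 1.3233 0.5829 0.7735 0.6247 0.7966 2.4113 2.0421 ...
        0.4678 0.8197 0.942 0.1735 1.0475 0.1359 1.3853]';
ybar = 0.452;
ystd = 0.9062;

% Assumed back-transformation: slope(i,j) = b(i,j) * ystd(j) / xstd(i).
slopes = diag(1 ./ xstd) * b_scaled * diag(ystd);

% Intercept for each response: ybar(j) - sum_i xbar(i) * slope(i,j).
intercept = ybar(:)' - xbar' * slopes;

% Same layout as ob: intercept in row 1, slopes in rows 2..ip+1.
ob_sketch = [intercept; slopes];
```

Because the inputs are rounded to four decimals, the sketch only matches the printed ob to about three decimal places.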

## Error Indicators and Warnings

Errors or warnings detected by the function:
${\mathbf{ifail}}=1$
 On entry, ${\mathbf{ip}}<2$, or ${\mathbf{my}}<1$, or ${\mathbf{orig}}\ne -1$ or $1$, or ${\mathbf{orig}}=1$ and ${\mathbf{iscale}}\ne -1$, $1$ or $2$, or ${\mathbf{vipopt}}\ne 0$, $1$ or ${\mathbf{my}}$.
${\mathbf{ifail}}=2$
 On entry, ${\mathbf{maxfac}}<1$ or ${\mathbf{maxfac}}>{\mathbf{ip}}$, or ${\mathbf{nfact}}<1$ or ${\mathbf{nfact}}>{\mathbf{maxfac}}$, or $\mathit{ldp}<{\mathbf{ip}}$, or $\mathit{ldc}<{\mathbf{my}}$, or $\mathit{ldw}<{\mathbf{ip}}$, or $\mathit{ldb}<{\mathbf{ip}}$, or ${\mathbf{orig}}=1$ and $\mathit{ldob}<{\mathbf{ip}}+1$, or $\mathit{ldycv}<{\mathbf{nfact}}$, or ${\mathbf{vipopt}}\ne 0$ and $\mathit{ldvip}<{\mathbf{ip}}$.

## Accuracy

The calculations are based on the singular value decomposition of ${P}^{\mathrm{T}}W$.

nag_correg_pls_fit (g02lc) allocates internally $l\left(l+r+4\right)+\mathrm{max}\left(2l,r\right)$ elements of double storage.

## Example

```
function nag_correg_pls_fit_example
nfact = int64(2);
p = [-0.6708, -1.0047, 0.6505, 0.6169;
0.4943, 0.1355, -0.901, -0.2388;
-0.4167, -1.9983, -0.5538, 0.8474;
0.393, 1.2441, -0.6967, -0.4336;
0.3267, 0.5838, -1.4088, -0.6323;
0.0145, 0.9607, 1.6594, 0.5361;
-2.4471, 0.3532, -1.1321, -1.3554;
3.5198, 0.6005, 0.2191, 0.038;
1.0973, 2.0635, -0.4074, -0.3522;
-2.4466, 2.564, -0.4806, 0.3819;
2.2732, -1.311, -0.7686, -1.8959;
-1.7987, 2.4088, -0.9475, -0.4727;
0.3629, 0.2241, -2.6332, 2.3739;
0.3629, 0.2241, -2.6332, 2.3739;
-0.3629, -0.2241, 2.6332, -2.3739];
c = [3.5425, 1.0475, 0.2548, 0.1866];
w = [-0.15764, -0.15935, 0.17774, 0.054029;
0.08568, -0.0001524, -0.12179, 0.10989;
-0.16931, -0.37431, 0.094348, 0.31878;
0.12153, 0.20589, -0.18144, -0.04461;
0.071133, 0.055884, -0.26916, 0.054912;
0.065188, 0.2417, 0.23365, -0.18849;
-0.42481, -0.0018798, -0.32413, -0.116;
0.6537, 0.16725, 0.21908, 0.25461;
0.28504, 0.36549, -0.19244, -0.1543;
-0.29341, 0.50464, -0.010952, 0.13881;
0.29829, -0.36979, -0.49942, -0.49355;
-0.20313, 0.41952, -0.25684, -0.075647;
0.056905, -0.023197, -0.30503, 0.39673;
0.056905, -0.023197, -0.30503, 0.39673;
-0.056905, 0.023197, 0.30503, -0.39673];
rcond = -1;
orig = int64(1);
xbar = [-2.6137;
-2.3614;
-1.0449;
2.8614;
0.3156;
-0.2641;
-0.3146;
-1.1221;
0.2401;
0.4694;
-1.9619;
0.1691;
2.5664;
1.3741;
-2.7821];
ybar = [0.452];
iscale = int64(1);
xstd = [1.4956;
1.3233;
0.5829;
0.7735;
0.6247;
0.7966;
2.4113;
2.0421;
0.4678;
0.8197;
0.942;
0.1735;
1.0475;
0.1359;
1.3853];
ystd = [0.9062];
vipopt = int64(1);
ycv = [89.638060; 97.476270; 97.939839; 98.188474];
[b, ob, vip, ifail] = ...
nag_correg_pls_fit(nfact, p, c, w, rcond, orig, xbar, ybar, iscale, xstd, ystd, vipopt, ycv)
```
```

b =

-0.1383
0.0572
-0.1906
0.1238
0.0591
0.0936
-0.2842
0.4713
0.2661
-0.0914
0.1226
-0.0488
0.0332
0.0332
-0.0332

ob =

-0.4374
-0.0838
0.0392
-0.2964
0.1451
0.0857
0.1065
-0.1068
0.2091
0.5155
-0.1011
0.1180
-0.2548
0.0287
0.2214
-0.0217

vip =

0.6111
0.3182
0.7513
0.5048
0.2712
0.3593
1.5777
2.4348
1.1322
1.2226
1.1799
0.8840
0.2129
0.2129
0.2129

ifail =

0

```
```
function g02lc_example
nfact = int64(2);
p = [-0.6708, -1.0047, 0.6505, 0.6169;
0.4943, 0.1355, -0.901, -0.2388;
-0.4167, -1.9983, -0.5538, 0.8474;
0.393, 1.2441, -0.6967, -0.4336;
0.3267, 0.5838, -1.4088, -0.6323;
0.0145, 0.9607, 1.6594, 0.5361;
-2.4471, 0.3532, -1.1321, -1.3554;
3.5198, 0.6005, 0.2191, 0.038;
1.0973, 2.0635, -0.4074, -0.3522;
-2.4466, 2.564, -0.4806, 0.3819;
2.2732, -1.311, -0.7686, -1.8959;
-1.7987, 2.4088, -0.9475, -0.4727;
0.3629, 0.2241, -2.6332, 2.3739;
0.3629, 0.2241, -2.6332, 2.3739;
-0.3629, -0.2241, 2.6332, -2.3739];
c = [3.5425, 1.0475, 0.2548, 0.1866];
w = [-0.15764, -0.15935, 0.17774, 0.054029;
0.08568, -0.0001524, -0.12179, 0.10989;
-0.16931, -0.37431, 0.094348, 0.31878;
0.12153, 0.20589, -0.18144, -0.04461;
0.071133, 0.055884, -0.26916, 0.054912;
0.065188, 0.2417, 0.23365, -0.18849;
-0.42481, -0.0018798, -0.32413, -0.116;
0.6537, 0.16725, 0.21908, 0.25461;
0.28504, 0.36549, -0.19244, -0.1543;
-0.29341, 0.50464, -0.010952, 0.13881;
0.29829, -0.36979, -0.49942, -0.49355;
-0.20313, 0.41952, -0.25684, -0.075647;
0.056905, -0.023197, -0.30503, 0.39673;
0.056905, -0.023197, -0.30503, 0.39673;
-0.056905, 0.023197, 0.30503, -0.39673];
rcond = -1;
orig = int64(1);
xbar = [-2.6137;
-2.3614;
-1.0449;
2.8614;
0.3156;
-0.2641;
-0.3146;
-1.1221;
0.2401;
0.4694;
-1.9619;
0.1691;
2.5664;
1.3741;
-2.7821];
ybar = [0.452];
iscale = int64(1);
xstd = [1.4956;
1.3233;
0.5829;
0.7735;
0.6247;
0.7966;
2.4113;
2.0421;
0.4678;
0.8197;
0.942;
0.1735;
1.0475;
0.1359;
1.3853];
ystd = [0.9062];
vipopt = int64(1);
ycv = [89.638060; 97.476270; 97.939839; 98.188474];
[b, ob, vip, ifail] = ...
g02lc(nfact, p, c, w, rcond, orig, xbar, ybar, iscale, xstd, ystd, vipopt, ycv)
```
```

b =

-0.1383
0.0572
-0.1906
0.1238
0.0591
0.0936
-0.2842
0.4713
0.2661
-0.0914
0.1226
-0.0488
0.0332
0.0332
-0.0332

ob =

-0.4374
-0.0838
0.0392
-0.2964
0.1451
0.0857
0.1065
-0.1068
0.2091
0.5155
-0.1011
0.1180
-0.2548
0.0287
0.2214
-0.0217

vip =

0.6111
0.3182
0.7513
0.5048
0.2712
0.3593
1.5777
2.4348
1.1322
1.2226
1.1799
0.8840
0.2129
0.2129
0.2129

ifail =

0

```