# NAG Library Routine Document: G02BYF

Note:  before using this routine, please read the Users' Note for your implementation to check the interpretation of bold italicised terms and other implementation-dependent details.

## 1  Purpose

G02BYF computes a partial correlation/variance-covariance matrix from a correlation or variance-covariance matrix computed by G02BXF.

## 2  Specification

 SUBROUTINE G02BYF ( M, NY, NX, ISZ, R, LDR, P, LDP, WK, IFAIL)
 INTEGER M, NY, NX, ISZ(M), LDR, LDP, IFAIL
 REAL (KIND=nag_wp) R(LDR,M), P(LDP,NY), WK(NY*NX+NX*(NX+1)/2)

## 3  Description

Partial correlation can be used to explore the association between pairs of random variables in the presence of other variables. For three variables, ${y}_{1}$, ${y}_{2}$ and ${x}_{3}$, the partial correlation coefficient between ${y}_{1}$ and ${y}_{2}$ given ${x}_{3}$ is computed as:
 $\frac{{r}_{12}-{r}_{13}{r}_{23}}{\sqrt{\left(1-{r}_{13}^{2}\right)\left(1-{r}_{23}^{2}\right)}},$
where ${r}_{ij}$ is the product-moment correlation coefficient between variables with subscripts $i$ and $j$. The partial correlation coefficient is a measure of the linear association between ${y}_{1}$ and ${y}_{2}$ having eliminated the effect due to both ${y}_{1}$ and ${y}_{2}$ being linearly associated with ${x}_{3}$. That is, it is a measure of association between ${y}_{1}$ and ${y}_{2}$ conditional upon fixed values of ${x}_{3}$. Like the full correlation coefficients the partial correlation coefficient takes a value in the range ($-1,1$) with the value $0$ indicating no association.
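The three-variable formula above can be evaluated directly. The following is a minimal Python sketch, not part of the NAG Library; the function name `partial_corr_3` and the input values are illustrative assumptions.

```python
import math

def partial_corr_3(r12, r13, r23):
    """Partial correlation of y1 and y2 given x3, from the three
    pairwise product-moment correlations (hypothetical helper)."""
    denom = math.sqrt((1.0 - r13 ** 2) * (1.0 - r23 ** 2))
    if denom == 0.0:
        # x3 is perfectly correlated with y1 or y2: the ratio is undefined.
        raise ValueError("partial correlation is undefined")
    return (r12 - r13 * r23) / denom

# Illustrative values only; they are not taken from any data set.
print(partial_corr_3(0.6, 0.5, 0.5))
```

When ${r}_{13}={r}_{23}=0$ the function returns ${r}_{12}$ unchanged, which matches the interpretation that nothing is removed if $x_3$ is uncorrelated with both $y$ variables.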
In general, let a set of variables be partitioned into two groups $Y$ and $X$ with ${n}_{y}$ variables in $Y$ and ${n}_{x}$ variables in $X$ and let the variance-covariance matrix of all ${n}_{y}+{n}_{x}$ variables be partitioned into,
 $\begin{pmatrix}{\Sigma }_{xx} & {\Sigma }_{xy}\\ {\Sigma }_{yx} & {\Sigma }_{yy}\end{pmatrix}.$
The variance-covariance of $Y$ conditional on fixed values of the $X$ variables is given by:
 ${\Sigma }_{y\mid x}={\Sigma }_{yy}-{\Sigma }_{yx}{\Sigma }_{xx}^{-1}{\Sigma }_{xy}.$
The partial correlation matrix is then computed by standardizing ${\Sigma }_{y\mid x}$,
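 $\mathrm{diag}{\left({\Sigma }_{y\mid x}\right)}^{-\frac{1}{2}}\,{\Sigma }_{y\mid x}\,\mathrm{diag}{\left({\Sigma }_{y\mid x}\right)}^{-\frac{1}{2}}.$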
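The general computation, conditional variance-covariance followed by standardization, can be sketched in plain Python. This is an illustration of the formulae only, not the routine's implementation: the helper names are hypothetical, the full symmetric matrix is assumed to be supplied, and ${\Sigma }_{xx}^{-1}{\Sigma }_{xy}$ is obtained here by Gaussian elimination rather than by a Cholesky factorization.

```python
import math

def solve(a, b):
    """Solve a z = b (a is n by n, b is n by k) by Gaussian
    elimination with partial pivoting."""
    n, k = len(a), len(b[0])
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[piv][col]) < 1e-12:
            raise ValueError("Sigma_xx is not of full rank")
        aug[col], aug[piv] = aug[piv], aug[col]
        for row in range(col + 1, n):
            f = aug[row][col] / aug[col][col]
            for c in range(col, n + k):
                aug[row][c] -= f * aug[col][c]
    z = [[0.0] * k for _ in range(n)]
    for i in range(n - 1, -1, -1):  # back substitution
        for j in range(k):
            s = aug[i][n + j] - sum(aug[i][c] * z[c][j] for c in range(i + 1, n))
            z[i][j] = s / aug[i][i]
    return z

def partial_cov(sigma, y_idx, x_idx):
    """Sigma_yy - Sigma_yx Sigma_xx^{-1} Sigma_xy (0-based index lists)."""
    sxx = [[sigma[i][j] for j in x_idx] for i in x_idx]
    sxy = [[sigma[i][j] for j in y_idx] for i in x_idx]
    z = solve(sxx, sxy)  # Sigma_xx^{-1} Sigma_xy, nx by ny
    return [[sigma[ya][yb]
             - sum(sigma[ya][xk] * z[k][b] for k, xk in enumerate(x_idx))
             for b, yb in enumerate(y_idx)]
            for ya in y_idx]

def standardize(s):
    """diag(s)^{-1/2} s diag(s)^{-1/2}."""
    d = [1.0 / math.sqrt(s[i][i]) for i in range(len(s))]
    return [[d[i] * s[i][j] * d[j] for j in range(len(s))] for i in range(len(s))]

# Illustrative correlation matrix: variables 0 and 1 form Y, variable 2 forms X.
sigma = [[1.0, 0.6, 0.5],
         [0.6, 1.0, 0.5],
         [0.5, 0.5, 1.0]]
cov = partial_cov(sigma, [0, 1], [2])
print(standardize(cov))
```

For this input the off-diagonal element of the standardized matrix equals $(0.6-0.25)/0.75$, the value given by the three-variable formula, which is a useful consistency check on the general expression.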
To test the hypothesis that a partial correlation is zero, under the assumption that the data has an approximately Normal distribution, a test similar to the test for the full correlation coefficient can be used. If $r$ is the computed partial correlation coefficient, then the appropriate $t$ statistic is
 $r\sqrt{\frac{n-{n}_{x}-2}{1-{r}^{2}}},$
which has approximately a Student's $t$-distribution with $n-{n}_{x}-2$ degrees of freedom, where $n$ is the number of observations from which the full correlation coefficients were computed.
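As a worked illustration of the test statistic, the sketch below computes $t$ and its degrees of freedom. The function name `partial_corr_t` and the input values are assumptions made for the example; obtaining a significance level from $t$ would require a Student's $t$-distribution function, which is not shown.

```python
import math

def partial_corr_t(r, n, nx):
    """Return (t, df) for testing that a partial correlation r is zero,
    where n is the number of observations and nx the number of fixed
    X variables (hypothetical helper)."""
    df = n - nx - 2
    if df <= 0:
        raise ValueError("need n > nx + 2 observations")
    if abs(r) >= 1.0:
        raise ValueError("statistic is undefined for |r| >= 1")
    return r * math.sqrt(df / (1.0 - r * r)), df

# Illustrative values: r = 0.5 from n = 15 observations with one X variable.
t, df = partial_corr_t(0.5, 15, 1)
print(t, df)  # 2.0 12
```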

## 4  References

Krzanowski W J (1990) Principles of Multivariate Analysis Oxford University Press
Morrison D F (1967) Multivariate Statistical Methods McGraw–Hill
Osborn J F (1979) Statistical Exercises in Medical Research Blackwell
Snedecor G W and Cochran W G (1967) Statistical Methods Iowa State University Press

## 5  Parameters

1:     M – INTEGER (Input)
On entry: the number of variables in the variance-covariance/correlation matrix given in R.
Constraint: ${\mathbf{M}}\ge 3$.
2:     NY – INTEGER (Input)
On entry: the number of $Y$ variables, ${n}_{y}$, for which partial correlation coefficients are to be computed.
Constraint: ${\mathbf{NY}}\ge 2$.
3:     NX – INTEGER (Input)
On entry: the number of $X$ variables, ${n}_{x}$, which are to be considered as fixed.
Constraints:
• ${\mathbf{NX}}\ge 1$;
• ${\mathbf{NY}}+{\mathbf{NX}}\le {\mathbf{M}}$.
4:     ISZ(M) – INTEGER array (Input)
On entry: indicates which variables belong to the sets $X$ and $Y$. For $\mathit{i}=1,2,\dots ,{\mathbf{M}}$:
${\mathbf{ISZ}}\left(i\right)<0$
The $i$th variable is a $Y$ variable.
${\mathbf{ISZ}}\left(i\right)>0$
The $i$th variable is an $X$ variable.
${\mathbf{ISZ}}\left(i\right)=0$
The $i$th variable is not included in the computations.
Constraints:
• exactly NY elements of ISZ must be $<0$;
• exactly NX elements of ISZ must be $>0$.
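The sign convention for ISZ can be illustrated by partitioning the variable indices. The sketch below is Python rather than Fortran, the helper `partition_isz` is hypothetical, and the indices it returns are 1-based to match the Fortran array.

```python
def partition_isz(isz):
    """Split 1-based variable indices into (Y, X, excluded)
    according to the sign of each element of isz."""
    y = [i for i, v in enumerate(isz, start=1) if v < 0]
    x = [i for i, v in enumerate(isz, start=1) if v > 0]
    excluded = [i for i, v in enumerate(isz, start=1) if v == 0]
    return y, x, excluded

# Illustrative setting for M = 4: variables 1 and 2 are Y variables,
# variable 3 is held fixed as an X variable, variable 4 is ignored.
isz = [-1, -1, 1, 0]
y, x, excluded = partition_isz(isz)
print(len(y), len(x))  # 2 1
```

With this setting the counts that must be passed as NY and NX are 2 and 1 respectively.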
5:     R(LDR,M) – REAL (KIND=nag_wp) array (Input)
On entry: the variance-covariance or correlation matrix for the M variables as given by G02BXF. Only the upper triangle need be given.
Note:  the matrix must be a full rank variance-covariance or correlation matrix and so be positive definite. This condition is not directly checked by the routine.
6:     LDR – INTEGER (Input)
On entry: the first dimension of the array R as declared in the (sub)program from which G02BYF is called.
Constraint: ${\mathbf{LDR}}\ge {\mathbf{M}}$.
7:     P(LDP,NY) – REAL (KIND=nag_wp) array (Output)
On exit: the strict upper triangle of P contains the strict upper triangular part of the ${n}_{y}$ by ${n}_{y}$ partial correlation matrix. The lower triangle contains the lower triangle of the ${n}_{y}$ by ${n}_{y}$ partial variance-covariance matrix if the matrix given in R is a variance-covariance matrix. If the matrix given in R is a correlation matrix then the variance-covariance matrix is for standardized variables.
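Because P holds two matrices in one array, it usually has to be unpacked before use. The following Python sketch shows one way to do this under stated assumptions: `p[i][j]` stands for the Fortran element P(i+1,j+1), the lower triangle is taken to include the diagonal, the diagonal of the partial correlation matrix is filled with ones, and the values shown are illustrative rather than output from the routine.

```python
def split_p(p):
    """Unpack p into full symmetric (correlation, covariance) matrices.
    The strict upper triangle supplies the partial correlations; the
    lower triangle, including the diagonal, supplies the covariances."""
    ny = len(p)
    corr = [[1.0 if i == j else p[min(i, j)][max(i, j)] for j in range(ny)]
            for i in range(ny)]
    cov = [[p[max(i, j)][min(i, j)] for j in range(ny)] for i in range(ny)]
    return corr, cov

# Illustrative 2 by 2 content: partial correlation above the diagonal,
# partial variances and covariance on and below it.
p = [[0.75, 0.35 / 0.75],
     [0.35, 0.75]]
corr, cov = split_p(p)
print(corr)
print(cov)
```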
8:     LDP – INTEGER (Input)
On entry: the first dimension of the array P as declared in the (sub)program from which G02BYF is called.
Constraint: ${\mathbf{LDP}}\ge {\mathbf{NY}}$.
9:     WK(${\mathbf{NY}}\times {\mathbf{NX}}+{\mathbf{NX}}\times \left({\mathbf{NX}}+1\right)/2$) – REAL (KIND=nag_wp) array (Workspace)
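The workspace length is a simple function of NY and NX. As a small arithmetic check, the Python sketch below evaluates it; the function name `wk_length` is hypothetical. The term $\mathbf{NX}\times(\mathbf{NX}+1)/2$ is always an integer, since it is the number of elements in one triangle of an NX by NX matrix.

```python
def wk_length(ny, nx):
    """Minimum length of WK: NY*NX + NX*(NX+1)/2."""
    return ny * nx + nx * (nx + 1) // 2

print(wk_length(2, 1))  # 3
print(wk_length(3, 2))  # 9
```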
10:   IFAIL – INTEGER (Input/Output)
On entry: IFAIL must be set to $0$, $-1$ or $1$. If you are unfamiliar with this parameter you should refer to Section 3.3 in the Essential Introduction for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1$ or $1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this parameter, the recommended value is $0$. When the value $-\mathbf{1}$ or $\mathbf{1}$ is used it is essential to test the value of IFAIL on exit.
On exit: ${\mathbf{IFAIL}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).

## 6  Error Indicators and Warnings

If on entry ${\mathbf{IFAIL}}={\mathbf{0}}$ or $-{\mathbf{1}}$, explanatory error messages are output on the current error message unit (as defined by X04AAF).
Errors or warnings detected by the routine:
${\mathbf{IFAIL}}=1$
 On entry, ${\mathbf{M}}<3$, or ${\mathbf{NY}}<2$, or ${\mathbf{NX}}<1$, or ${\mathbf{NY}}+{\mathbf{NX}}>{\mathbf{M}}$, or ${\mathbf{LDR}}<{\mathbf{M}}$, or ${\mathbf{LDP}}<{\mathbf{NY}}$.
${\mathbf{IFAIL}}=2$
 On entry, there are not exactly NY elements of ${\mathbf{ISZ}}<0$, or there are not exactly NX elements of ${\mathbf{ISZ}}>0$.
${\mathbf{IFAIL}}=3$
On entry, the variance-covariance/correlation matrix of the $X$ variables, ${\Sigma }_{xx}$, is not of full rank. Try removing some of the $X$ variables by setting the appropriate element of ${\mathbf{ISZ}}=0$.
${\mathbf{IFAIL}}=4$
Either a diagonal element of the partial variance-covariance matrix, ${\Sigma }_{y\mid x}$, is zero and/or a computed partial correlation coefficient is greater than one. Both indicate that the matrix input in R was not positive definite.
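The conditions reported as ${\mathbf{IFAIL}}=1$ and ${\mathbf{IFAIL}}=2$ depend only on the integer arguments, so they can be checked before the call. The Python sketch below mirrors those two conditions as listed above; it is not the routine's own code, the function name `check_args` is hypothetical, and the order in which the checks are made is an assumption.

```python
def check_args(m, ny, nx, isz, ldr, ldp):
    """Return 0 if the integer arguments satisfy the documented
    constraints, otherwise the IFAIL value (1 or 2) that applies."""
    if m < 3 or ny < 2 or nx < 1 or ny + nx > m or ldr < m or ldp < ny:
        return 1
    n_neg = sum(1 for v in isz if v < 0)  # elements marking Y variables
    n_pos = sum(1 for v in isz if v > 0)  # elements marking X variables
    if n_neg != ny or n_pos != nx:
        return 2
    return 0

print(check_args(3, 2, 1, [-1, -1, 1], 3, 2))  # 0
print(check_args(3, 2, 1, [-1, 1, 1], 3, 2))   # 2
```

The conditions for ${\mathbf{IFAIL}}=3$ and $4$ depend on the numerical content of R and cannot be detected in this way.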

## 7  Accuracy

G02BYF computes the partial variance-covariance matrix, ${\Sigma }_{y\mid x}$, by computing the Cholesky factorization of ${\Sigma }_{xx}$. If ${\Sigma }_{xx}$ is not of full rank the computation will fail. For a statement on the accuracy of the Cholesky factorization see F07GDF (DPPTRF).
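To illustrate why a rank-deficient ${\Sigma }_{xx}$ causes failure, the following Python sketch performs a textbook Cholesky factorization and stops when a pivot is not positive. It is a simplified stand-in, not the packed-storage algorithm of F07GDF (DPPTRF); the function name `cholesky_lower` is hypothetical, and the exact test against zero is an assumption (a library routine working in floating-point arithmetic may behave differently for nearly singular input).

```python
import math

def cholesky_lower(a):
    """Return the lower triangular factor l with a = l l^T for a
    symmetric positive definite matrix a (list of lists)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            # A non-positive pivot means a is not positive definite.
            raise ValueError("matrix is not of full rank")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l

print(cholesky_lower([[4.0, 2.0], [2.0, 3.0]]))
```

Applying the function to a correlation matrix in which two $X$ variables are perfectly correlated, for example `[[1.0, 1.0], [1.0, 1.0]]`, raises the error, which corresponds to the situation reported by ${\mathbf{IFAIL}}=3$.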

## 8  Further Comments

Models that represent the linear associations given by partial correlations can be fitted using the multiple regression routine G02DAF.

## 9  Example

Data, given by Osborn (1979), on the number of deaths, smoke ($\mathrm{mg}/{\mathrm{m}}^{3}$) and sulphur dioxide (parts/million) during an intense period of fog is input. The correlations are computed using G02BXF and the partial correlation between deaths and smoke given sulphur dioxide is computed using G02BYF. Both correlation matrices are printed using the routine X04CAF.

### 9.1  Program Text

Program Text (g02byfe.f90)

### 9.2  Program Data

Program Data (g02byfe.d)

### 9.3  Program Results

Program Results (g02byfe.r)