NAG Library Routine Document
F04QAF
1 Purpose
F04QAF solves sparse nonsymmetric equations, sparse linear least squares problems and sparse damped linear least squares problems, using a Lanczos algorithm.
2 Specification
SUBROUTINE F04QAF ( 
M, N, B, X, SE, APROD, DAMP, ATOL, BTOL, CONLIM, ITNLIM, MSGLVL, ITN, ANORM, ACOND, RNORM, ARNORM, XNORM, WORK, RUSER, LRUSER, IUSER, LIUSER, INFORM, IFAIL) 
INTEGER 
M, N, ITNLIM, MSGLVL, ITN, LRUSER, IUSER(LIUSER), LIUSER, INFORM, IFAIL 
REAL (KIND=nag_wp) 
B(M), X(N), SE(N), DAMP, ATOL, BTOL, CONLIM, ANORM, ACOND, RNORM, ARNORM, XNORM, WORK(N,2), RUSER(LRUSER) 
EXTERNAL 
APROD 

3 Description
F04QAF can be used to solve a system of linear equations
where
$A$ is an
$n$ by
$n$ sparse nonsymmetric matrix, or can be used to solve linear least squares problems, so that F04QAF minimizes the value
$\rho $ given by
where
$A$ is an
$m$ by
$n$ sparse matrix and
$\Vert r\Vert $ denotes the Euclidean length of
$r$ so that
${\Vert r\Vert}^{2}={r}^{\mathrm{T}}r$. A damping parameter,
$\lambda $, may be included in the least squares problem in which case F04QAF minimizes the value
$\rho $ given by
$\lambda $ is supplied as the parameter
DAMP and should of course be zero if the solution to problems
(1) or
(2) is required. Minimizing
$\rho $ in
(3) is often called ridge regression.
F04QAF is based upon algorithm LSQR (see
Paige and Saunders (1982a) and
Paige and Saunders (1982b)) and solves the problems by an algorithm based upon the Lanczos process. The routine does not require
$A$ explicitly, but
$A$ is specified via
APROD which must perform the operations
$\left(y+Ax\right)$ and
$\left(x+{A}^{\mathrm{T}}y\right)$ for a given
$n$element vector
$x$ and
$m$ element vector
$y$. A parameter to
APROD specifies which of the two operations is required on a given entry.
The routine also returns estimates of the standard errors of the sample regression coefficients (
${x}_{\mathit{i}}$, for
$\mathit{i}=1,2,\dots ,n$) given by the diagonal elements of the estimated variancecovariance matrix
$V$. When problem
(2) is being solved and
$A$ is of full rank, then
$V$ is given by
and when problem
(3) is being solved then
$V$ is given by
Let
$\stackrel{}{A}$ denote the matrix
let
$\stackrel{}{r}$ denote the residual vector
corresponding to an iterate
$x$, so that
$\rho =\Vert \stackrel{}{r}\Vert $ is the function being minimized, and let
$\Vert A\Vert $ denote the Frobenius (Euclidean) norm of
$A$. Then the routine accepts
$x$ as a solution if it is estimated that one of the following two conditions is satisfied:
where
${\mathit{tol}}_{1}$ and
${\mathit{tol}}_{2}$ are usersupplied tolerances which estimate the relative errors in
$A$ and
$b$ respectively. Condition
(6) is appropriate for compatible problems where, in theory, we expect the residual to be zero and will be satisfied by an acceptable solution
$x$ to a compatible problem. Condition
(7) is appropriate for incompatible systems where we do not expect the residual to be zero and is based on the observation that, in theory,
when
$x$ is a solution to the least squares problem, and so
(7) will be satisfied by an acceptable solution
$x$ to a linear least squares problem.
The routine also includes a test to prevent convergence to solutions,
$x$, with unacceptably large elements. This can happen if
$A$ is nearly singular or is nearly rank deficient. If we let the singular values of
$\stackrel{}{A}$ be
then the condition number of
$\stackrel{}{A}$ is defined as
where
${\sigma}_{k}$ is the smallest nonzero singular value of
$\stackrel{}{A}$ and hence
$k$ is the rank of
$\stackrel{}{A}$. When
$k<n$, then
$\stackrel{}{A}$ is rank deficient, the least squares solution is not unique and F04QAF will normally converge to the minimal length solution. In practice
$\stackrel{}{A}$ will not have exactly zero singular values, but may instead have small singular values that we wish to regard as zero.
The routine provides for this possibility by terminating if
where
${c}_{\mathrm{lim}}$ is a usersupplied limit on the condition number of
$\stackrel{}{A}$. For problem
(1) termination with this condition indicates that
$A$ is nearly singular and for problem
(2) indicates that
$A$ is nearly rank deficient and so has near linear dependencies in its columns. In this case inspection of
$\Vert r\Vert $,
$\Vert {A}^{\mathrm{T}}r\Vert $ and
$\Vert x\Vert $, which are all returned by the routine, will indicate whether or not an acceptable solution has been found. Condition
(8), perhaps in conjunction with
$\lambda \ne 0$, can be used to try and ‘regularize’ least squares solutions. A full discussion of the stopping criteria is given in Section 6 of
Paige and Saunders (1982a).
Introduction of a nonzero damping parameter
$\lambda $ tends to reduce the size of the computed solution and to make its components less sensitive to changes in the data, and F04QAF is applicable when a value of
$\lambda $ is known
a priori. To have an effect,
$\lambda $ should normally be at least
$\sqrt{\epsilon}\Vert A\Vert $ where
$\epsilon $ is the
machine precision. For further discussion see
Paige and Saunders (1982b) and the references given there.
Whenever possible the matrix $A$ should be scaled so that the relative errors in the elements of $A$ are all of comparable size. Such a scaling helps to prevent the least squares problem from being unnecessarily sensitive to data errors and will normally reduce the number of iterations required. At the very least, in the absence of better information, the columns of $A$ should be scaled to have roughly equal column length.
4 References
Paige C C and Saunders M A (1982a) LSQR: An algorithm for sparse linear equations and sparse leastsquares ACM Trans. Math. Software 8 43–71
Paige C C and Saunders M A (1982b) Algorithm 583 LSQR: Sparse linear equations and leastsquares problems ACM Trans. Math. Software 8 195–209
5 Parameters
 1: M – INTEGERInput
On entry: $m$, the number of rows of the matrix $A$.
Constraint:
${\mathbf{M}}\ge 1$.
 2: N – INTEGERInput
On entry: $n$, the number of columns of the matrix $A$.
Constraint:
${\mathbf{N}}\ge 1$.
 3: B(M) – REAL (KIND=nag_wp) arrayInput/Output
On entry: the righthand side vector $b$.
On exit:
B is overwritten.
 4: X(N) – REAL (KIND=nag_wp) arrayOutput
On exit: the solution vector $x$.
 5: SE(N) – REAL (KIND=nag_wp) arrayOutput
On exit: the estimates of the standard errors of the components of
$x$. Thus
${\mathbf{SE}}\left(i\right)$ contains an estimate of
$\sqrt{{\nu}_{ii}}$, where
${\nu}_{ii}$ is the
$i$th diagonal element of the estimated variancecovariance matrix
$V$. The estimates returned in
SE will be the lower bounds on the actual estimated standard errors, but will usually be correct to at least one significant figure.
 6: APROD – SUBROUTINE, supplied by the user.External Procedure
APROD must perform the operations
$y:=y+Ax$ and
$x:=x+{A}^{\mathrm{T}}y$ for given vectors
$x$ and
$y$.
The specification of
APROD is:
INTEGER 
MODE, M, N, LRUSER, IUSER(LIUSER), LIUSER 
REAL (KIND=nag_wp) 
X(N), Y(M), RUSER(LRUSER) 

 1: MODE – INTEGERInput/Output
On entry: specifies which operation is to be performed.
 ${\mathbf{MODE}}=1$
 APROD must compute $y+Ax$.
 ${\mathbf{MODE}}=2$
 APROD must compute $x+{A}^{\mathrm{T}}y$.
On exit: may be used as a flag to indicate a failure in the computation of
$y+Ax$ or
$x+{A}^{\mathrm{T}}y$. If
MODE is negative on exit from
APROD, F04QAF will exit immediately with
IFAIL set to
MODE.
 2: M – INTEGERInput
On entry: $m$, the number of rows of $A$.
 3: N – INTEGERInput
On entry: $n$, the number of columns of $A$.
 4: X(N) – REAL (KIND=nag_wp) arrayInput/Output
On entry: the vector $x$.
On exit: if
${\mathbf{MODE}}=1$,
X must be unchanged.
If
${\mathbf{MODE}}=2$,
X must contain
$x+{A}^{\mathrm{T}}y$.
 5: Y(M) – REAL (KIND=nag_wp) arrayInput/Output
On entry: the vector $y$.
On exit: if
${\mathbf{MODE}}=1$,
Y must contain
$y+Ax$.
If
${\mathbf{MODE}}=2$,
Y must be unchanged.
 6: RUSER(LRUSER) – REAL (KIND=nag_wp) arrayUser Workspace
 7: LRUSER – INTEGERInput
 8: IUSER(LIUSER) – INTEGER arrayUser Workspace
 9: LIUSER – INTEGERInput

APROD is called with the parameters
RUSER,
LRUSER,
IUSER and
LIUSER as supplied to F04QAF. You are free to use the arrays
RUSER,
LRUSER,
IUSER and
LIUSER to supply information to
APROD as an alternative to using COMMON global variables.
APROD must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which F04QAF is called. Parameters denoted as
Input must
not be changed by this procedure.
 7: DAMP – REAL (KIND=nag_wp)Input
On entry: the value
$\lambda $. If either problem
(1) or problem
(2) is to be solved, then
DAMP must be supplied as zero.
 8: ATOL – REAL (KIND=nag_wp)Input
On entry: the tolerance,
${\mathit{tol}}_{1}$, of the convergence criteria
(6) and
(7); it should be an estimate of the largest relative error in the elements of
$A$. For example, if the elements of
$A$ are correct to about
$4$ significant figures, then
ATOL should be set to about
$5\times {10}^{4}$. If
ATOL is supplied as less than
$\epsilon $, where
$\epsilon $ is the
machine precision, then the value
$\epsilon $ is used instead of
ATOL.
 9: BTOL – REAL (KIND=nag_wp)Input
On entry: the tolerance,
${\mathit{tol}}_{2}$, of the convergence criterion
(6); it should be an estimate of the largest relative error in the elements of
$B$. For example, if the elements of
$B$ are correct to about
$4$ significant figures, then
BTOL should be set to about
$5\times {10}^{4}$. If
BTOL is supplied as less than
$\epsilon $, then the value
$\epsilon $ is used instead of
BTOL.
 10: CONLIM – REAL (KIND=nag_wp)Input
On entry: the value
${c}_{\mathrm{lim}}$ of equation
(8); it should be an upper limit on the condition number of
$\stackrel{}{A}$.
CONLIM should not normally be chosen much larger than
$1.0/{\mathbf{ATOL}}$. If
CONLIM is supplied as zero, then the value
$1.0/\epsilon $ is used instead of
CONLIM.
 11: ITNLIM – INTEGERInput/Output
On entry: an upper limit on the number of iterations. If
${\mathbf{ITNLIM}}\le 0$, then the value
N is used in place of
ITNLIM, but for illconditioned problems a higher value of
ITNLIM is likely to be necessary.
On exit: unchanged unless
${\mathbf{ITNLIM}}\le 0$ on entry, in which case it is set to
N.
 12: MSGLVL – INTEGERInput
On entry: the level of printing from F04QAF. If
${\mathbf{MSGLVL}}\le 0$, then no printing occurs, but otherwise messages will be output on the advisory message channel (see
X04ABF). A description of the printed output is given in
Section 8.1. The level of printing is determined as follows:
 ${\mathbf{MSGLVL}}\le 0$
 No printing.
 ${\mathbf{MSGLVL}}=1$
 A brief summary is printed just prior to return from F04QAF.
 ${\mathbf{MSGLVL}}\ge 2$
 A summary line is printed periodically to monitor the progress of F04QAF, together with a brief summary just prior to return from F04QAF.
 13: ITN – INTEGEROutput
On exit: the number of iterations performed.
 14: ANORM – REAL (KIND=nag_wp)Output
On exit: an estimate of
$\Vert \stackrel{}{A}\Vert $ for the matrix
$\stackrel{}{A}$ of
(4).
 15: ACOND – REAL (KIND=nag_wp)Output
On exit: an estimate of $\mathrm{cond}\left(\stackrel{}{A}\right)$ which is a lower bound.
 16: RNORM – REAL (KIND=nag_wp)Output
On exit: an estimate of
$\Vert \stackrel{}{r}\Vert $ for the residual,
$\stackrel{}{r}$, of
(5) corresponding to the solution
$x$ returned in
X. Note that
$\Vert \stackrel{}{r}\Vert $ is the function being minimized.
 17: ARNORM – REAL (KIND=nag_wp)Output
On exit: an estimate of the
$\Vert {\stackrel{}{A}}^{\mathrm{T}}\stackrel{}{r}\Vert $ corresponding to the solution
$x$ returned in
X.
 18: XNORM – REAL (KIND=nag_wp)Output
On exit: an estimate of
$\Vert x\Vert $ for the solution
$x$ returned in
X.
 19: WORK(N,$2$) – REAL (KIND=nag_wp) arrayWorkspace
 20: RUSER(LRUSER) – REAL (KIND=nag_wp) arrayUser Workspace

RUSER is not used by F04QAF, but is passed directly to
APROD and may be used to pass information to this routine as an alternative to using COMMON global variables.
 21: LRUSER – INTEGERInput
On entry: the dimension of the array
RUSER as declared in the (sub)program from which F04QAF is called.
Constraint:
${\mathbf{LRUSER}}\ge 1$.
 22: IUSER(LIUSER) – INTEGER arrayUser Workspace

IUSER is not used by F04QAF, but is passed directly to
APROD and may be used to pass information to this routine as an alternative to using COMMON global variables.
 23: LIUSER – INTEGERInput
On entry: the dimension of the array
IUSER as declared in the (sub)program from which F04QAF is called.
Constraint:
${\mathbf{LIUSER}}\ge 1$.
 24: INFORM – INTEGEROutput
On exit: the reason for termination of F04QAF.
 ${\mathbf{INFORM}}=0$
 The exact solution is $x=0$. No iterations are performed in this case.
 ${\mathbf{INFORM}}=1$
 The termination criterion of (6) has been satisfied with ${\mathit{tol}}_{1}$ and ${\mathit{tol}}_{2}$ as the values supplied in ATOL and BTOL respectively.
 ${\mathbf{INFORM}}=2$
 The termination criterion of (7) has been satisfied with ${\mathit{tol}}_{1}$ as the value supplied in ATOL.
 ${\mathbf{INFORM}}=3$
 The termination criterion of (6) has been satisfied with ${\mathit{tol}}_{1}$ and/or ${\mathit{tol}}_{2}$ as the value $\epsilon $, where $\epsilon $ is the machine precision. One or both of the values supplied in ATOL and BTOL must have been less than $\epsilon $ and was too small for this machine.
 ${\mathbf{INFORM}}=4$
 The termination criterion of (7) has been satisfied with ${\mathit{tol}}_{1}$ as the value $\epsilon $. The value supplied in ATOL must have been less than $\epsilon $ and was too small for this machine.
The values
${\mathbf{INFORM}}=5$,
$6$ and
$7$ correspond to failure with
${\mathbf{IFAIL}}={\mathbf{2}}$,
${\mathbf{3}}$ and
${\mathbf{4}}$ respectively (see
Section 6) and when
IFAIL is negative
INFORM will be set to the same negative value.
 25: IFAIL – INTEGERInput/Output

On entry:
IFAIL must be set to
$0$,
$1\text{ or}1$. If you are unfamiliar with this parameter you should refer to
Section 3.3 in the Essential Introduction for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value
$1\text{ or}1$ is recommended. If the output of error messages is undesirable, then the value
$1$ is recommended. Otherwise, if you are not familiar with this parameter, the recommended value is
$0$.
When the value $\mathbf{1}\text{ or}\mathbf{1}$ is used it is essential to test the value of IFAIL on exit.
On exit:
${\mathbf{IFAIL}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see
Section 6).
6 Error Indicators and Warnings
Errors detected by the routine:
 ${\mathbf{IFAIL}}<0$
A negative value of
IFAIL indicates an exit from F04QAF because you have set
MODE negative in
APROD. The value of
IFAIL will be the same as your setting of
MODE.
 ${\mathbf{IFAIL}}=1$
On entry,  ${\mathbf{M}}<1$, 
or  ${\mathbf{N}}<1$, 
or  ${\mathbf{LRUSER}}<1$, 
or  ${\mathbf{LIUSER}}<1$. 
 ${\mathbf{IFAIL}}=2$
The condition of
(8) has been satisfied for the value of
${c}_{\mathrm{lim}}$ supplied in
CONLIM. If this failure is unexpected you should check that
APROD is working correctly. Although conditions
(6) or
(7) have not been satisfied, the values returned in
RNORM,
ARNORM and
XNORM may nevertheless indicate that an acceptable solution has been reached.
 ${\mathbf{IFAIL}}=3$
The condition of
(8) has been satisfied for the value
${c}_{\mathrm{lim}}=1.0/\epsilon $, where
$\epsilon $ is the
machine precision. The matrix
$\stackrel{}{A}$ is nearly singular or rank deficient and the problem is too illconditioned for this machine. If this failure is unexpected, you should check that
APROD is working correctly.
 ${\mathbf{IFAIL}}=4$
The limit on the number of iterations has been reached. The number of iterations required by F04QAF and the condition of the matrix $\stackrel{}{A}$ can depend strongly on the scaling of the problem. Poor scaling of the rows and columns of $A$ should be avoided whenever possible.
7 Accuracy
When the problem is compatible, the computed solution
$x$ will satisfy the equation
where an estimate of
$\Vert r\Vert $ is returned in the parameter
RNORM. When the problem is incompatible, the computed solution
$x$ will satisfy the equation
where an estimate of
$\Vert e\Vert $ is returned in the parameter
ARNORM. See also Section 6.2 of
Paige and Saunders (1982b).
The time taken by F04QAF is likely to be principally determined by the time taken in
APROD, which is called twice on each iteration, once with
${\mathbf{MODE}}=1$ and once with
${\mathbf{MODE}}=2$. The time taken per iteration by the remaining operations in F04QAF is approximately proportional to
$\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(m,n\right)$.
The Lanczos process will usually converge more quickly if
$A$ is preconditioned by a nonsingular matrix
$M$ that approximates
$A$ in some sense and is also chosen so that equations of the form
$My=c$ can efficiently be solved for
$y$. For a discussion of preconditioning, see the
F11 Chapter Introduction. In the context of F04QAF, problem
(1) is equivalent to
and problem
(2) is equivalent to minimizing
Note that the normal matrix
${\left(A{M}^{1}\right)}^{\mathrm{T}}\left(A{M}^{1}\right)={M}^{\mathrm{T}}\left({A}^{\mathrm{T}}A\right){M}^{1}$ so that the preconditioning
$A{M}^{1}$ is equivalent to the preconditioning
${M}^{\mathrm{T}}\left({A}^{\mathrm{T}}A\right){M}^{1}$ of the normal matrix
${A}^{\mathrm{T}}A$.
Preconditioning can be incorporated into F04QAF simply by coding
APROD to compute
$y+A{M}^{1}x$ and
$x+{M}^{\mathrm{T}}{A}^{\mathrm{T}}y$ in place of
$y+Ax$ and
$x+{A}^{\mathrm{T}}y$ respectively, and then solving the equations
$Mx=y$ for
$x$ on return from F04QAF. The quantity
$y+A{M}^{1}x$ should be computed by solving
$Mz=x$ for
$z$ and then computing
$y+Az$, and
$x+{M}^{\mathrm{T}}{A}^{\mathrm{T}}y$ should be computed by solving
${M}^{\mathrm{T}}z={A}^{\mathrm{T}}y$ for
$z$ and then forming
$x+z$.
8.1 Description of the Printed Output
When
${\mathbf{MSGLVL}}>0$, then F04QAF will produce output (except in the case where the routine fails with
${\mathbf{IFAIL}}={\mathbf{1}}$) on the advisory message channel (see
X04ABF).
When
${\mathbf{MSGLVL}}\ge 2$ then a summary line is printed periodically giving the following information:
Output 
Meaning 
ITN 
Iteration number, $k$. 
X(1) 
The first element of the current iterate ${x}_{k}$. 
FUNCTION 
The current value of the function, $\rho $, being minimized. 
COMPAT 
An estimate of $\Vert {\stackrel{}{r}}_{k}\Vert /\Vert b\Vert $, where ${\stackrel{}{r}}_{k}$ is the residual corresponding to ${x}_{k}$. This value should converge to zero (in theory) if and only if the problem is compatible. COMPAT decreases monotonically. 
INCOMPAT 
An estimate of $\Vert {\stackrel{}{A}}^{\mathrm{T}}{\stackrel{}{r}}_{k}\Vert /\left(\Vert \stackrel{}{A}\Vert \Vert {\stackrel{}{r}}_{k}\Vert \right)$ which should converge to zero if and only if at the solution $\rho $ is nonzero. INCOMPAT is not usually monotonic. 
NRM(ABAR) 
A monotonically increasing estimate of $\Vert \stackrel{}{A}\Vert $. 
COND(ABAR) 
A monotonically increasing estimate of the condition number $\mathrm{cond}\left(\stackrel{}{A}\right)$. 
9 Example
This example solves the linear least squares problem
where
$A$ is the
$13$ by
$12$ matrix and
$b$ is the
$13$ element vector given by
with
$h=0.1$.
Such a problem can arise by considering the Neumann problem on a rectangle
where
$C$ is the boundary of the rectangle, and discretizing as illustrated below with the square mesh
Figure 1
The $12$ by $12$ symmetric part of $A$ represents the difference equations and the final row comes from the normalizing condition. The example program has $g\left(x,y\right)=1$ at all the internal mesh points, but apart from this is written in a general manner so that the number of rows (NROWS) and columns (NCOLS) in the grid can readily be altered.
9.1 Program Text
Program Text (f04qafe.f90)
9.2 Program Data
Program Data (f04qafe.d)
9.3 Program Results
Program Results (f04qafe.r)