# NAG Toolbox: nag_stat_prob_hypergeom_vector (g01sl)

## Purpose

nag_stat_prob_hypergeom_vector (g01sl) returns a number of the lower tail, upper tail and point probabilities for the hypergeometric distribution.

## Syntax

[plek, pgtk, peqk, ivalid, ifail] = g01sl(n, l, m, k, 'ln', ln, 'll', ll, 'lm', lm, 'lk', lk)
[plek, pgtk, peqk, ivalid, ifail] = nag_stat_prob_hypergeom_vector(n, l, m, k, 'ln', ln, 'll', ll, 'lm', lm, 'lk', lk)

## Description

Let $X=\{X_i : i=1,2,\dots,r\}$ denote a vector of random variables having a hypergeometric distribution with parameters $n_i$, $l_i$ and $m_i$. Then
$\mathrm{Prob}\{X_i=k_i\}=\dfrac{\binom{m_i}{k_i}\binom{n_i-m_i}{l_i-k_i}}{\binom{n_i}{l_i}},$
where $\max(0,l_i+m_i-n_i)\le k_i\le \min(l_i,m_i)$, $0\le l_i\le n_i$ and $0\le m_i\le n_i$.
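
As a worked instance of the formula above, the following sketch evaluates one point probability directly from binomial coefficients in plain MATLAB. It is only an independent cross-check and not the algorithm used by g01sl; the helper name `hypergeom_point` is hypothetical, and `nchoosek` in double precision is suitable for small arguments only.

```matlab
% Point probability Prob{X = k} from the binomial-coefficient formula.
% hypergeom_point is a hypothetical helper, not part of the NAG Toolbox.
hypergeom_point = @(n, l, m, k) ...
    nchoosek(m, k) * nchoosek(n - m, l - k) / nchoosek(n, l);

% First case of the Example section: n = 10, l = 2, m = 5, k = 1.
p = hypergeom_point(10, 2, 5, 1);   % C(5,1)*C(5,1)/C(10,2) = 25/45
fprintf('%.5f\n', p);               % prints 0.55556
```

The value agrees with the PEQK entry reported for the first row of the Example section.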
The hypergeometric distribution may arise if in a population of size $n_i$ a number $m_i$ are marked. From this population a sample of size $l_i$ is drawn and of these $k_i$ are observed to be marked.
The mean of the distribution is $\frac{l_i m_i}{n_i}$, and the variance is $\frac{l_i m_i (n_i-l_i)(n_i-m_i)}{n_i^2 (n_i-1)}$.
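
The mean and variance formulas can be evaluated element-wise, which is useful for judging in advance whether the variance limit of $10^6$ (see ivalid) might be hit. The sketch below uses the parameter values from the Example section, converted to double so that the divisions are not integer divisions; the variable names `mu` and `sigma2` are illustrative.

```matlab
% Mean and variance of the hypergeometric distribution for the four
% parameter sets used in the Example section (stored as double here).
n = [10; 40; 155; 1000];
l = [ 2; 10;  35;  444];
m = [ 5;  3; 122;  500];

mu     = l .* m ./ n;                                        % l*m/n
sigma2 = l .* m .* (n - l) .* (n - m) ./ (n.^2 .* (n - 1));  % variance

for i = 1:numel(n)
    fprintf('n=%4d  mean=%9.4f  variance=%9.4f\n', n(i), mu(i), sigma2(i));
end
```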
nag_stat_prob_hypergeom_vector (g01sl) computes, for given $n_i$, $l_i$, $m_i$ and $k_i$, the probabilities $\mathrm{Prob}\{X_i\le k_i\}$, $\mathrm{Prob}\{X_i>k_i\}$ and $\mathrm{Prob}\{X_i=k_i\}$ using an algorithm similar to that described in Knüsel (1986) for the Poisson distribution.
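
To make the relationship between the three probabilities concrete, the sketch below obtains them by naively summing point probabilities in log space. This is deliberately not the Knüsel-style algorithm referred to above, only a simple cross-check that works for moderate parameter values; `logC` and `pmf` are hypothetical helpers.

```matlab
% Naive cross-check of the three probabilities by direct summation.
% logC(a, b) is the log of the binomial coefficient C(a, b).
logC = @(a, b) gammaln(a + 1) - gammaln(b + 1) - gammaln(a - b + 1);
pmf  = @(n, l, m, x) exp(logC(m, x) + logC(n - m, l - x) - logC(n, l));

n = 40; l = 10; m = 3; k = 2;       % second case of the Example section
lo = max(0, l + m - n);             % smallest attainable value of X
hi = min(l, m);                     % largest attainable value of X

plek = sum(pmf(n, l, m, lo:k));     % Prob{X <= k}
pgtk = sum(pmf(n, l, m, k+1:hi));   % Prob{X >  k}
peqk = pmf(n, l, m, k);             % Prob{X =  k}
fprintf('%10.5f%10.5f%10.5f\n', plek, pgtk, peqk);
```

The three values should agree, to the five decimals printed, with the second row of results in the Example section, and `plek + pgtk` should equal 1 up to rounding error.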
The input arrays to this function are designed to allow maximum flexibility in the supply of vector parameters by re-using elements of any arrays that are shorter than the total number of evaluations required. See Section [Vectorized s] in the G01 Chapter Introduction for further information.
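
The re-use rule can be illustrated without calling the function. In the sketch below, which uses made-up input values and a hypothetical helper `pick`, each array is indexed with $j=((i-1) \bmod \text{length})+1$, so shorter arrays are cycled through until the longest array is exhausted.

```matlab
% Illustration of the element re-use rule j = mod(i-1, len) + 1.
% The array values are made up for this illustration.
n = [10; 40];          % ln = 2
l = 2;                 % ll = 1, re-used for every evaluation
m = [5; 3];            % lm = 2
k = [0; 1; 2; 1];      % lk = 4

r = max([numel(n), numel(l), numel(m), numel(k)]);   % number of evaluations
pick = @(v, i) v(mod(i - 1, numel(v)) + 1);          % cyclic indexing

for i = 1:r
    fprintf('i=%d: n=%d l=%d m=%d k=%d\n', i, ...
            pick(n, i), pick(l, i), pick(m, i), pick(k, i));
end
```

Here four evaluations are performed: the third re-uses `n(1)` and `m(1)`, and every evaluation re-uses the single element of `l`.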

## References

Knüsel L (1986) Computation of the chi-square and Poisson distribution SIAM J. Sci. Statist. Comput. 7 1022–1036

## Parameters

### Compulsory Input Parameters

1:     n(ln) – int64, int32 or nag_int array
ln, the dimension of the array, must satisfy the constraint $\mathbf{ln}>0$.
$n_i$, the parameter of the hypergeometric distribution with $n_i=\mathbf{n}(j)$, $j=((i-1) \bmod \mathbf{ln})+1$, for $i=1,2,\dots,\max(\mathbf{ln},\mathbf{ll},\mathbf{lm},\mathbf{lk})$.
Constraint: $\mathbf{n}(j)\ge 0$, for $j=1,2,\dots,\mathbf{ln}$.
2:     l(ll) – int64, int32 or nag_int array
ll, the dimension of the array, must satisfy the constraint $\mathbf{ll}>0$.
$l_i$, the parameter of the hypergeometric distribution with $l_i=\mathbf{l}(j)$, $j=((i-1) \bmod \mathbf{ll})+1$.
Constraint: $0\le l_i\le n_i$.
3:     m(lm) – int64, int32 or nag_int array
lm, the dimension of the array, must satisfy the constraint $\mathbf{lm}>0$.
$m_i$, the parameter of the hypergeometric distribution with $m_i=\mathbf{m}(j)$, $j=((i-1) \bmod \mathbf{lm})+1$.
Constraint: $0\le m_i\le n_i$.
4:     k(lk) – int64, int32 or nag_int array
lk, the dimension of the array, must satisfy the constraint $\mathbf{lk}>0$.
$k_i$, the integer which defines the required probabilities with $k_i=\mathbf{k}(j)$, $j=((i-1) \bmod \mathbf{lk})+1$.
Constraint: $\max(0,l_i+m_i-n_i)\le k_i\le \min(l_i,m_i)$.

### Optional Input Parameters

1:     ln – int64, int32 or nag_int scalar
Default: The dimension of the array n.
The length of the array n.
Constraint: $\mathbf{ln}>0$.
2:     ll – int64, int32 or nag_int scalar
Default: The dimension of the array l.
The length of the array l.
Constraint: $\mathbf{ll}>0$.
3:     lm – int64, int32 or nag_int scalar
Default: The dimension of the array m.
The length of the array m.
Constraint: $\mathbf{lm}>0$.
4:     lk – int64, int32 or nag_int scalar
Default: The dimension of the array k.
The length of the array k.
Constraint: $\mathbf{lk}>0$.


### Output Parameters

1:     plek(:) – double array
Note: the dimension of the array plek must be at least $\max(\mathbf{ln},\mathbf{ll},\mathbf{lm},\mathbf{lk})$.
$\mathrm{Prob}\{X_i\le k_i\}$, the lower tail probabilities.
2:     pgtk(:) – double array
Note: the dimension of the array pgtk must be at least $\max(\mathbf{ln},\mathbf{ll},\mathbf{lm},\mathbf{lk})$.
$\mathrm{Prob}\{X_i>k_i\}$, the upper tail probabilities.
3:     peqk(:) – double array
Note: the dimension of the array peqk must be at least $\max(\mathbf{ln},\mathbf{ll},\mathbf{lm},\mathbf{lk})$.
$\mathrm{Prob}\{X_i=k_i\}$, the point probabilities.
4:     ivalid(:) – int64, int32 or nag_int array
Note: the dimension of the array ivalid must be at least $\max(\mathbf{ln},\mathbf{ll},\mathbf{lm},\mathbf{lk})$.
$\mathbf{ivalid}(i)$ indicates any errors with the input arguments, with
$\mathbf{ivalid}(i)=0$
No error.
$\mathbf{ivalid}(i)=1$
On entry, $n_i<0$.
$\mathbf{ivalid}(i)=2$
On entry, $l_i<0$, or $l_i>n_i$.
$\mathbf{ivalid}(i)=3$
On entry, $m_i<0$, or $m_i>n_i$.
$\mathbf{ivalid}(i)=4$
On entry, $k_i<0$, or $k_i>l_i$, or $k_i>m_i$, or $k_i<l_i+m_i-n_i$.
$\mathbf{ivalid}(i)=5$
On entry, $n_i$ is too large to be represented exactly as a real number.
$\mathbf{ivalid}(i)=6$
On entry, the variance (see Section [Description]) exceeds $10^6$.
5:     ifail – int64, int32 or nag_int scalar
$\mathbf{ifail}=0$ unless the function detects an error (see [Error Indicators and Warnings]).
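
The ivalid codes listed above can be mimicked with a simple pre-check, which may help when screening inputs before a call. The sketch below is an assumption-laden illustration rather than the function's own logic: the order in which the checks are applied is assumed to follow the numbering of the codes, code 5 is not reproduced, and the input values are made up so that each row triggers a different code.

```matlab
% Per-element input checks mirroring ivalid codes 0-4 and 6.
% Assumption: checks are applied in the order of the documented codes.
n = [10; -1; 10; 10; 10];
l = [ 2;  2; 11;  2;  2];
m = [ 5;  5;  5; 12;  5];
k = [ 1;  1;  1;  1;  3];

ivalid = zeros(size(n));
for i = 1:numel(n)
    if n(i) < 0
        ivalid(i) = 1;                       % n negative
    elseif l(i) < 0 || l(i) > n(i)
        ivalid(i) = 2;                       % l outside [0, n]
    elseif m(i) < 0 || m(i) > n(i)
        ivalid(i) = 3;                       % m outside [0, n]
    elseif k(i) < 0 || k(i) > l(i) || k(i) > m(i) || ...
           k(i) < l(i) + m(i) - n(i)
        ivalid(i) = 4;                       % k outside its valid range
    elseif l(i)*m(i)*(n(i)-l(i))*(n(i)-m(i)) / (n(i)^2*(n(i)-1)) > 1e6
        ivalid(i) = 6;                       % variance exceeds 10^6
    end
end
disp(ivalid.');
```

With these inputs the sketch yields the codes 0, 1, 2, 3 and 4 for the five rows respectively.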

## Error Indicators and Warnings

Errors or warnings detected by the function:

Cases prefixed with W are classified as warnings and do not generate an error of type NAG:error_n. See nag_issue_warnings.

W $\mathbf{ifail}=1$
On entry, at least one value of n, l, m or k was invalid, or the variance was too large.
$\mathbf{ifail}=2$
Constraint: $\mathbf{ln}>0$.
$\mathbf{ifail}=3$
Constraint: $\mathbf{ll}>0$.
$\mathbf{ifail}=4$
Constraint: $\mathbf{lm}>0$.
$\mathbf{ifail}=5$
Constraint: $\mathbf{lk}>0$.
$\mathbf{ifail}=-999$
Dynamic memory allocation failed.

## Accuracy

Results are correct to a relative accuracy of at least $10^{-6}$ on machines with a precision of $9$ or more decimal digits (provided that the results do not underflow to zero).

The time taken by nag_stat_prob_hypergeom_vector (g01sl) to calculate each probability depends on the variance (see Section [Description]) and on $k_i$. For a given variance, the time is greatest when $k_i\approx l_i m_i/n_i$ (the mean), and is then approximately proportional to the square root of the variance.

## Example

function nag_stat_prob_hypergeom_vector_example
% Four independent evaluations, one per array element.
n = [int64(10); 40; 155; 1000];
l = [int64(2); 10; 35; 444];
m = [int64(5); 3; 122; 500];
k = [int64(1); 2; 22; 220];
[plek, pgtk, peqk, ivalid, ifail] = nag_stat_prob_hypergeom_vector(n, l, m, k);

fprintf('\n   N   L   M   K     PLEK      PGTK      PEQK\n');
ln = numel(n);
ll = numel(l);
lm = numel(m);
lk = numel(k);
len = max([ln, ll, lm, lk]);
% Inputs are re-used cyclically, so index each array modulo its length.
% The final column printed is ivalid.
for i = 0:len-1
    fprintf('%4d%4d%4d%4d%10.5f%10.5f%10.5f%3d\n', ...
            n(mod(i,ln)+1), l(mod(i,ll)+1), m(mod(i,lm)+1), k(mod(i,lk)+1), ...
            plek(i+1), pgtk(i+1), peqk(i+1), ivalid(i+1));
end

   N   L   M   K     PLEK      PGTK      PEQK
  10   2   5   1   0.77778   0.22222   0.55556  0
  40  10   3   2   0.98785   0.01215   0.13664  0
 155  35 122  22   0.01101   0.98899   0.00779  0
1000 444 500 220   0.42429   0.57571   0.04913  0

function g01sl_example
% Same example using the short function name g01sl.
n = [int64(10); 40; 155; 1000];
l = [int64(2); 10; 35; 444];
m = [int64(5); 3; 122; 500];
k = [int64(1); 2; 22; 220];
[plek, pgtk, peqk, ivalid, ifail] = g01sl(n, l, m, k);

fprintf('\n   N   L   M   K     PLEK      PGTK      PEQK\n');
ln = numel(n);
ll = numel(l);
lm = numel(m);
lk = numel(k);
len = max([ln, ll, lm, lk]);
% Inputs are re-used cyclically, so index each array modulo its length.
% The final column printed is ivalid.
for i = 0:len-1
    fprintf('%4d%4d%4d%4d%10.5f%10.5f%10.5f%3d\n', ...
            n(mod(i,ln)+1), l(mod(i,ll)+1), m(mod(i,lm)+1), k(mod(i,lk)+1), ...
            plek(i+1), pgtk(i+1), peqk(i+1), ivalid(i+1));
end

   N   L   M   K     PLEK      PGTK      PEQK
  10   2   5   1   0.77778   0.22222   0.55556  0
  40  10   3   2   0.98785   0.01215   0.13664  0
 155  35 122  22   0.01101   0.98899   0.00779  0
1000 444 500 220   0.42429   0.57571   0.04913  0