negbinomial {VGAM}R Documentation

Negative Binomial Distribution Family Function

Description

Maximum likelihood estimation of the two parameters of a negative binomial distribution.

Usage

negbinomial(lmu = "loge", lsize = "loge", emu = list(), esize = list(),
            imu = NULL, isize = NULL, quantile.probs = 0.75,
            nsimEIM = 100, cutoff = 0.995,
            Maxiter = 5000, deviance.arg = FALSE, imethod = 1,
            parallel = FALSE, shrinkage.init = 0.95, zero = -2)
polya(lprob = "logit", lsize = "loge", eprob = list(), esize = list(),
    iprob = NULL, isize = NULL, quantile.probs = 0.75, nsimEIM = 100,
    deviance.arg = FALSE, imethod = 1, shrinkage.init = 0.95, zero = -2)

Arguments

lmu, lsize, lprob

Link functions applied to the mu, k and p parameters. See Links for more choices. Note that the mu, k and p parameters are the mu, size and prob arguments of rnbinom respectively. Common alternatives for lsize are nloge and reciprocal.

emu, esize, eprob

List. Extra argument for each of the links. See earg in Links for general information.

imu, isize, iprob

Optional initial values for the mean and k and p. For k, if failure to converge occurs then try different values (and/or use imethod). For a S-column response, isize can be of length S. A value NULL means an initial value for each response is computed internally using a range of values. The last argument is ignored if used within cqo; see the iKvector argument of qrrvglm.control instead.

quantile.probs

Passed into the probs argument of quantile when imethod = 3 to obtain an initial value for the mean.

nsimEIM

This argument is used for computing the diagonal element of the expected information matrix (EIM) corresponding to k. See CommonVGAMffArguments for more information and the note below.

cutoff

Used in the finite series approximation. A numeric which is close to 1 but never exactly 1. Used to specify how many terms of the infinite series for computing the second diagonal element of the EIM are actually used. The sum of the probabilites are added until they reach this value or more (but no more than Maxiter terms allowed). It is like specifying p in an imaginary function qnegbin(p).

Maxiter

Used in the finite series approximation. Integer. The maximum number of terms allowed when computing the second diagonal element of the EIM. In theory, the value involves an infinite series. If this argument is too small then the value may be inaccurate.

deviance.arg

Logical. If TRUE, the deviance function is attached to the object. Under ordinary circumstances, it should be left alone because it really assumes the index parameter is at the maximum likelihood estimate. Consequently, one cannot use that criterion to minimize within the IRLS algorithm. It should be set TRUE only when used with cqo under the fast algorithm.

imethod

An integer with value 1 or 2 or 3 which specifies the initialization method for the mu parameter. If failure to converge occurs try another value and/or else specify a value for shrinkage.init and/or else specify a value for isize.

parallel

See CommonVGAMffArguments for more information. Setting parallel = TRUE is useful in order to get something similar to quasipoissonff or what is known as NB-1. The parallelism constraint does not apply to any intercept term. You should set zero = NULL too if parallel = TRUE to avoid a conflict.

shrinkage.init

How much shrinkage is used when initializing mu. The value must be between 0 and 1 inclusive, and a value of 0 means the individual response values are used, and a value of 1 means the median or mean is used. This argument is used in conjunction with imethod. If convergence failure occurs try setting this argument to 1.

zero

Integer valued vector, usually assigned -2 or 2 if used at all. Specifies which of the two linear/additive predictors are modelled as an intercept only. By default, the k parameter (after lsize is applied) is modelled as a single unknown number that is estimated. It can be modelled as a function of the explanatory variables by setting zero = NULL. A negative value means that the value is recycled, so setting -2 means all k are intercept-only. See CommonVGAMffArguments for more information.

Details

The negative binomial distribution can be motivated in several ways, e.g., as a Poisson distribution with a mean that is gamma distributed. There are several common parametrizations of the negative binomial distribution. The one used by negbinomial() uses the mean mu and an index parameter k, both which are positive. Specifically, the density of a random variable Y is

f(y;mu,k) = C_{y}^{y + k - 1} [mu/(mu+k)]^y [k/(k+mu)]^k

where y=0,1,2,…, and mu > 0 and k > 0. Note that the dispersion parameter is 1/k, so that as k approaches infinity the negative binomial distribution approaches a Poisson distribution. The response has variance Var(Y)=mu*(1+mu/k). When fitted, the fitted.values slot of the object contains the estimated value of the mu parameter, i.e., of the mean E(Y). It is common for some to use alpha=1/k as the ancillary or heterogeneity parameter; so common alternatives for lsize are nloge and reciprocal.

For polya the density is

f(y;p,k) = C_{y}^{y + k - 1} [1 - p]^y p^k

where y=0,1,2,…, and 0 < p < 1 and k > 0.

The negative binomial distribution can be coerced into the classical GLM framework with one of the parameters being of interest and the other treated as a nuisance/scale parameter (this is implemented in the MASS library). The VGAM family function negbinomial treats both parameters on the same footing, and estimates them both by full maximum likelihood estimation. Simulated Fisher scoring is employed as the default (see the nsimEIM argument).

The parameters mu and k are independent (diagonal EIM), and the confidence region for k is extremely skewed so that its standard error is often of no practical use. The parameter 1/k has been used as a measure of aggregation.

These VGAM family functions handle multivariate responses, so that a matrix can be used as the response. The number of columns is the number of species, say, and setting zero = -2 means that all species have a k equalling a (different) intercept only.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, rrvglm and vgam.

Warning

The Poisson model corresponds to k equalling infinity. If the data is Poisson or close to Poisson, numerical problems will occur. Possibly choosing a log-log link may help in such cases, otherwise use poissonff or quasipoissonff.

These functions are fragile; the maximum likelihood estimate of the index parameter is fraught (see Lawless, 1987). In general, the quasipoissonff is more robust. Other alternatives to negbinomial are to fit a NB-1 or RR-NB model; see Yee (2011). Assigning values to the isize argument may lead to a local solution, and smaller values are preferred over large values when using this argument.

Yet to do: write a family function which uses the methods of moments estimator for k.

Note

These two functions implement two common parameterizations of the negative binomial (NB). Some people called the NB with integer k the Pascal distribution, whereas if k is real then this is the Polya distribution. I don't. The one matching the details of rnbinom in terms of p and k is polya().

For polya() the code may fail when p is close to 0 or 1. It is not yet compatible with cqo or cao.

Suppose the response is called ymat. For negbinomial() the diagonal element of the expected information matrix (EIM) for parameter k involves an infinite series; consequently simulated Fisher scoring (see nsimEIM) is the default. This algorithm should definitely be used if max(ymat) is large, e.g., max(ymat) > 300 or there are any outliers in ymat. A second algorithm involving a finite series approximation can be invoked by setting nsimEIM = NULL. Then the arguments Maxiter and cutoff are pertinent.

Regardless of the algorithm used, convergence problems may occur, especially when the response has large outliers or is large in magnitude. If convergence failure occurs, try using arguments (in recommended decreasing order) nsimEIM, shrinkage.init, imethod, Maxiter, cutoff, isize, zero.

The function negbinomial can be used by the fast algorithm in cqo, however, setting EqualTolerances = TRUE and ITolerances = FALSE is recommended.

In the first example below (Bliss and Fisher, 1953), from each of 6 McIntosh apple trees in an orchard that had been sprayed, 25 leaves were randomly selected. On each of the leaves, the number of adult female European red mites were counted.

There are two special uses of negbinomial for handling count data. Firstly, when used by rrvglm this results in a continuum of models in between and inclusive of quasi-Poisson and negative binomial regression. This is known as a reduced-rank negative binomial model (RR-NB). It fits a negative binomial log-linear regression with variance function Var(Y) = mu + delta1 * mu^delta2 where delta1 and delta2 are parameters to be estimated by MLE. Confidence intervals are available for delta2, therefore it can be decided upon whether the data are quasi-Poisson or negative binomial, if any.

Secondly, the use of negbinomial with parallel = TRUE inside vglm can result in a model similar to quasipoissonff. This is named the NB-1 model. The dispersion parameter is estimated by MLE whereas glm uses the method of moments. In particular, it fits a negative binomial log-linear regression with variance function Var(Y) = phi0 * mu where phi0 is a parameter to be estimated by MLE. Confidence intervals are available for phi0.

Author(s)

Thomas W. Yee

References

Lawless, J. F. (1987) Negative binomial and mixed Poisson regression. The Canadian Journal of Statistics 15, 209–225.

Hilbe, J. M. (2007) Negative Binomial Regression. Cambridge: Cambridge University Press.

Bliss, C. and Fisher, R. A. (1953) Fitting the negative binomial distribution to biological data. Biometrics 9, 174–200.

Yee, T. W. (2011) Two-parameter reduced-rank vector generalized linear models. In preparation.

See Also

quasipoissonff, poissonff, zinegbinomial, posnegbinomial, invbinomial, rnbinom, nbolf, rrvglm, cao, cqo, CommonVGAMffArguments.

Examples

# Example 1: apple tree data
appletree <- data.frame(y = 0:7, w = c(70, 38, 17, 10, 9, 3, 2, 1))
fit <- vglm(y ~ 1, negbinomial, appletree, weights = w)
summary(fit)
coef(fit, matrix = TRUE)
Coef(fit)

# Example 2: simulated data with multivariate response
ndata <- data.frame(x2 = runif(nn <- 500))
ndata <- transform(ndata, y1 = rnbinom(nn, mu = exp(3+x2), size = exp(1)),
                          y2 = rnbinom(nn, mu = exp(2-x2), size = exp(0)))
fit1 <- vglm(cbind(y1, y2) ~ x2, negbinomial, ndata, trace = TRUE)
coef(fit1, matrix = TRUE)

# Example 3: large counts so definitely use the nsimEIM argument
ndata <- transform(ndata, y3 = rnbinom(nn, mu = exp(12+x2), size = exp(1)))
with(ndata, range(y3))  # Large counts
fit2 <- vglm(y3 ~ x2, negbinomial(nsimEIM = 100), ndata, trace = TRUE)
coef(fit2, matrix = TRUE)

# Example 4: a NB-1 to estimate a negative binomial with Var(Y) = phi0 * mu
nn <- 1000        # Number of observations
phi0 <- 10        # Specify this; should be greater than unity
delta0 <- 1 / (phi0 - 1)
mydata <- data.frame(x2 = runif(nn), x3 = runif(nn))
mydata <- transform(mydata, mu = exp(2 + 3 * x2 + 0 * x3))
mydata <- transform(mydata, y3 = rnbinom(nn, mu = mu, size = delta0 * mu))
## Not run: 
plot(y3 ~ x2, data = mydata, pch = "+", col = 'blue',
     main = paste("Var(Y) = ", phi0, " * mu", sep = ""), las = 1) 
## End(Not run)
nb1 <- vglm(y3 ~ x2 + x3, negbinomial(parallel = TRUE, zero = NULL),
            mydata, trace = TRUE)
# Extracting out some quantities:
cnb1 <- coef(nb1, matrix = TRUE)
mydiff <- (cnb1["(Intercept)", "log(size)"] - cnb1["(Intercept)", "log(mu)"])
delta0.hat <- exp(mydiff)
(phi.hat <- 1 + 1 / delta0.hat)  # MLE of phi
summary(nb1)
# Obtain a 95 percent confidence interval for phi0:
myvec <- rbind(-1, 1, 0, 0)
(se.mydiff <- sqrt(t(myvec) %*%  vcov(nb1) %*%  myvec))
ci.mydiff <- mydiff + c(-1.96, 1.96) * se.mydiff
ci.delta0 <- ci.exp.mydiff <- exp(ci.mydiff)
(ci.phi0 <- 1 + 1 / rev(ci.delta0)) # The 95 percent conf. interval for phi0

confint_nb1(nb1) # Quick way to get it

summary(glm(y3 ~ x2 + x3, quasipoisson, mydata))$disper # cf. moment estimator

[Package VGAM version 0.8-4 Index]