grc {VGAM} | R Documentation |
Fits a Goodman's RC association model to a matrix of counts, and more generally, a sub-class of row-column association models.
grc(y, Rank = 1, Index.corner = 2:(1 + Rank), szero = 1, summary.arg = FALSE, h.step = 1e-04, ...) rcam(y, family = poissonff, Rank = 0, Musual = NULL, weights = NULL, Index.corner = if (!Rank) NULL else 1 + Musual * (1:Rank), rprefix = "Row.", cprefix = "Col.", szero = if (!Rank) NULL else { if (Musual == 1) 1 else setdiff(1:(Musual * ncol(y)), c(1 + (1:ncol(y)) * Musual, Index.corner)) }, summary.arg = FALSE, h.step = 0.0001, rbaseline = 1, cbaseline = 1, ...)
y |
For |
family |
A VGAM family function.
The first linear/additive predictor is fitted using main effects plus
an optional rank- |
Rank |
An integer from the set
{0,..., |
weights |
|
Index.corner |
A vector of |
rprefix, cprefix |
Character, for rows and columns resp. For labelling the indicator variables. |
szero |
An integer from the set {1,..., |
summary.arg |
Logical. If |
h.step |
A small positive value that is passed into
|
... |
Arguments that are passed into |
Musual |
The number of linear predictors of the VGAM |
rbaseline, cbaseline |
Baseline reference levels for the rows and columns. Currently stored on the object but not used. |
Goodman's RC association model fits a reduced-rank approximation
to a table of counts. The log of each cell mean is decomposed as an
intercept plus a row effect plus a column effect plus a reduced-rank
component. The latter can be collectively written A %*% t(C)
,
the product of two ‘thin’ matrices.
Indeed, A
and C
have Rank
columns.
By default, the first column and row of the interaction matrix
A %*% t(C)
is chosen
to be structural zeros, because szero = 1
.
This means the first row of A
are all zeros.
This function uses options()$contrasts
to set up the row and
column indicator variables.
In particular, Equation (4.5) of Yee and Hastie (2003) is used.
These are called Row.
and Col.
(by default) followed
by the row or column number.
The function rcam()
is more general than grc()
.
Its default is a no-interaction model of grc()
, i.e.,
rank-0 and a Poisson distribution. This means that each
row and column has a dummy variable associated with it.
The first row and column is baseline.
The power of rcam()
is that many VGAM family functions
can be assigned to its family
argument.
For example,
normal1
fits something in between a 2-way
ANOVA with and without interactions,
alaplace2
with Rank = 0
is something like
medpolish
.
Others include
zipoissonff
,
negbinomial
.
Hopefully one day all VGAM family functions will
work when assigned to the family
argument, although the
result may not have meaning.
An object of class "grc"
, which currently is the same as
an "rrvglm"
object.
Currently,
a rank-0 rcam()
object is of class rcam0-class
,
else of class "rcam"
(this may change in the future).
The function rcam()
is experimental at this stage and
may have bugs.
Quite a lot of expertise is needed when fitting and in its
interpretion thereof. For example, the constraint
matrices applies the reduced-rank regression to the first linear
predictor and the other linear predictors are intercept-only and
have a common value throughout the entire data set.
This means that family =
zipoissonff
is
appropriate but not
family =
zipoisson
.
To understand what is going on, do examine the constraint
matrices of the fitted object, and reconcile this with Equations
(4.3) to (4.5) of Yee and Hastie (2003).
The functions temporarily create a permanent data frame
called .grc.df
or .rcam.df
, which used
to be needed by summary.rrvglm()
. Then these
data frames are deleted before exiting the function.
If an error occurs, then the data frames may be present
in the workspace.
These functions set up the indicator variables etc. before calling
rrvglm
or
vglm
.
The ...
is passed into rrvglm.control
or
vglm.control
,
This means, e.g., Rank = 1
is default for grc()
.
The data should be labelled with rownames
and
colnames
.
Setting trace = TRUE
is recommended for monitoring
convergence.
Using criterion = "coefficients"
can result in slow convergence.
If summary = TRUE
, then y
can be a
"grc"
object, in which case a summary can be
returned. That is, grc(y, summary = TRUE)
is
equivalent to summary(grc(y))
.
Thomas W. Yee, with assistance from Alfian F. Hadi.
Yee, T. W. and Hastie, T. J. (2003) Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15–41.
Goodman, L. A. (1981) Association models and canonical correlation in the analysis of cross-classifications having ordered categories. Journal of the American Statistical Association, 76, 320–334.
Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information about the setting up of the indicator variables.
rrvglm
,
rrvglm.control
,
rrvglm-class
,
summary.grc
,
moffset
,
Rcam
,
Qvar
,
plotrcam0
,
alcoff
,
crashi
,
auuc
,
olympic
,
poissonff
.
# Some undergraduate student enrolments at the University of Auckland in 1990 grc1 <- grc(auuc) fitted(grc1) summary(grc1) grc2 <- grc(auuc, Rank = 2, Index.corner = c(2, 5)) fitted(grc2) summary(grc2) # 2008 Summer Olympic Games in Beijing top10 <- head(olympic, n = 10) oly1 <- with(top10, grc(cbind(gold, silver, bronze))) round(fitted(oly1)) round(resid(oly1, type = "response"), dig = 1) # Response residuals summary(oly1) Coef(oly1) # Roughly median polish rcam0 <- rcam(auuc, fam = alaplace2(tau = 0.5, intparloc = TRUE), trace = TRUE) round(fitted(rcam0), dig = 0) round(100 * (fitted(rcam0) - auuc) / auuc, dig = 0) # Discrepancy rcam0@y round(coef(rcam0, matrix = TRUE), dig = 2) print(Coef(rcam0, matrix = TRUE), dig = 3) # constraints(rcam0) names(constraints(rcam0)) # Compare with medpolish(): (med.a <- medpolish(auuc)) fv <- med.a$overall + outer(med.a$row, med.a$col, "+") round(100 * (fitted(rcam0) - fv) / fv) # Hopefully should be all 0s