biglm.big.matrix, bigglm.big.matrix {bigmemory} | R Documentation |
This is a wrapper for Thomas Lumley's biglm package, allowing its use with data stored in big.matrix objects.
biglm.big.matrix(formula, data, chunksize=NULL, ..., fc=NULL, getNextChunkFunc=NULL)
bigglm.big.matrix(formula, data, chunksize=NULL, ..., fc=NULL, getNextChunkFunc=NULL)
formula |
a model formula. |
data |
a big.matrix or data.frame object. |
chunksize |
an integer giving the maximum size of the chunks of data to process iteratively; if this argument is not given, a suitable default is supplied |
fc |
the names of variables that are factors |
getNextChunkFunc |
a function which generates the next set of indices for the next chunk; if this argument is not given, a suitable default is supplied |
... |
the other parameters which can be specified are those supported by biglm and bigglm |
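The getNextChunkFunc argument is described above only as a function that generates the indices for the next chunk. As a rough illustration of that idea, and not the package's own implementation, a sequential index generator might be sketched in base R as a closure. The name make_chunk_indexer, its arguments, and the reset flag are all hypothetical and are not part of bigmemory or biglm.

```r
# Hypothetical helper (not part of any package): returns a closure that
# yields successive row-index vectors of at most `chunksize` rows, and
# NULL once all rows have been handed out.
make_chunk_indexer <- function(n_rows, chunksize) {
  start <- 1L
  function(reset = FALSE) {
    if (reset) {
      # Rewind so the next call starts again from the first row.
      start <<- 1L
      return(invisible(NULL))
    }
    if (start > n_rows) return(NULL)
    end <- min(start + chunksize - 1L, n_rows)
    idx <- start:end
    start <<- end + 1L
    idx
  }
}

# Walk over 10 rows in chunks of at most 4 rows.
next_chunk <- make_chunk_indexer(n_rows = 10L, chunksize = 4L)
chunks <- list()
while (!is.null(idx <- next_chunk())) {
  chunks[[length(chunks) + 1L]] <- idx
}
# chunks now holds the index vectors 1:4, 5:8 and 9:10
```

The signature that the wrapper actually expects from a user-supplied getNextChunkFunc is not stated on this page, so check the package source before adapting this sketch.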
See the biglm package for more information; chunksize defaults to floor(nrow(data)/ncol(data)^2).
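To make the default concrete, the following base R sketch evaluates the stated formula for the 150 x 5 iris data and splits the row indices accordingly. It takes the formula at face value; the variable names are illustrative only, and the installed package version may apply further bounds to the default.

```r
# Default chunk size as given by the formula above, for the iris data.
n <- nrow(iris)                        # 150 rows
p <- ncol(iris)                        # 5 columns
default_chunksize <- floor(n / p^2)    # floor(150 / 25) = 6

# Split the row indices into consecutive chunks of that size.
row_chunks <- split(seq_len(n), ceiling(seq_len(n) / default_chunksize))
length(row_chunks)                     # 25 chunks of 6 rows each
```

Because the default shrinks with the square of the column count, wide matrices are processed in many small chunks unless chunksize is set explicitly.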
Returns an object of class biglm.
Michael J. Kane
Algorithm AS274 Applied Statistics (1992) Vol. 41, No.2
Thomas Lumley (2005). biglm: bounded memory linear and generalized linear models. R package version 0.7.
# This example is quite silly, using the iris
# data. But it shows that our wrapper to Lumley's biglm() function produces
# the same answer as the plain old lm() function.

## Not run:
x <- matrix(unlist(iris), ncol=5)
colnames(x) <- names(iris)
x <- as.big.matrix(x)
head(x)

silly.biglm <- biglm.big.matrix(Sepal.Length ~ Sepal.Width + Species,
                                data=x, fc="Species")
summary(silly.biglm)

y <- data.frame(x[,])
y$Species <- as.factor(y$Species)
head(y)

silly.lm <- lm(Sepal.Length ~ Sepal.Width + Species, data=y)
summary(silly.lm)

## End(Not run)