fitvario {RandomFields} | R Documentation |
The function estimates arbitrary parameters of a random field specification with various methods.
fitvario(x, y=NULL, z=NULL, T=NULL, data, model, param, lower=NULL, upper=NULL, sill=NA, grid=!missing(gridtriple), gridtriple, ...) fitvario.default(x, y=NULL, z=NULL, T=NULL, data, model, param, grid=!missing(gridtriple), gridtriple=FALSE, trend = NULL, BC.lambda, ## if missing then no BoxCox-Trafo BC.lambdaLB=-10, BC.lambdaUB=10, lower=NULL, upper=NULL, sill=NA, use.naturalscaling=FALSE, PrintLevel, optim.control=NULL, bins=20, nphi=1, ntheta=1, ntime=20, distance.factor=0.5, upperbound.scale.factor=3, lowerbound.scale.factor=3, lowerbound.scale.LS.factor=5, upperbound.var.factor=10, lowerbound.var.factor=100, lowerbound.sill=1E-10, scale.max.relative.factor=1000, minbounddistance=0.001, minboundreldist=0.02, approximate.functioncalls=50, refine.onborder=TRUE, minmixedvar=1/1000, maxmixedvar=1000, pch=RFparameters()$pch, transform=NULL, standard.style=NULL, var.name="X", time.name="T", lsq.methods=c("self", "plain", "sqrt.nr", "sd.inv", "internal"), mle.methods=c("ml"), cross.methods=NULL, users.guess=NULL, only.users = FALSE, Distances=NULL, truedim, solvesigma = NA, # if NA then use algorithm -- ToDo allowdistanceZero = FALSE, na.rm = TRUE)
x |
(n x 2)-matrix of coordinates, or vector of
x-coordinates. All locations must be given explicitely and
cannot be passed via a grid definition
as in |
y |
vector of y coordinates |
z |
vector of z coordinates |
T |
vector of T coordinates; these coordinates are given in
triple notation, see |
data |
vector or matrix of values measured at If an |
model |
string or list;
covariance model, see If |
param |
vector or matrix or NULL.
If vector then
|
lower |
list or vector. Lower bounds for the parameters.
If If |
upper |
list or vector. Upper bounds for the parameters. See also lower. |
sill |
If not |
grid |
boolean. Weather coordinates give a grid |
gridtriple |
boolean. Format, see |
BC.lambda |
a vector of at most two numerical components (just one component
corresponds to two identical ones) which are the parameters of the
box-cox-transformation:
(x^λ_1-1)/λ_1+λ_2
If the model is univariate, the first parameter can be estimated by
using |
BC.lambdaLB |
lower bound for the first box-cox-parameter |
BC.lambdaUB |
upper bound for the first box-cox-parameter |
trend |
If a univariate model is used, the following trend types are possible: In an n-variate model trend can be either a list of n
trends for univariate models or a list of n*d
matrices (d: number of independent realisations) where each
entry of |
... |
arguments as given in |
use.naturalscaling |
logical. Only used if model is given in standard (simple) way.
If |
PrintLevel |
level to which messages are shown. See Details. |
optim.control |
control list for |
bins |
number of bins of the empirical variogram. See Details. |
nphi |
scalar or vector of 2 components.
If it is a vector then the first component gives the first angle
of the xy plane
and the second one gives the number of directions on the half circle.
If scalar then the first angle is assumed to be zero.
Note that a good estimation of the variogramm by LSQ with a
anisotropic model a large value for |
ntheta |
scalar or vector of 2 components. If it is a vector then the first component gives the first angle in the third direction and the second one gives the number of directions on the half circle. If scalar then the first angle is assumed to be zero. Note that a good estimation of the variogramm by LSQ with a
anisotropic model a large value for |
ntime |
scalar or vector of 2 components.
if |
distance.factor |
relative right bound for the bins. See Details. |
upperbound.scale.factor |
relative upper bound for scale in LSQ and MLE. See Details. |
lowerbound.scale.factor |
relative lower bound for scale in MLE. See Details. |
lowerbound.scale.LS.factor |
relative lower bound for scale in LSQ. See Details. |
upperbound.var.factor |
relative upper bound for variance and nugget. See Details. |
lowerbound.var.factor |
relative lower bound for variance. See Details. |
lowerbound.sill |
absolute lower bound for variance and nugget. See Details. |
scale.max.relative.factor |
relative lower bound for scale below which an additional nugget effect is detected. See Details. |
minbounddistance |
absolute distance to the bounds below which a part of the algorithm is considered as having failed. See Details. |
minboundreldist |
relative distance to the bounds below which a part of the algorithm is considered as having failed. See Details. |
approximate.functioncalls |
approximate evaluations of the ML target function on a grid. See Details. |
refine.onborder |
logical.
If |
minmixedvar |
lower bound for variance in a mixed model; so, the covariance model for mixed model part might be calibrated appropriately |
maxmixedvar |
upper bound for variance in a mixed model; so, the covariance model for mixed model part might be calibrated appropriately |
pch |
character shown before evaluating any method;
if |
var.name |
basic name for the coordinates in the formula of the trend. Default: ‘X’ |
time.name |
basic name for the time component in the formula of the trend. Default: ‘X’ |
transform |
function.
Essentially, Note that the mean and the trend of the model can be neither set nor used in
Note further that many internal checks cannot be performed in case
of the very flexible function transform. Hence, it is completely up
to the user to get Default: |
standard.style |
logical or |
lsq.methods |
variants of the least squares fit of the variogram. See Details. |
mle.methods |
variants of the maximum likelihood fit of the covariance function. See Details. |
cross.methods |
Not implemented yet. |
users.guess |
User's guess of the parameters. All the parameters must be given
using the same rules as for either |
only.users |
boolean. If true then only users.guess is used as a starting point for the fitting algorithms |
Distances |
alternatively to coordinates
|
truedim |
see Distances |
solvesigma |
Boolean – experimental stage!
If a mixed effect part is present where the variance
has to be estimated, then this variance parameter is solved
iteratively within the profile likelihood function, if
|
allowdistanceZero |
boolean. If true, then multiple observations are allowed within a single data set. In this case, the coordinates are slightly scattered, so that the points have some tiny distances. |
na.rm |
boolean – experimental stage.
Only the data may have missing values.
If |
The optimisations are performed using optimize
if one
parameter has to be estimated only and optim
, otherwise.
First, by means of various control parameters, see below, the algorithm
first tries to estimate the bounds for the parameters to be estimated,
if the bounds for the parameters are not given.
Independently whether users.guess
is given,
the algorithm guesses initial values for the parameters.
The automatic guess and the user's guess will be called
primitive methods in the following.
Second, the variogram model is fitted by various least squares
methods (according to the value of lsq.methods
) using
the best parameter set among the primitive methods as initial value
if the effective number of parameters is greater than 1.
[Remarks: (i) “best” with respect to the target value of
the respective lsq method;
(ii) the effective number of parameters in the optimisation algorithm
can be smaller than the number of estimated parameters, since in some
cases, some parameters can be calculated explicitely;
relevant for the choice between
optimize
and optim
is the
effective number of parameters; (iii) optim
needs]
Third, the model is fitted by various maximum likelihood methods
(according to the value of mle.methods
) using
the best parameter set among the primitive methods and the lsq methods
as initial value (if the effective number of parameters is greater
than 1).
Comments on specific parameters:
BC.lambda
If you want to estimate BC.lambda
you should assert that all data values are positive;
otherwise errors will probably occur because of the box-cox-transformation.
The second parameter of the box-cox-transformation cannot be estimated since it corresponds to the mean.
So the mean should be estimated instead.
trend
Among the formes mentioned above it is possible to use just one matrix for the trend
instead of a list of identical ones.
lower
The lower bounds are technical bounds that should not
really restrict the domaine of the value.
However, if these values are too small the optimisation
algorithm will frequently run into local minima or get
stuck close the border of the parameter domain.
It is advised to limit seriously the domain of the additional
parameters of the covariance model and/or the total number of
parameters to be estimated, if “many” parameters
of the covariance model are estimated.
If the model is given in standard form, the user may supply
the lower bounds for the whole parameter vector, or only for
the additional form parameters of the model.
The lower bound for the mean will be ignored.
lower
may contain NA
s, then these values
are generated by the
If a nested model is given, the bounds may again be supplied for all parameters or only for the additional form parameters of the model. The bounds given apply uniformely to all submodels of the nested model.
If the model
is given in list format, then
lower
is a list, where components may be missing
or NA
. These are generated by the algorithm, then.
If lower
is NULL
all lower bounds are generated
automatically.
upper.kappa
See lower.kappa
.
sill
Additionally to estimating nugget
and variance
separately, they may also be estimated together under the
condition that nugget
+ variance
= sill
.
For the latter a finite value for sill
has to be supplied,
and nugget
and variance
are set to NA
.
sill
is only used for the standard model.
use.naturalscaling
logical. If TRUE
then internally, rescaled
covariance functions will be used for which
cov(1)~=0.05. However this parameter
does not influence
the output of fitvario
: the parameter vector
returned by fitvario
refers
always to the standard covariance model as given in
CovarianceFct
. (In contrast to PracticalRange
in RFparameters
.)
Advantages if use.naturalscaling=TRUE
:
scale
and the shape parameter of a parameterised
covariance model can be estimated better if they are estimated
simultaneously.
The estimated bounds calculated by means of
upperbound.scale.factor
and lowerbound.scale.factor
,
etc. might be more realistic.
in case of anisotropic models, the inverse of the elements of the anisotropy matrix should be in the above bounds.
Disadvantages if use.naturalscaling=TRUE
:
For some covariance models with additional parameters, the
rescaling factor has to be determined numerically.
Then, more time is needed to perform fitvario
.
Default: TRUE
.
PrintLevel
0 : no message
1 : error messages
2 : warnings
3 : minimum debugging information
5 : extended debugging information, including graphics
Default: 0
.
trace.optim
see control parameter trace
of
optim
. Default: 0
.
bins
vector of explicit boundaries for the bins or the
number of bins for the empirical variogram (used in the
LSQ target function, which is described at the beginning
of the Details).
Note that for anisotropic models, the value of bins
might
be enlarged.
Default: 20
.
distance.factor
right boundary of the last
bin is calculated as distance.factor
* (maximum distance
between all pairs of points). Only used if bins
is a scalar.
Default: 0.5
.
upperbound.scale.factor
The upper bound for the scale is determined
as upperbound.scale.factor
* (maximum distance
between all pairs of points).
Default: 10
.
lowerbound.scale.factor
The lower bound for the scale is determined
as
.
Default: 20
.
lowerbound.scale.LS.factor
For the LSQ target function a different lower bound
for the scale is used. It is determined as
.
Default: 5
.
upperbound.var.factor
The upper bound for the variance and the nugget is determined
as
\code{upperbound.var.factor} * var(\code{data}).
Default: 10
.
lowerbound.var.factor
The lower bound for the variance and the nugget is determined
as
var(\code{data}) / \code{lowerbound.var.factor}.
If a standard model definition is given and
either the nugget or the variance is fixed,
the parameter to be estimated
must also be greater than lowerbound.sill
.
If a non-standard model definition is given
then lowerbound.var.factor
is only used
for the first model; the other lower bounds for the
variance are zero.
Default: 100
.
lowerbound.sill
See lowerbound.var.factor
.
Default: 1E-10
.
scale.max.relative.factor
If the initial scale value for the ML estimation
obtained by the LSQ target function is
less than
a warning is given that probably a nugget effect
is present.
Note: if scale.max.relative.factor
is greater
than lowerbound.scale.LS.factor
then
no warning is given as
the scale has the lower bound (minimum distance
between different pairs of points) /
lowerbound.scale.LS.factor
.
Default: 1000
.
minbounddistance
If any value of the parameter vector
returned from the ML estimation
is closer than minbounddistance
to any of the bounds or if any value
has a relative distance smaller than
minboundreldist
, then it is assumed that
the MLE algorithm has dropped into a local minimum,
and it will be continued with evaluating the
ML target function on a grid, cf. the beginning paragraphs
of the Details.
Default: 0.001
.
minboundreldist
See minbounddistance
.
Default: 0.02
.
approximate.functioncalls
In case the parameter vector is too close to the given
bounds, the ML target function is evaluated on a grid
to get a new initial value for the ML estimation.
The number of points of the grid is approximately
approximate.functioncalls
.
Default: 50
.
lsq.methods
Variogram fit by least squares methods; first, a preliminary
trend is estimated
by a simple regression; second, the variogram is fitted; third,
the trend is fitted using the estimated covariance structure.
"self"
weighted lsq. Weights are the values of the
fitted variogram to the power of -2
"plain"
model fitted by least squares; trends are never taken
into account
"sqrt.nr"
weighted lsq. Weight is the square root
of the number
of points in the bin
"sd.inv"
weighted lsq. Weight is the inverse the
standard deviation of the
variogram cloud within the bin
mle.methods
Model fit by various maximum likelihood methods
(according to the value of mle.methods
) using
the best parameter set among the primitive methods and the lsq methods
as initial value (if the effective number of parameters is greater
than 1). If the best parameter vector of the MLE found so far is too close
to some given bounds, see the specific parameters above, it is
assumed that
optim
ran into a local minimum because of a bad starting
value.
In this case and if refine.onborder=TRUE
the MLE target function is calculated on a grid, the
best parameter vector is taken, and the optimisation is restarted with
this parameter vector.
"ml"
maximum likelihood;
since ML and REML give the same result if there are not any
covariates, ML is performed in that case, independently whether
it is given or not.
"reml"
restricted maximum likelihood
The function returns a list with the following elements
ev |
list returned by |
table |
Matrix. The first rows contain the estimated parameters, followed by the target values of all methods for the given set of parameters; the last rows give the lower and upper bounds used in the estimations. The columns correspond to the various estimation methods for the parameters. |
lowerbounds |
lower bounds |
lowerbounds |
upper bounds |
transform |
transformation function |
vario |
obsolete |
self |
list containing
|
plain, sqrt.nr, sd.inv, internal, ml, reml |
see |
Thanks to Paulo Ribeiro for hints and comparing the perliminary versions
of fitvario
in RandomFields V1.0 to
likfit
of the package geoR
whose homepage is at
http://www.est.ufpr.br/geoR/.
This function does not depend on the value of
RFparameters
()$PracticalRange
.
The function fitvario
always uses the standard specification
of the covariance model as given in CovarianceFct
.
Further, the function has implemented accelerations if the model
is simple. E.g., if there is a common variance to estimated
and the definition by lists is used, then the leading model should
be ‘$’ with var=NA
.
Martin Schlather, martin.schlather@math.uni-goettingen.de http://www.stochastik.math.uni-goettingen.de/~schlather
Least squares and mle methods
Cressie, N.A.C. (1993) Statistics for Spatial Data. New York: Wiley.
Related software
Ribeiro, P. and Diggle, P. (2001) Software for geostatistical analysis
using R and S-PLUS: geoR and geoS, version 0.6.15.
http://www.maths.lancs.ac.uk/~ribeiro/geoR.html.
REML (rml)
LaMotte, L.R. (2007) A direct derivation of the REML likelihood function Statistical Papers 48, 321-327.
Covariance
,
CovarianceFct
,
GetPracticalRange
,
parampositions
RandomFields
,
weather
.
model <-"gencauchy" param <- c(0, 1, 0, 1, 1, 2) estparam <- c(0, NA, 0, NA, NA, 2) ## NA means: "to be estimated" ## sequence in `estparam' is ## mean, variance, nugget, scale, (+ further model parameters) ## So, mean, variance, and scale will be estimated here. ## Nugget is fixed and equals zero. points <- 100 x <- runif(points,0,3) y <- runif(points,0,3) ## 300 random points in square [0, 3]^2 ## simulate data according to the model: d <- GaussRF(x=x, y=y, grid=FALSE, model=model, param=param, n=1000) #1000 ## fit the data: Print(fitvario(x=cbind(x,y), data=d, model=model, param=estparam, lower=c(0.1, 0.1, 0.1), upper=c(1.9, 5, 2))) ######################################################### ## The next two estimations give about the same result. ## For the first the sill is fixed to 1.5. For the second the sill ## is reached if the estimated variance is smaller than 1.5 estparam <- c(0, NA, NA, NA, NA, NA) ## Not run: Print(v <- fitvario(x=cbind(x,y), data=d, model=model, param=estparam, sill=1, use.nat=FALSE)) ## gencauchy works better with use.nat=FALSE ## End(Not run) estmodel <- list("+", list("$", var=NA, scale=NA, list("gencauchy", alpha=NA, beta=NA) ), list("$", var=NA, list("nugget")) ) parampositions(model=estmodel, dim=2) f <- function(variab) c(variab, max(0, 1.0 - variab[1])) ## Not run: Print(v2 <- fitvario(x=cbind(x,y), data=d, model=estmodel, lower = c(TRUE, TRUE, TRUE, TRUE, FALSE), transform=f, use.nat=FALSE)) ## End(Not run) ######################################################### ## estimation of coupled parameters (alpha = beta, here) # source("RandomFields/tests/source.R") f <- function(param) param[c(1:3,3,4)] ## Not run: Print(fitvario(x=cbind(x,y), data=d, model=estmodel, lower=c(TRUE, TRUE, TRUE, FALSE, TRUE), transform=f)) ## End(Not run) ######################################################### ## estimation in a anisotropic framework x <- y <- (1:6)/4 model <- list("$", aniso=matrix(nc=2, c(4,2,-2,1)), var=1.5, list("exp")) z <- GaussRF(x=x, y=y, grid=TRUE, model=model, n=10) estmodel <-list("$", aniso=matrix(nc=2, c(NA,NA,-2,1)), var=NA, list("exp")) Print(fitvario(as.matrix(expand.grid(x, y)), data=z, model=estmodel, nphi=20)) ######################################################### ## estimation with trend (formula) model <- list("$", var=1, scale=2, list("gauss")) estmodel <- list("$", var=NA, scale=NA, list("gauss")) x <- seq(-pi,pi,pi/2) n <- 5 data <- GaussRF(x, x, gridtri=FALSE, model=model, trend=function(X1,X2) sin(X1) + 2*cos(X2),n=n) Print(v <- fitvario(x, x, data=data, gridtrip=FALSE, model=estmodel, trend=~sin(X1)+cos(X1)+sin(X2)+cos(X2))) ######################################################## ## estimation of anisotropy matrix with two identical ## ## diagonal elements ## ## Not run: x <- c(0, 5, 0.4) model <- list("$", var=1, scale=1, list("exponential")) z <- GaussRF(x, x, x, model=model, gridtriple=TRUE, n=10, Print=2) est.model <- list("+", list("$", var=NA, aniso=diag(c(NA, NA, NA)), list("exponen")), list("$", var=NA, list("nugget"))) parampositions(est.model, dim=3) trafo <- function(variab) {variab[c(1:2, 2:4)]} lower <- c(TRUE, TRUE, FALSE, TRUE, TRUE) # which parameter to be estimated fitlog <- fitvario(x, x, x, gridtriple=TRUE, data=z, model=est.model, transform=trafo, lower=lower) str(fitlog$ml) ## End(Not run)