clv.Scatt {clv} | R Documentation |
Computes the average scattering for clusters.
clv.Scatt(data, clust, dist="euclidean")
data |
|
clust |
integer |
dist |
chosen metric: "euclidean" (default value), "manhattan", "correlation" |
Let the scatter of a set X, denoted sigma(X), be defined as the vector of variances computed for each dimension. The average scattering for clusters is defined as:
Scatt = (1/|C|) * sum{forall i in 1:|C|} ||sigma(Ci)||/||sigma(X)||
where:
|C| | - number of clusters, |
i | - cluster id, |
Ci | - cluster with id 'i', |
X | - set with all objects, |
||x|| | - sqrt(x*x'). |
The standard deviation is defined as:
stdev = (1/|C|) * sqrt( sum{forall i in 1:|C|} ||sigma(Ci)|| )
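The two formulas above can be sketched in base R as follows. This is an illustrative reading of the definitions, not the package's own implementation: the helper names `scatter_vec`, `vec_norm` and `scatt_sketch` are hypothetical, only the Euclidean norm is covered, and the variance is assumed to be the population form (divide by n), whereas R's `var()` divides by n - 1.

```r
# Per-dimension variance vector sigma(X).
# Assumption: population variance (divide by n), not var()'s n - 1 form.
scatter_vec <- function(x) {
  x <- as.matrix(x)
  colMeans(sweep(x, 2, colMeans(x))^2)
}

# Euclidean norm ||x|| = sqrt(x * x')
vec_norm <- function(v) sqrt(sum(v^2))

# Hypothetical helper: evaluates Scatt and stdev directly from the formulas
scatt_sketch <- function(data, clust) {
  data <- as.matrix(data)
  ids <- sort(unique(clust))
  # ||sigma(Ci)|| for every cluster i
  cluster_norms <- vapply(
    ids,
    function(i) vec_norm(scatter_vec(data[clust == i, , drop = FALSE])),
    numeric(1)
  )
  n_clusters <- length(ids)
  list(
    Scatt = mean(cluster_norms) / vec_norm(scatter_vec(data)),
    stdev = sqrt(sum(cluster_norms)) / n_clusters
  )
}

# Usage on a built-in data set with an arbitrary 3-cluster labelling
clust <- as.integer(cutree(hclust(dist(iris[, 1:4])), 3))
res <- scatt_sketch(iris[, 1:4], clust)
```

Because of the variance assumption, values from this sketch may differ slightly from those returned by clv.Scatt.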
A list with three values is returned:
Scatt | - average scattering for clusters value, |
stdev | - standard deviation value, |
cluster.center | - numeric matrix where columns correspond to variables and rows to cluster centers. |
Lukasz Nieweglowski
M. Halkidi, Y. Batistakis, M. Vazirgiannis, On Clustering Validation Techniques, http://citeseer.ist.psu.edu/513619.html
# load and prepare data
library(clv)
data(iris)
iris.data <- iris[,1:4]

# cluster data
agnes.mod <- agnes(iris.data) # create cluster tree
v.pred <- as.integer(cutree(agnes.mod,5)) # "cut" the tree

# compute Scatt index
scatt <- clv.Scatt(iris.data, v.pred)