weka.classifiers.meta
Class RandomSubSpace

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.SingleClassifierEnhancer
          extended by weka.classifiers.IteratedSingleClassifierEnhancer
              extended by weka.classifiers.RandomizableIteratedSingleClassifierEnhancer
                  extended by weka.classifiers.meta.RandomSubSpace
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler

public class RandomSubSpace
extends RandomizableIteratedSingleClassifierEnhancer
implements WeightedInstancesHandler, TechnicalInformationHandler

This method constructs a decision-tree-based classifier that maintains the highest accuracy on training data and improves generalization accuracy as it grows in complexity. The classifier consists of multiple trees constructed systematically by pseudorandomly selecting subsets of components of the feature vector; that is, each tree is built in a randomly chosen subspace.
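
The subspace selection described above can be sketched as follows. This is only an illustration under stated assumptions, not Weka's implementation: the class SubspaceSampler and its sample method are hypothetical names, attributes are represented by plain integer indices, and handling of the class attribute is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Weka: draws one random attribute
// subset per ensemble member from a single seeded generator.
class SubspaceSampler {

    /**
     * Returns one sorted list of k distinct attribute indices per iteration,
     * each drawn without replacement from 0..numAttributes-1.
     */
    static List<List<Integer>> sample(int numAttributes, int k, int iterations, long seed) {
        if (k < 1 || k > numAttributes) {
            throw new IllegalArgumentException("k must be in [1, numAttributes]");
        }
        Random random = new Random(seed);
        List<List<Integer>> subspaces = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            List<Integer> indices = new ArrayList<>();
            for (int a = 0; a < numAttributes; a++) {
                indices.add(a);
            }
            // Pseudorandom permutation; the first k entries form the subspace.
            Collections.shuffle(indices, random);
            List<Integer> subset = new ArrayList<>(indices.subList(0, k));
            Collections.sort(subset);
            subspaces.add(subset);
        }
        return subspaces;
    }

    public static void main(String[] args) {
        // Three subspaces of 4 attributes each, out of 8 attributes, seed 1.
        for (List<Integer> subspace : sample(8, 4, 3, 1L)) {
            System.out.println(subspace);
        }
    }
}
```

Each base classifier would then be trained only on the attributes in its own subset. Reusing the same seed reproduces the same subsets.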

For more information, see

Tin Kam Ho (1998). The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence. 20(8):832-844. URL http://citeseer.ist.psu.edu/ho98random.html.

BibTeX:

 @article{Ho1998,
    author = {Tin Kam Ho},
    journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
    number = {8},
    pages = {832-844},
    title = {The Random Subspace Method for Constructing Decision Forests},
    volume = {20},
    year = {1998},
    ISSN = {0162-8828},
    URL = {http://citeseer.ist.psu.edu/ho98random.html}
 }
 

Valid options are:

 -P
  Size of each subspace:
   < 1: percentage of the number of attributes
   >=1: absolute number of attributes
 
 -S <num>
  Random number seed.
  (default 1)
 -I <num>
  Number of iterations.
  (default 10)
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -W
  Full name of base classifier.
  (default: weka.classifiers.trees.REPTree)
 
 Options specific to classifier weka.classifiers.trees.REPTree:
 
 -M <minimum number of instances>
  Set minimum number of instances per leaf (default 2).
 -V <minimum variance for split>
  Set minimum numeric class variance proportion
  of train variance for split (default 1e-3).
 -N <number of folds>
  Number of folds for reduced error pruning (default 3).
 -S <seed>
  Seed for random data shuffling (default 1).
 -P
  No pruning.
 -L
  Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
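
The option list above reuses -S and -P with different meanings on either side of the -- separator: before it they set the ensemble seed and the subspace size, after it they belong to REPTree. The sketch below shows one way such an array could be divided. It is an assumption-laden illustration, not Weka's own parser: OptionSplitSketch and split are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper, not Weka's own option parser.
class OptionSplitSketch {

    /**
     * Returns a two-element array: element 0 holds the options before the
     * first "--", element 1 holds everything after it (possibly empty).
     */
    static String[][] split(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                return new String[][] {
                    Arrays.copyOfRange(options, 0, i),
                    Arrays.copyOfRange(options, i + 1, options.length)
                };
            }
        }
        // No separator: all options belong to the meta classifier.
        return new String[][] { options.clone(), new String[0] };
    }

    public static void main(String[] args) {
        String[] options = {
            "-P", "0.5", "-S", "1", "-I", "10",
            "-W", "weka.classifiers.trees.REPTree",
            "--", "-M", "2", "-N", "3"
        };
        String[][] parts = split(options);
        System.out.println("meta: " + Arrays.toString(parts[0]));
        System.out.println("base: " + Arrays.toString(parts[1]));
    }
}
```

In this example -P 0.5 is read as the subspace size because it appears before the separator. A bare -P placed after -- would instead disable pruning in REPTree.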

Version:
$Revision: 1.4 $
Author:
Bernhard Pfahringer (bernhard@cs.waikato.ac.nz), Peter Reutemann (fracpete@cs.waikato.ac.nz)
See Also:
Serialized Form

Constructor Summary
RandomSubSpace()
          Constructor.
 
Method Summary
 void buildClassifier(Instances data)
          Builds the classifier.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 java.lang.String[] getOptions()
          Gets the current settings of the Classifier.
 java.lang.String getRevision()
          Returns the revision string.
 double getSubSpaceSize()
          Gets the size of each subspace: a fraction of the number of attributes if < 1, otherwise an absolute number of attributes.
 TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 java.lang.String globalInfo()
          Returns a string describing the classifier.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
          Main method for testing this class.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setSubSpaceSize(double value)
          Sets the size of each subspace: a fraction of the number of attributes if < 1, otherwise an absolute number of attributes.
 java.lang.String subSpaceSizeTipText()
          Returns the tip text for this property.
 java.lang.String toString()
          Returns a description of the random subspace classifier.
 
Methods inherited from class weka.classifiers.RandomizableIteratedSingleClassifierEnhancer
getSeed, seedTipText, setSeed
 
Methods inherited from class weka.classifiers.IteratedSingleClassifierEnhancer
getNumIterations, numIterationsTipText, setNumIterations
 
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, getCapabilities, getClassifier, setClassifier
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

RandomSubSpace

public RandomSubSpace()
Constructor.

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing the classifier.

Returns:
a description suitable for displaying in the explorer/experimenter GUI

getTechnicalInformation

public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
Returns:
the technical information about this class

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class RandomizableIteratedSingleClassifierEnhancer
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Valid options are:

 -P
  Size of each subspace:
   < 1: percentage of the number of attributes
   >=1: absolute number of attributes
 
 -S <num>
  Random number seed.
  (default 1)
 -I <num>
  Number of iterations.
  (default 10)
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -W
  Full name of base classifier.
  (default: weka.classifiers.trees.REPTree)
 
 Options specific to classifier weka.classifiers.trees.REPTree:
 
 -M <minimum number of instances>
  Set minimum number of instances per leaf (default 2).
 -V <minimum variance for split>
  Set minimum numeric class variance proportion
  of train variance for split (default 1e-3).
 -N <number of folds>
  Number of folds for reduced error pruning (default 3).
 -S <seed>
  Seed for random data shuffling (default 1).
 -P
  No pruning.
 -L
  Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class RandomizableIteratedSingleClassifierEnhancer
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the Classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class RandomizableIteratedSingleClassifierEnhancer
Returns:
an array of strings suitable for passing to setOptions

subSpaceSizeTipText

public java.lang.String subSpaceSizeTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI

getSubSpaceSize

public double getSubSpaceSize()
Gets the size of each subspace: a fraction of the number of attributes if < 1, otherwise an absolute number of attributes.

Returns:
the subspace size, as a fraction (< 1) or an absolute number of attributes (>= 1)

setSubSpaceSize

public void setSubSpaceSize(double value)
Sets the size of each subspace: a fraction of the number of attributes if < 1, otherwise an absolute number of attributes.

Parameters:
value - the subspace size, as a fraction (< 1) or an absolute number of attributes (>= 1)
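
The -P rule from the option list (values below 1 are relative to the number of attributes, values of 1 or more are absolute counts) can be sketched as below. The rounding and the clamping to the range [1, numAttributes] are assumptions made for this illustration and are not stated in this documentation. SubspaceSizeSketch and resolve are hypothetical names.

```java
// Hypothetical helper, not part of Weka: turns a subspace size setting
// into a concrete number of attributes.
class SubspaceSizeSketch {

    /**
     * size < 1  : treated as a fraction of numAttributes.
     * size >= 1 : treated as an absolute attribute count.
     * Assumption: the result is rounded and clamped to [1, numAttributes].
     */
    static int resolve(double size, int numAttributes) {
        double raw = (size < 1.0) ? size * numAttributes : size;
        int k = (int) Math.round(raw);
        return Math.max(1, Math.min(k, numAttributes));
    }

    public static void main(String[] args) {
        System.out.println(resolve(0.5, 10));  // fraction of 10 attributes
        System.out.println(resolve(3, 10));    // absolute count
        System.out.println(resolve(20, 10));   // capped at the attribute count
    }
}
```

Under these assumptions, setSubSpaceSize(0.5) on a dataset with 10 attributes would give each base classifier 5 of them, while setSubSpaceSize(3) would give each one 3.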

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
Builds the classifier.

Overrides:
buildClassifier in class IteratedSingleClassifierEnhancer
Parameters:
data - the training data to be used for generating the classifier.
Throws:
java.lang.Exception - if the classifier could not be built successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if distribution can't be computed successfully
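
This documentation does not say how the predictions of the individual base classifiers are combined. A common choice for ensembles of this kind is to sum the per-member class distributions and normalize the result, which the sketch below illustrates. It is an assumption, not a description of this class's code, and DistributionAverageSketch and average are hypothetical names.

```java
// Hypothetical helper, not part of Weka: combines per-member class
// distributions by summing and normalizing (an assumed combination rule).
class DistributionAverageSketch {

    /** Each row is one member's class distribution for the same instance. */
    static double[] average(double[][] memberDistributions) {
        if (memberDistributions.length == 0) {
            throw new IllegalArgumentException("need at least one distribution");
        }
        int numClasses = memberDistributions[0].length;
        double[] sums = new double[numClasses];
        for (double[] dist : memberDistributions) {
            if (dist.length != numClasses) {
                throw new IllegalArgumentException("distributions differ in length");
            }
            for (int c = 0; c < numClasses; c++) {
                sums[c] += dist[c];
            }
        }
        double total = 0.0;
        for (double s : sums) {
            total += s;
        }
        // Normalize so the result sums to 1; leave all-zero input untouched.
        if (total > 0.0) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] /= total;
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        double[][] members = { {0.8, 0.2}, {0.6, 0.4} };
        System.out.println(java.util.Arrays.toString(average(members)));
    }
}
```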

toString

public java.lang.String toString()
Returns a description of the random subspace classifier.

Overrides:
toString in class java.lang.Object
Returns:
description of the random subspace classifier as a string

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class Classifier
Returns:
the revision

main

public static void main(java.lang.String[] args)
Main method for testing this class.

Parameters:
args - the options