MLPACK
1.0.10
This class implements a decision stump. More...
Public Member Functions | |
DecisionStump (const MatType &data, const arma::Row< size_t > &labels, const size_t classes, size_t inpBucketSize) | |
Constructor. More... | |
DecisionStump (const DecisionStump<> &ds) | |
const arma::Col< size_t > | BinLabels () const |
Access the labels for each split bin. More... | |
arma::Col< size_t > & | BinLabels () |
Modify the labels for each split bin (be careful!). More... | |
void | Classify (const MatType &test, arma::Row< size_t > &predictedLabels) |
Classification function. More... | |
const arma::vec & | Split () const |
Access the splitting values. More... | |
arma::vec & | Split () |
Modify the splitting values (be careful!). More... | |
int | SplitAttribute () const |
Access the splitting attribute. More... | |
int & | SplitAttribute () |
Modify the splitting attribute (be careful!). More... | |
Private Member Functions | |
template<typename AttType , typename LabelType > | |
double | CalculateEntropy (arma::subview_row< LabelType > labels) |
Calculate the entropy of the given attribute. More... | |
template<typename rType > | |
rType | CountMostFreq (const arma::Row< rType > &subCols) |
Count the most frequently occurring element in subCols. More... | |
template<typename rType > | |
int | IsDistinct (const arma::Row< rType > &featureRow) |
Returns 1 if the values of featureRow are not all the same. More... | |
void | MergeRanges () |
After the "split" matrix has been set up, merge ranges with identical class labels. More... | |
double | SetupSplitAttribute (const arma::rowvec &attribute, const arma::Row< size_t > &labels) |
Sets up the given attribute as if it were the splitting attribute and computes the entropy of that split. More... | |
template<typename rType > | |
void | TrainOnAtt (const arma::rowvec &attribute, const arma::Row< size_t > &labels) |
After having decided the attribute on which to split, train on that attribute. More... | |
Private Attributes | |
arma::Col< size_t > | binLabels |
Stores the labels for each splitting bin. More... | |
size_t | bucketSize |
Size of bucket while determining splitting criterion. More... | |
size_t | numClass |
Stores the number of classes. More... | |
arma::vec | split |
Stores the splitting values after training. More... | |
int | splitAttribute |
Stores the index of the attribute (dimension) on which to split. More... | |
This class implements a decision stump.
It constructs a single level decision tree, i.e., a decision stump. It uses entropy to decide splitting ranges.
The stump is parameterized by a splitting attribute (the dimension on which points are split), a vector of bin split values, and a vector of labels for each bin. Bin i is specified by the range [split[i], split[i + 1]). The last bin extends up to ∞ (split[i + 1] does not exist in that case). Points that fall below the first bin take the label of the first bin.
MatType | Type of matrix that is being used (sparse or dense). |
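The bin layout described above can be illustrated with a minimal sketch. It uses `std::vector` instead of Armadillo types so that it stands alone, and the helpers `FindBin` and `LabelFor` are hypothetical names introduced here for illustration; they are not part of mlpack and may not match how `Classify()` is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): return the index of the bin that
// contains `value`, where bin i covers [split[i], split[i + 1]) and the last
// bin is unbounded above.
std::size_t FindBin(const std::vector<double>& split, const double value)
{
  assert(!split.empty());

  std::size_t bin = 0;
  // Advance while the next bin's lower bound is still <= value.
  while (bin + 1 < split.size() && value >= split[bin + 1])
    ++bin;

  // Values below split[0] never advance, so they fall into bin 0.
  return bin;
}

// Hypothetical helper: look up the label stored for the bin containing `value`.
std::size_t LabelFor(const std::vector<double>& split,
                     const std::vector<std::size_t>& binLabels,
                     const double value)
{
  assert(split.size() == binLabels.size());
  return binLabels[FindBin(split, value)];
}
```

With split values {1, 3, 5}, a value of 0 lands in bin 0 (below the first bin), 3 lands in bin 1 (lower bounds are inclusive), and any value of 5 or more lands in bin 2.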
Definition at line 44 of file decision_stump.hpp.
mlpack::decision_stump::DecisionStump< MatType >::DecisionStump | ( | const MatType & | data, |
const arma::Row< size_t > & | labels, | ||
const size_t | classes, | ||
size_t | inpBucketSize | ||
) |
Constructor.
Trains on the provided data and generates a decision stump from it.
data | Input, training data. |
labels | Labels of training data. |
classes | Number of distinct classes in labels. |
inpBucketSize | Minimum size of bucket when splitting. |
mlpack::decision_stump::DecisionStump< MatType >::DecisionStump | ( | const DecisionStump<> & | ds | ) |
const arma::Col< size_t > mlpack::decision_stump::DecisionStump< MatType >::BinLabels ( ) const
inline
Access the labels for each split bin.
Definition at line 100 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::binLabels.
arma::Col< size_t > & mlpack::decision_stump::DecisionStump< MatType >::BinLabels ( )
inline
Modify the labels for each split bin (be careful!).
Definition at line 102 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::binLabels.
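template<typename AttType , typename LabelType > double mlpack::decision_stump::DecisionStump< MatType >::CalculateEntropy ( arma::subview_row< LabelType > labels )
private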
Calculate the entropy of the given attribute.
attribute | The attribute of which we calculate the entropy. |
labels | Corresponding labels of the attribute. |
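As an illustration of the quantity involved, the sketch below computes the standard Shannon entropy (in bits) of a label vector. This is an assumption about the general technique only: the helper `LabelEntropy` is hypothetical, uses `std::vector` rather than `arma::subview_row`, and mlpack's own `CalculateEntropy()` may normalize or weight the result differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not mlpack's implementation): Shannon entropy, in bits,
// of a label vector whose entries lie in [0, numClasses).
double LabelEntropy(const std::vector<std::size_t>& labels,
                    const std::size_t numClasses)
{
  if (labels.empty())
    return 0.0;

  // Count how many times each class label occurs.
  std::vector<std::size_t> counts(numClasses, 0);
  for (const std::size_t label : labels)
  {
    assert(label < numClasses);
    ++counts[label];
  }

  double entropy = 0.0;
  for (const std::size_t count : counts)
  {
    if (count == 0)
      continue; // 0 * log(0) is taken to be 0.
    const double p = static_cast<double>(count) / labels.size();
    entropy -= p * std::log2(p);
  }
  return entropy;
}
```

A bin containing a single class has entropy 0, and a bin split evenly between two classes has entropy 1 bit, so lower values indicate purer bins.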
void mlpack::decision_stump::DecisionStump< MatType >::Classify | ( | const MatType & | test, |
arma::Row< size_t > & | predictedLabels | ||
) |
Classification function.
After training, classify test, and put the predicted classes in predictedLabels.
test | Testing data or data to classify. |
predictedLabels | Vector to store the predicted classes after classifying test data. |
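template<typename rType > rType mlpack::decision_stump::DecisionStump< MatType >::CountMostFreq ( const arma::Row< rType > & subCols )
private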
Count the most frequently occurring element in subCols.
subCols | The vector in which to find the most frequently occurring element. |
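A minimal sketch of finding the most frequent element is shown below. The helper `MostFrequent` is a hypothetical name, operates on `std::vector` rather than `arma::Row`, and its tie-breaking rule (smallest element wins) is a choice made for this example; the behavior of mlpack's `CountMostFreq()` on ties is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper (not mlpack's implementation): return the element that
// occurs most often in `values`. Ties go to the smallest element, because
// std::map iterates in ascending key order and only a strictly larger count
// replaces the current best.
template<typename T>
T MostFrequent(const std::vector<T>& values)
{
  assert(!values.empty());

  std::map<T, std::size_t> counts;
  for (const T& v : values)
    ++counts[v];

  T best = counts.begin()->first;
  std::size_t bestCount = 0;
  for (const auto& entry : counts)
  {
    if (entry.second > bestCount)
    {
      best = entry.first;
      bestCount = entry.second;
    }
  }
  return best;
}
```

In a stump, a routine like this would typically pick the label assigned to a bin from the labels of the training points that fall into it.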
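template<typename rType > int mlpack::decision_stump::DecisionStump< MatType >::IsDistinct ( const arma::Row< rType > & featureRow )
private
Returns 1 if the values of featureRow are not all the same.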
featureRow | The attribute which is checked for identical values. |
void mlpack::decision_stump::DecisionStump< MatType >::MergeRanges ( )
private
After the "split" matrix has been set up, merge ranges with identical class labels.
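The merging step can be sketched as follows, assuming that `split[i]` holds the lower bound of bin i and that `binLabels[i]` holds its label. The helper `MergeAdjacentBins` is hypothetical and uses `std::vector`; it illustrates the idea of collapsing adjacent bins that share a label and is not taken from mlpack's source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not mlpack's implementation): collapse runs of adjacent
// bins that share a label. `split[i]` is the lower bound of bin i, so a merged
// run keeps the lower bound and label of its first bin.
void MergeAdjacentBins(std::vector<double>& split,
                       std::vector<std::size_t>& binLabels)
{
  assert(split.size() == binLabels.size());
  if (split.empty())
    return;

  std::size_t kept = 0; // Index of the last bin kept so far.
  for (std::size_t i = 1; i < split.size(); ++i)
  {
    // Start a new bin only when the label changes.
    if (binLabels[i] != binLabels[kept])
    {
      ++kept;
      split[kept] = split[i];
      binLabels[kept] = binLabels[i];
    }
  }
  split.resize(kept + 1);
  binLabels.resize(kept + 1);
}
```

For example, lower bounds {0, 1, 2, 3, 4} with labels {0, 0, 1, 1, 0} reduce to lower bounds {0, 2, 4} with labels {0, 1, 0}. Predictions are unchanged, because every merged bin had the same label as its neighbor.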
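double mlpack::decision_stump::DecisionStump< MatType >::SetupSplitAttribute ( const arma::rowvec & attribute, const arma::Row< size_t > & labels )
private
Sets up the given attribute as if it were the splitting attribute and computes the entropy of that split.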
attribute | A row from the training data, which might be a candidate for the splitting attribute. |
const arma::vec & mlpack::decision_stump::DecisionStump< MatType >::Split ( ) const
inline
Access the splitting values.
Definition at line 95 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::split.
arma::vec & mlpack::decision_stump::DecisionStump< MatType >::Split ( )
inline
Modify the splitting values (be careful!).
Definition at line 97 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::split.
int mlpack::decision_stump::DecisionStump< MatType >::SplitAttribute ( ) const
inline
Access the splitting attribute.
Definition at line 90 of file decision_stump.hpp.
int & mlpack::decision_stump::DecisionStump< MatType >::SplitAttribute ( )
inline
Modify the splitting attribute (be careful!).
Definition at line 92 of file decision_stump.hpp.
References mlpack::decision_stump::DecisionStump< MatType >::splitAttribute.
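template<typename rType > void mlpack::decision_stump::DecisionStump< MatType >::TrainOnAtt ( const arma::rowvec & attribute, const arma::Row< size_t > & labels )
private
After having decided the attribute on which to split, train on that attribute.
attribute | The attribute chosen by the constructor, on which the decision stump is now trained. |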
arma::Col< size_t > mlpack::decision_stump::DecisionStump< MatType >::binLabels
private
Stores the labels for each splitting bin.
Definition at line 118 of file decision_stump.hpp.
Referenced by mlpack::decision_stump::DecisionStump< MatType >::BinLabels().
size_t mlpack::decision_stump::DecisionStump< MatType >::bucketSize
private
Size of bucket while determining splitting criterion.
Definition at line 112 of file decision_stump.hpp.
size_t mlpack::decision_stump::DecisionStump< MatType >::numClass
private
Stores the number of classes.
Definition at line 106 of file decision_stump.hpp.
|
private |
Stores the splitting values after training.
Definition at line 115 of file decision_stump.hpp.
Referenced by mlpack::decision_stump::DecisionStump< MatType >::Split().
|
private |
Stores the value of the attribute on which to split.
Definition at line 109 of file decision_stump.hpp.
Referenced by mlpack::decision_stump::DecisionStump< MatType >::SplitAttribute().