If you have stumbled on this page, I assume there is still no R package for Hellinger decision trees.
When I wanted to build a model using a Hellinger tree, my natural instinct was to search for an R package. When I found none, I searched for an implementation in any other language, but I couldn’t find any code snippets either. So this post is here to save you some valuable time.
Using this jar, I updated the base Weka jar and found the Hellinger tree under the Classify tab in the Weka Explorer, but somehow I couldn’t get it to work, maybe because I was new to Weka. I didn’t have the time to learn Weka, and I needed to build the model quickly using only a Hellinger distance decision tree. So I used the jar to build the tree in our old dependable friend, Java.
Weka classifiers work best with .arff files. Data can be supplied to them in other formats, but .arff is Weka’s native format. We can use the RWeka package in R to export the training, test and other datasets as .arff files.
library(RWeka)
write.arff(data_train, "train.arff", eol = "\n")
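If you have never opened an .arff file, it may help to see what one looks like before loading it. The sketch below is a hypothetical helper (it is not part of Weka or RWeka) that renders a tiny dataset as ARFF text, with a nominal class attribute in the first column, which is the layout the Java code in this post expects. The class name, method name and attribute names are made up for illustration, and the file written by write.arff will differ in its details.

```java
import java.util.List;

// Hypothetical helper, not part of Weka: renders a small dataset as ARFF text.
final class ArffSketch {

    // Builds ARFF text with a nominal class attribute first, followed by numeric attributes.
    // Each row must list the class label first, then one value per numeric attribute.
    static String toArff(String relation, List<String> classLabels,
                         List<String> numericNames, List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        // Header: dataset name, then one @attribute line per column
        sb.append("@relation ").append(relation).append("\n\n");
        sb.append("@attribute class {").append(String.join(",", classLabels)).append("}\n");
        for (String name : numericNames) {
            sb.append("@attribute ").append(name).append(" numeric\n");
        }
        // Body: one comma-separated line per instance
        sb.append("\n@data\n");
        for (String[] row : rows) {
            sb.append(String.join(",", row)).append("\n");
        }
        return sb.toString();
    }
}
```

Calling toArff with the labels neg and pos and a couple of rows produces a header of @relation and @attribute lines followed by an @data section with one line per instance. Since the class attribute is the first column, it sits at index 0 in Weka.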
Now for the sample Java code for a Hellinger tree based model:
import weka.classifiers.Evaluation;
import weka.classifiers.trees.HTree;
import weka.core.Instances;

import java.io.BufferedReader;
import java.io.FileReader;

public class hellingerDT {

    // Reads an .arff file and sets the first column as the class attribute,
    // i.e. the variable the model is trained to predict.
    private static Instances load(String path) throws Exception {
        // try-with-resources closes the reader once the data has been read
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            Instances data = new Instances(reader);
            data.setClassIndex(0);
            return data;
        }
    }

    public static void main(String[] args) throws Exception {
        // Read the training, test and to-be-classified data from .arff files
        Instances train = load("location/train.arff");
        Instances test = load("location/test.arff");
        Instances classify = load("location/classify.arff");

        // Instantiate a Hellinger tree model
        HTree hT = new HTree();

        // Train the model
        hT.buildClassifier(train);

        // Evaluate the model on the test data
        Evaluation eval = new Evaluation(train);
        eval.evaluateModel(hT, test);

        // Display the metrics; precision and recall are reported for the class at index 1
        System.out.println(eval.toSummaryString("Review Classification Hellinger Tree", true));
        System.out.println("Precision " + eval.precision(1) * 100 + " and Recall " + eval.recall(1) * 100);

        // Print the tree
        System.out.println(hT.graph());

        // Classify new data; classifyInstance returns the index of the predicted class as a double
        for (int i = 0; i < classify.numInstances(); i++) {
            double pred = hT.classifyInstance(classify.instance(i));
            System.out.println(pred);
        }
    }
}
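The argument to eval.precision(1) and eval.recall(1) is a class index, so the numbers printed above are for the second label of the class attribute. To make those two metrics concrete, here is a minimal sketch of how they can be computed by hand from actual and predicted class indices. It is a hypothetical helper, not Weka's own code, and returning 0.0 when a denominator is zero is an assumption of this sketch; Weka's handling of that edge case may differ.

```java
// Hypothetical helper, not part of Weka: precision and recall for one class index.
final class PrecisionRecallSketch {

    // Returns {precision, recall} for targetClass.
    // Each value is 0.0 when its denominator is zero (an assumption of this sketch).
    static double[] precisionRecall(int[] actual, int[] predicted, int targetClass) {
        int tp = 0; // predicted targetClass and actually targetClass
        int fp = 0; // predicted targetClass but actually another class
        int fn = 0; // actually targetClass but predicted as another class
        for (int i = 0; i < actual.length; i++) {
            boolean predictedTarget = predicted[i] == targetClass;
            boolean actualTarget = actual[i] == targetClass;
            if (predictedTarget && actualTarget) {
                tp++;
            } else if (predictedTarget) {
                fp++;
            } else if (actualTarget) {
                fn++;
            }
        }
        double precision = (tp + fp) == 0 ? 0.0 : (double) tp / (tp + fp);
        double recall = (tp + fn) == 0 ? 0.0 : (double) tp / (tp + fn);
        return new double[] {precision, recall};
    }
}
```

Precision answers "of everything predicted as this class, how much was right", while recall answers "of everything that really is this class, how much was found". On imbalanced data, which is where Hellinger trees are typically used, these two numbers for the minority class are usually more telling than overall accuracy.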