I created a decision tree (J48) using the Weka API and Java. First, I train the tree with an ARFF file:
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public static void Tree(String Path) throws Exception { // Path: path to the ARFF file
    J48 tree = new J48(); // new, untrained J48 instance
    DataSource source = new DataSource(Path);
    Instances data = source.getDataSet();
    // Set the class attribute if the data format does not provide this information.
    // For example, the XRFF format saves the class attribute information as well.
    if (data.classIndex() == -1) {
        data.setClassIndex(data.numAttributes() - 1);
    }
    tree.buildClassifier(data); // train the tree on the loaded data
    System.out.println(tree.toString());
}
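The ARFF file I use contains 780 instances. Each instance has six numeric attributes {PT1, w1, d1, PT2, w2, d2} and one class {yes, no}. My code works, and I can see the generated decision tree using: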
System.out.println(tree.toString());
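Now I want to create a new instance (without using another ARFF file) and classify it. Suppose the new instance has the values {50, 5, 800, 74, 3, 760}. The decision tree should then return the corresponding class ("yes" or "no").
I found a solution to my problem, and I hope it will be useful to others.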
//Declaring attributes
Attribute PT1 = new Attribute("PT1");
Attribute w1 = new Attribute("w1");
Attribute d1 = new Attribute("d1");
Attribute PT2 = new Attribute("PT2");
Attribute w2 = new Attribute("w2");
Attribute d2 = new Attribute("d2");
// Declare the class attribute and its two nominal values, "yes" and "no", using FastVector.
// "ScheduledFirst" is the name of the class attribute.
// List the values in the same order as in the training ARFF header, because predictions are reported by index.
FastVector fvClassVal = new FastVector(2);
fvClassVal.addElement("yes");
fvClassVal.addElement("no");
Attribute Class = new Attribute("ScheduledFirst", fvClassVal);
// Declare the feature vector
FastVector fvWekaAttributes = new FastVector(7);
// Add attributes
fvWekaAttributes.addElement(PT1);
fvWekaAttributes.addElement(w1);
fvWekaAttributes.addElement(d1);
fvWekaAttributes.addElement(PT2);
fvWekaAttributes.addElement(w2);
fvWekaAttributes.addElement(d2);
fvWekaAttributes.addElement(Class);
// Declare an Instances object; the new instance needs a dataset header before it can be classified
Instances dataset = new Instances("whatever", fvWekaAttributes, 0);
//Creating a double array and defining values
double[] attValues = new double[dataset.numAttributes()];
attValues[0] = 50;
attValues[1] = 5;
attValues[2] = 800;
attValues[3] = 74;
attValues[4] = 3;
attValues[5] = 760;
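// Create the new instance i1 (Weka 3.6 API; in Weka 3.7 and later, Instance is an interface, so use new DenseInstance(1.0, attValues) instead)
Instance i1 = new Instance(1.0, attValues);
// Add the instance to the dataset; it is stored at index 0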
dataset.add(i1);
//Define class attribute position
dataset.setClassIndex(dataset.numAttributes()-1);
// Prints 0.0 for "yes" and 1.0 for "no": classifyInstance returns the index of the predicted class value as a double.
// tree is the J48 trained above.
System.out.println(tree.classifyInstance(dataset.instance(0)));
// dataset.instance(0) is used because only one instance was added to the dataset; if you add another one, use dataset.instance(1), and so on.
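Since the goal was to get "yes" or "no" back rather than 0.0 or 1.0, the returned index still has to be mapped to its label. With Weka itself this can be done with dataset.classAttribute().value((int) result). The sketch below shows the same mapping in plain Java so it runs without the Weka jar; ClassLabelDecoder and decode are hypothetical names made up for this illustration, and the label list is assumed to be in the same order as the class attribute's values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Weka: maps a predicted class index to its label.
class ClassLabelDecoder {

    // labels must be listed in the same order as the class attribute's values.
    static String decode(double predictedIndex, List<String> labels) {
        int index = (int) predictedIndex;
        // Reject NaN, fractional values, and indices outside the label list.
        if (index != predictedIndex || index < 0 || index >= labels.size()) {
            throw new IllegalArgumentException("Not a valid class index: " + predictedIndex);
        }
        return labels.get(index);
    }

    public static void main(String[] args) {
        // Assumed order, matching fvClassVal in the answer above.
        List<String> labels = Arrays.asList("yes", "no");
        System.out.println(decode(0.0, labels)); // prints "yes"
        System.out.println(decode(1.0, labels)); // prints "no"
    }
}
```

In the answer's code, the value passed as predictedIndex would be the result of tree.classifyInstance(dataset.instance(0)).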