A Comparison of Filter and Wrapper Approaches with Text Mining for
Text Classification
Article type:
Academic article
Category:
Computer Engineering
The main problem in text categorization is the high dimensionality of the feature space. Many
researchers therefore focus on feature selection techniques for representing a document, which in turn increase the
overall efficiency of a classification model. There are two general feature selection approaches: the Filter approach
and the Wrapper approach. For the Filter approach, this study used Information Gain, Gain Ratio and Chi-square;
Chi-square performed best, with an F-measure of 92.2%. For the Wrapper approach, it used a Support Vector
Machine combined with either a Genetic Algorithm (SVMGA) or a Greedy search (SVMGD); Greedy (SVMGD)
performed best, with an F-measure of 94%. Both feature selection approaches employed a Support Vector Machine
with a radial basis function (RBF) kernel as the classifier. Compared by F-measure, the best Wrapper approach
scored 1.8 percentage points higher than the best Filter approach (94% versus 92.2%). In conclusion, this technique
enables researchers to increase the efficiency of a wrapper approach when implemented for information classification.
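
The difference between the two approaches can be illustrated with a minimal sketch. It is not the study's implementation, and it rests on several assumptions: the corpus `DOCS` is invented toy data, and all function names (`chi_square`, `filter_select`, `predict`, `loo_accuracy`, `wrapper_select`) are hypothetical. A simple term-overlap vote stands in for the SVM with an RBF kernel, and leave-one-out accuracy stands in for the F-measure. The abstract does not say which greedy procedure SVMGD uses, so greedy forward selection is assumed here. The filter scores each term by Chi-square without consulting the classifier, while the wrapper adds a term only if it improves the classifier's own score.

```python
from collections import Counter

# Invented toy corpus (assumption): each document is (set of terms, class label).
DOCS = [
    ({"goal", "match", "team"}, "sport"),
    ({"team", "coach", "win"}, "sport"),
    ({"match", "win", "the"}, "sport"),
    ({"stock", "market", "the"}, "finance"),
    ({"bank", "stock", "rate"}, "finance"),
    ({"market", "rate", "the"}, "finance"),
]


def vocabulary(docs):
    """All distinct terms, sorted so that tie-breaking is deterministic."""
    return sorted(set().union(*(terms for terms, _ in docs)))


def chi_square(docs, term, label):
    """Chi-square statistic of the 2x2 term/class contingency table."""
    n = len(docs)
    a = sum(1 for terms, y in docs if term in terms and y == label)
    b = sum(1 for terms, y in docs if term in terms and y != label)
    c = sum(1 for terms, y in docs if term not in terms and y == label)
    d = n - a - b - c
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def filter_select(docs, k):
    """Filter approach: rank terms by Chi-square alone and keep the top k."""
    labels = sorted({y for _, y in docs})
    scores = {t: max(chi_square(docs, t, y) for y in labels)
              for t in vocabulary(docs)}
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def predict(train, terms, selected):
    """Stand-in classifier (NOT an SVM): vote by overlap on selected terms."""
    votes = Counter()
    for other_terms, y in train:
        votes[y] += len(terms & other_terms & selected)
    labels = sorted({y for _, y in train})
    return max(labels, key=lambda y: votes[y])  # ties go to the first label


def loo_accuracy(docs, selected):
    """Leave-one-out accuracy of the stand-in classifier on a term subset."""
    selected = set(selected)
    hits = 0
    for i, (terms, y) in enumerate(docs):
        train = docs[:i] + docs[i + 1:]
        hits += predict(train, terms, selected) == y
    return hits / len(docs)


def wrapper_select(docs, max_features):
    """Wrapper approach: greedy forward selection driven by classifier score."""
    selected, best = [], loo_accuracy(docs, [])
    while len(selected) < max_features:
        best_term = None
        for t in vocabulary(docs):
            if t in selected:
                continue
            score = loo_accuracy(docs, selected + [t])
            if score > best:  # keep only strict improvements
                best_term, best = t, score
        if best_term is None:  # no term improves the score: stop
            break
        selected.append(best_term)
    return selected, best


if __name__ == "__main__":
    chosen = filter_select(DOCS, 3)
    print("filter :", chosen, loo_accuracy(DOCS, chosen))
    print("wrapper:", *wrapper_select(DOCS, 3))
```

On this toy corpus the filter keeps `market`, `match` and `rate` (leave-one-out accuracy 5/6), while the wrapper stops at `match` and `team` (accuracy 1.0). The numbers only illustrate the mechanics and say nothing about the results reported above. The wrapper's extra cost is also visible: it re-evaluates the classifier once per candidate term in every round, whereas the filter computes each score once.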
Date posted: 01/07/2019
