EEDIS Laboratory: Evolutionary Engineering and Distributed Information Systems

Feature Selection based Arabic Text Classification using Different Machine Learning Algorithms: Comparative Study

Authors: BENNABI Sakina Rim, ELBERRICHI Zakaria
Type: Book chapter
Edition: Proceedings of the 10th Intern ISBN:
Link: https://doi.org/10.1145/3447568.3448531
Published on: 04-06-2020

Feature selection is a data pre-processing method widely used when mining large datasets, for example in text classification. Several studies have compared feature selection methods on English corpora, but few works address the Arabic language. This article presents a comparative study of different feature selection techniques, including Chi2, ANOVA and mutual information, applied to an Arabic-language corpus and combined with several machine learning algorithms (Naive Bayes, SVM and KNN). Overall, the experiments show that reducing dimensionality with feature selection techniques only slightly affected text classification performance, while reducing the size of the corpus by up to 1%.
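
To make the Chi2 technique named in the abstract concrete, here is a minimal sketch of chi-square term scoring in plain Python. It is not the authors' implementation. The abstract does not say which formulation or library was used, so the document-frequency form of the statistic, the function names `chi2_scores` and `select_top_k`, and the four-document toy corpus are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def chi2_scores(docs, labels):
    """Score each term by its maximum chi-square statistic over all classes.

    docs   : list of token lists (one list per document)
    labels : list of class labels, one per document
    Assumption: the 2x2 document-frequency form of chi-square is used.
    """
    n = len(docs)
    class_count = Counter(labels)
    df = Counter()                      # term -> documents containing it
    df_in_class = defaultdict(Counter)  # class -> term -> documents containing it
    for tokens, label in zip(docs, labels):
        for term in set(tokens):        # presence only, not term frequency
            df[term] += 1
            df_in_class[label][term] += 1

    scores = {}
    for term, term_df in df.items():
        best = 0.0
        for label, n_c in class_count.items():
            a = df_in_class[label][term]  # in class, contains term
            b = term_df - a               # other classes, contains term
            c = n_c - a                   # in class, lacks term
            d = n - n_c - b               # other classes, lacks term
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:                     # skip degenerate tables
                best = max(best, n * (a * d - c * b) ** 2 / denom)
        scores[term] = best
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus: two sports documents and two economy documents.
docs = [
    ["مباراة", "فريق", "في"],
    ["فريق", "هدف", "في"],
    ["سوق", "سعر", "في"],
    ["سعر", "بنك", "في"],
]
labels = ["sport", "sport", "economy", "economy"]

scores = chi2_scores(docs, labels)
selected = select_top_k(scores, 2)
print(selected)
```

In this toy corpus the function word "في" occurs in every document, so its score is zero and it is dropped, whereas terms that occur only in one class receive the highest scores. The retained terms would then be used to build the reduced document vectors passed to a classifier such as Naive Bayes, SVM or KNN.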
