EEDIS Laboratory: Evolutionary Engineering and Distributed Information Systems


Evaluation and comparison of concept based and n-grams based text clustering using SOM

Authors: Abdelmalek Amine, ELBERRICHI Zakaria, Michel Simonet, Malki Mimoun
Type: International journal
Journal: INFOCOMP Journal of Computer Science
Volume: 7, Issue: 1, Pages: 27-35
Published: 01-03-2008

With the large and rapidly growing number of documents available in digital form (Internet, libraries, CD-ROM…), automatic text classification has become a significant research field and a fundamental task in document processing. This paper deals with the unsupervised classification of textual documents, also called text clustering, using Kohonen's Self-Organizing Maps (SOM) in two new settings: a conceptual representation of texts and a representation based on n-grams, instead of a representation based on words. The effects of these combinations are examined in several experiments using four similarity measures. The Reuters-21578 corpus is used for evaluation, which is carried out using the F-measure and entropy.
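As an illustration of the ingredients named in the abstract, the following minimal Python sketch builds a character n-gram profile of a text, compares two profiles with cosine similarity, and scores a clustering against reference labels with entropy and the F-measure. It is not the authors' implementation: the function names, the choice of character trigrams, the use of cosine as the similarity, and the exact entropy and F-measure formulas (the size-weighted definitions commonly used in clustering evaluation) are all assumptions made for illustration, and the SOM training step itself is omitted.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (assumed here: n=3) in a text."""
    s = " ".join(text.lower().split())  # lowercase and collapse whitespace
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def clustering_entropy(clusters, labels):
    """Size-weighted mean of per-cluster class entropy (lower is better)."""
    n = len(labels)
    total = 0.0
    for c in set(clusters):
        members = [l for k, l in zip(clusters, labels) if k == c]
        counts = Counter(members)
        h = -sum((m / len(members)) * math.log2(m / len(members))
                 for m in counts.values())
        total += len(members) / n * h
    return total


def clustering_f_measure(clusters, labels):
    """For each class, take the best F1 over clusters; weight by class size."""
    n = len(labels)
    total = 0.0
    for cls in set(labels):
        n_cls = labels.count(cls)
        best = 0.0
        for c in set(clusters):
            n_c = clusters.count(c)
            n_both = sum(1 for k, l in zip(clusters, labels)
                         if k == c and l == cls)
            if n_both:
                p, r = n_both / n_c, n_both / n_cls  # precision, recall
                best = max(best, 2 * p * r / (p + r))
        total += n_cls / n * best
    return total
```

In this sketch, `clusters` would hold the map unit assigned to each document and `labels` the reference category of each document; a clustering that matches the categories exactly gives an entropy of 0 and an F-measure of 1.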
