Authors: | Mohammed MERABET |
Type: | International Conference |
Conference name: | Artificial Intelligence and Heuristics for Smart Energy Efficiency in Smart Cities |
Location: | Country: |
Link: | https://link.springer.com/chapter/10.1007/978-3-030-92038-8_68 |
Published on: | 25-11-2021 |
Churn prediction is currently the most common Big Data application in the telecoms industry. However, the classes in this kind of application are typically imbalanced, which makes it a natural setting for exploring techniques for handling imbalanced data. Apache Spark is a Big Data platform designed for fast, distributed processing of massive datasets. To the best of our knowledge, no studies have been conducted on the use of Edited Nearest Neighbors (ENN) for handling imbalanced data in the Spark context. This work therefore investigates the effects of combining several data-balancing techniques with a Support Vector Machine (SVM) classifier. The experiments show that ENN combined with SVM achieves significantly higher AUC and F1-score values than the other techniques.
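To make the ENN idea concrete, here is a minimal single-machine sketch in plain Python. It is not the Spark implementation evaluated in the paper, which the abstract does not describe. It assumes the common variant in which a majority-class sample is dropped when most of its k nearest neighbors carry a different label. The function name `edited_nearest_neighbors`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import dist


def edited_nearest_neighbors(X, y, k=3, majority_only=True):
    """Drop samples whose k nearest neighbors mostly belong to another class.

    Illustrative brute-force version (Euclidean distance, O(n^2) comparisons).
    Use an odd k with two classes to avoid tied votes.
    """
    majority_label = Counter(y).most_common(1)[0][0]
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Assumption: only the majority class is cleaned, so minority
        # samples are always kept when majority_only is True.
        if majority_only and yi != majority_label:
            keep.append(i)
            continue
        # Indices of the k nearest other samples.
        others = (j for j in range(len(X)) if j != i)
        neighbors = sorted(others, key=lambda j: dist(xi, X[j]))[:k]
        votes = Counter(y[j] for j in neighbors)
        # Keep the sample only if the neighborhood vote agrees with its label.
        if votes.most_common(1)[0][0] == yi:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: label 0 is the majority class (non-churners),
# label 1 the minority class (churners).
X_toy = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
y_toy = [0, 0, 0, 0, 0, 1, 1, 1]

X_clean, y_clean = edited_nearest_neighbors(X_toy, y_toy, k=3)
```

In this toy set, the majority-class point `(5, 5)` sits inside the minority cluster, so all three of its nearest neighbors are labeled 1 and it is removed. The cleaned data could then be passed to any classifier, such as an SVM.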