Supervised Learning Algorithm on Unstructured Documents for the Classification of Job Offers: Case of Cameroun

Makembe, Fritz Sosso and Etoundi, Roger Atsa and Tapamo, Hippolyte (2023) Supervised Learning Algorithm on Unstructured Documents for the Classification of Job Offers: Case of Cameroun. Journal of Computer and Communications, 11 (02). pp. 75-88. ISSN 2327-5219

Full text: jcc_2023022415270778.pdf (Published Version, 1MB)

Abstract

Supervised learning algorithms are now frequently used in data science for text classification. African textual data, however, have rarely been studied with these methods. This article describes the particularity of such data and measures the prediction precision of three algorithms, naive Bayes, decision tree, and SVM (Support Vector Machine), on a corpus of IT job offers collected from the internet. This particularity is linked to the data imbalance problem in machine learning. That problem is usually framed in terms of the number of documents in each class or subclass. Here, we take it further and examine the distribution of word counts across a set of documents. The results are compared with those obtained on a set of French IT offers. Classification precision ranges from 88% to 90% for the French offers, against 67% at most for the Cameroonian offers. The contribution of this study is twofold. First, it clearly shows that, within a similar job category, job offers published on the internet in Cameroon are more unstructured than those available in France, for example. Second, it supports a strong hypothesis: sets of texts whose word counts follow a symmetrical distribution obtain better results with supervised learning algorithms.
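The idea of looking at the word count distribution of a document set, instead of only the number of documents per class, can be illustrated with a small sketch. The abstract does not say how symmetry was measured, so the moment coefficient of skewness used here is an assumption, and the function names and toy documents are hypothetical. A value near 0 indicates a symmetrical distribution; a large positive value indicates a few very long documents among many short ones.

```python
from statistics import mean, pstdev


def word_counts(documents):
    """Return the number of whitespace-separated words in each document."""
    return [len(doc.split()) for doc in documents]


def skewness(values):
    """Moment coefficient of skewness (population form).

    Assumption: this is one common symmetry measure, not necessarily
    the one used in the article.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return 0.0  # all values identical: treat as perfectly symmetrical
    n = len(values)
    return sum((v - mu) ** 3 for v in values) / (n * sigma ** 3)


# Hypothetical toy corpora, for illustration only.
balanced = ["a b c", "a b c d", "a b c d e", "a b c d e f", "a b c d e f g"]
lopsided = ["a b", "a b", "a b c", "a b c", "a " * 40]

balanced_skew = skewness(word_counts(balanced))
lopsided_skew = skewness(word_counts(lopsided))
```

On real data, the same two functions could be applied to each class of offers separately, so that the skewness of each class can be set against the classification precision obtained for it.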

Item Type: Article
Subjects: Institute Archives > Medical Science
Depositing User: Managing Editor
Date Deposited: 18 Apr 2023 04:42
Last Modified: 14 Sep 2023 07:45
URI: http://eprint.subtopublish.com/id/eprint/2015
