A Comparative Analysis of Selected Data Mining Algorithms and Programming Languages
DOI:
https://doi.org/10.15584/jetacomps.2024.5.7
Keywords:
Python, R, accuracy, mean square error, data mining algorithms
Abstract
This paper evaluates the performance of ten selected data mining algorithms on classification and regression tasks and compares the effectiveness of two programming languages widely used in data science: Python and R. The algorithms studied were the Naive Bayes classifier, K-Nearest Neighbors (k-NN), Support Vector Machine (SVM), Decision Tree, Random Forest, Gradient Boosting Machine (GBM), Logistic Regression, Linear Regression, Ridge Regression, and LASSO Regression. The study aimed to assess how these algorithms perform on a specific problem, in this case fraud detection. Performance was evaluated using key metrics: accuracy, execution time, the difference between the best and worst results, and mean square error (MSE). Moreover, learning tools such as R and Python enable students not only to perform multidimensional data analysis but also to predict future trends and changes. The ability to work with data, modelling, and visualisation are key competences in many areas of modern life and support accurate business decision-making.
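The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas or the tooling the authors used, so the definitions below are the standard textbook ones, and the function names (`accuracy`, `mean_squared_error`, `timed`, `spread`) are hypothetical helpers introduced only for illustration.

```python
import time


def accuracy(y_true, y_pred):
    """Fraction of predicted labels that equal the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def spread(scores):
    """Difference between the best and worst score across repeated runs."""
    return max(scores) - min(scores)
```

In a comparison of this kind, `timed` would wrap the training and prediction call of each algorithm, `accuracy` would be applied to the classifiers, `mean_squared_error` to the regressors, and `spread` to the scores collected over repeated runs. This is an assumed workflow, not one stated in the abstract.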
Copyright (c) 2024 Journal of Education, Technology and Computer Science
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.