A Comparative Analysis of Selected Data Mining Algorithms and Programming Languages

Author

DOI:

https://doi.org/10.15584/jetacomps.2024.5.7

Keywords:

Python, R, accuracy, mean square error, data mining algorithms

Abstract

This paper evaluates the performance of ten selected data mining algorithms in classification and regression tasks and compares the effectiveness of two popular programming languages used in data science: Python and R. The algorithms included in the study were the Naive Bayes Classifier, K-Nearest Neighbors (k-NN), Support Vector Machine (SVM), Decision Tree, Random Forest, Gradient Boosting Machine (GBM), Logistic Regression, Linear Regression, Ridge Regression, and LASSO Regression. The study aimed to evaluate how the various algorithms perform in classification and regression tasks in the context of a specific problem, in this case fraud detection. The performance of the algorithms was evaluated based on key metrics: accuracy, execution time, the difference between the best and worst results, and mean square error (MSE). Moreover, learning tools such as R and Python enable students not only to perform multidimensional data analysis, but also to predict future trends and changes. Working with data, modelling, and visualisation are key competences in many areas of modern life and support accurate business decisions.

Published

2024-12-19

How to cite

DYMORA, P., MAZUREK, M., & SMYŁA, ŁUKASZ. (2024). A Comparative Analysis of Selected Data Mining Algorithms and Programming Languages. Journal of Education, Technology and Computer Science, 5(35), 69–83. https://doi.org/10.15584/jetacomps.2024.5.7

Issue

Section

SELECTED PROBLEMS OF USING INFORMATION TECHNOLOGY IN EDUCATION
