Kanishchev Ilya Sergeevich (Postgraduate student, Vyatka State University, Kirov)
Missing values are considered one of the biggest challenges faced by machine learning models, and the problem can be exacerbated by imbalanced data. Several methods, such as pattern approximation, have been proposed and compared, but they do not account for the adverse conditions found in real-world databases. This paper compares techniques for classifying records from a real unbalanced database with a large amount of missing data. The main goal is to preprocess the data for recovery and to select completely filled records to which these methods are then applied.
The algorithms compared are clustering, decision trees, artificial neural networks, and a Bayesian classifier. The results support the view that describing the problem and understanding the database are essential steps for a correct comparison of methods on a real problem.
Keywords: missing value recovery, unbalanced data, classification
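The preprocessing step mentioned in the abstract, selecting completely filled records before applying the compared methods, might be sketched as follows. This is only an illustration under stated assumptions, not the author's implementation: the toy records, the field names (`age`, `income`, `label`), and the helper functions are hypothetical, and `None` is assumed to mark a missing value.

```python
from collections import Counter

# Hypothetical toy data: None marks a missing value, "label" is the class.
records = [
    {"age": 34, "income": 52000.0, "label": "A"},
    {"age": None, "income": 41000.0, "label": "A"},
    {"age": 29, "income": None, "label": "B"},
    {"age": 51, "income": 63000.0, "label": "A"},
    {"age": 45, "income": 58000.0, "label": "A"},
    {"age": 38, "income": 47000.0, "label": "B"},
]


def is_complete(record):
    """Return True when no field of the record is missing."""
    return all(value is not None for value in record.values())


def split_by_completeness(rows):
    """Separate completely filled records from those with gaps."""
    complete = [r for r in rows if is_complete(r)]
    incomplete = [r for r in rows if not is_complete(r)]
    return complete, incomplete


def class_shares(rows, label_key="label"):
    """Return the fraction of records per class, to expose imbalance."""
    counts = Counter(r[label_key] for r in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


complete, incomplete = split_by_completeness(records)
shares = class_shares(complete)
```

Here `complete` would serve as the set on which classifiers are trained and compared, `incomplete` holds the records whose values are to be recovered, and `shares` shows how unbalanced the classes remain after filtering.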
Citation link: Kanishchev I. S. Recovering missing values in classification tasks with data imbalances // Современная наука: актуальные проблемы теории и практики. Серия: Естественные и Технические Науки. - 2021. - № 05. - С. 63-66. DOI 10.37882/2223-2966.2021.05.13