TY - GEN
T1 - A Set of Experiments to Consider Data Quality Criteria in Classification Techniques for Data Mining
AU - Espinosa, Roberto
AU - Zubcoff, José
AU - Mazón, Jose Norberto
PY - 2011
Y1 - 2011
N2 - A successful data mining process depends on the data quality of the sources in order to obtain reliable knowledge. Therefore, preprocessing data is required for dealing with data quality criteria. However, preprocessing data has been traditionally seen as a time-consuming and non-trivial task since data quality criteria have to be considered without any guide about how they affect the data mining process. To overcome this situation, in this paper, we propose to analyze the data mining techniques to know the behavior of different data quality criteria on the sources and how they affect the results of the algorithms. To this aim, we have conducted a set of experiments to assess three data quality criteria: completeness, correlation and balance of data. This work is a first step towards considering, in a systematic and structured manner, data quality criteria for supporting and guiding data miners in obtaining reliable knowledge.
AB - A successful data mining process depends on the data quality of the sources in order to obtain reliable knowledge. Therefore, preprocessing data is required for dealing with data quality criteria. However, preprocessing data has been traditionally seen as a time-consuming and non-trivial task since data quality criteria have to be considered without any guide about how they affect the data mining process. To overcome this situation, in this paper, we propose to analyze the data mining techniques to know the behavior of different data quality criteria on the sources and how they affect the results of the algorithms. To this aim, we have conducted a set of experiments to assess three data quality criteria: completeness, correlation and balance of data. This work is a first step towards considering, in a systematic and structured manner, data quality criteria for supporting and guiding data miners in obtaining reliable knowledge.
UR - https://www.scopus.com/pages/publications/79960325851
U2 - 10.1007/978-3-642-21887-3_51
DO - 10.1007/978-3-642-21887-3_51
M3 - Conference contribution
AN - SCOPUS:79960325851
SN - 9783642218866
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 680
EP - 694
BT - Computational Science and Its Applications, ICCSA 2011 - International Conference, Proceedings
T2 - 2011 International Conference on Computational Science and Its Applications, ICCSA 2011
Y2 - 20 June 2011 through 23 June 2011
ER -