A Set of Experiments to Consider Data Quality Criteria in Classification Techniques for Data Mining

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

6 Citations (Scopus)

Abstract

A successful data mining process depends on the data quality of its sources in order to obtain reliable knowledge. Therefore, data must be preprocessed to deal with data quality criteria. However, data preprocessing has traditionally been seen as a time-consuming and non-trivial task, since data quality criteria have to be considered without any guidance on how they affect the data mining process. To overcome this situation, in this paper we propose analyzing data mining techniques to learn how different data quality criteria behave on the sources and how they affect the results of the algorithms. To this end, we have conducted a set of experiments to assess three data quality criteria: completeness, correlation and balance of data. This work is a first step towards considering data quality criteria in a systematic and structured manner, in order to support and guide data miners in obtaining reliable knowledge.
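As a rough illustration of the three criteria named in the abstract, the sketch below computes one simple indicator for each on a toy dataset. The function names, the toy data, and the specific measures (share of non-missing cells, Pearson correlation, minority-to-majority class ratio) are assumptions made for illustration; the abstract does not state how the paper itself measures these criteria.

```python
from collections import Counter
from math import sqrt


def completeness(rows):
    """Fraction of cells that are present (not None). Assumed measure."""
    cells = [value for row in rows for value in row]
    return sum(value is not None for value in cells) / len(cells)


def pearson(xs, ys):
    """Pearson correlation over the pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    dev_x = sqrt(sum((x - mean_x) ** 2 for x, _ in pairs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for _, y in pairs))
    return cov / (dev_x * dev_y)


def balance(labels):
    """Smallest class count divided by largest (1.0 means balanced)."""
    counts = Counter(labels)
    return min(counts.values()) / max(counts.values())


# Hypothetical toy dataset: two numeric attributes and a class label.
rows = [(1.0, 2.0), (2.0, 4.1), (3.0, None), (4.0, 8.2)]
labels = ["yes", "no", "no", "no"]

col_a = [row[0] for row in rows]
col_b = [row[1] for row in rows]

print("completeness:", completeness(rows))
print("correlation: ", pearson(col_a, col_b))
print("balance:     ", balance(labels))
```

Indicators like these could be recorded for each dataset before training a classifier, so that changes in accuracy can be related to changes in the data quality criteria.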

Original language: English
Title of host publication: Computational Science and Its Applications, ICCSA 2011 - International Conference, Proceedings
Pages: 680-694
Number of pages: 15
Edition: PART 2
DOI
Publication status: Published - 2011
Externally published
Event: 2011 International Conference on Computational Science and Its Applications, ICCSA 2011 - Santander, Spain
Duration: 20 Jun 2011 to 23 Jun 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6783 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 2011 International Conference on Computational Science and Its Applications, ICCSA 2011
Country/Territory: Spain
City: Santander
Period: 20/06/11 to 23/06/11
