TY - GEN
T1 - Towards a reverse engineering approach for guiding user in applying data mining
AU - Espinosa, Roberto
AU - Mazón, Jose Norberto
AU - Zubcoff, José
PY - 2011
Y1 - 2011
N2 - Data mining is at the core of the knowledge discovery process. However, an initial preprocessing step is crucial for ensuring reliable results within this process. Preprocessing of data is a time-consuming and non-trivial task, since data quality issues must be considered. This is even worse when dealing with complex data, not only because of the different kinds of complex data types (XML, multimedia, and so on), but also because of the high dimensionality of complex data. Therefore, to overcome this situation, in this position paper we propose using mechanisms based on data reverse engineering for automatically measuring some data quality criteria on the data sources. These measures will guide the user in selecting the most adequate data mining algorithm in the early stages of the knowledge discovery process. Finally, it is worth noting that this work is a first step towards considering, in a systematic and structured manner, data quality criteria for supporting data miners in applying those algorithms that obtain the most reliable knowledge from the available data sources.
AB - Data mining is at the core of the knowledge discovery process. However, an initial preprocessing step is crucial for ensuring reliable results within this process. Preprocessing of data is a time-consuming and non-trivial task, since data quality issues must be considered. This is even worse when dealing with complex data, not only because of the different kinds of complex data types (XML, multimedia, and so on), but also because of the high dimensionality of complex data. Therefore, to overcome this situation, in this position paper we propose using mechanisms based on data reverse engineering for automatically measuring some data quality criteria on the data sources. These measures will guide the user in selecting the most adequate data mining algorithm in the early stages of the knowledge discovery process. Finally, it is worth noting that this work is a first step towards considering, in a systematic and structured manner, data quality criteria for supporting data miners in applying those algorithms that obtain the most reliable knowledge from the available data sources.
UR - https://www.scopus.com/pages/publications/84873900202
M3 - Conference contribution
AN - SCOPUS:84873900202
SN - 9788497494861
T3 - Actas de las 16th Jornadas de Ingenieria del Software y Bases de Datos, JISBD 2011
SP - 23
EP - 28
BT - Actas de las 16th Jornadas de Ingenieria del Software y Bases de Datos, JISBD 2011
T2 - 16th Conference on Software Engineering and Databases, JISBD 2011
Y2 - 5 September 2011 through 7 September 2011
ER -