18th International Database Engineering & Applications Symposium

IDEAS '14 will be hosted by the Instituto Superior de Engenharia do Porto (ISEP)

Porto, Portugal
July 7-9, 2014



Track: Data Preparation for Data Mining


Current technological developments allow the collection of huge amounts of data that can be used to support decision-making processes. However, this is only possible if data can be transformed into knowledge.


Various kinds of data mining algorithms are used to extract patterns from data. Pattern extraction tasks include classification (rules or trees), regression, clustering, association, sequence modeling, dependency modeling, and so forth. However, much of the work in the field of data mining assumes that high-quality data is available, whereas real-world data is often incomplete, noisy, or inconsistent, which is an obstacle to efficient data analysis and mining. Other challenges include big data (number of features/examples, efficiency, parallel processing), the curse of dimensionality, and the use of domain knowledge. Although most mining algorithms have some procedures for dealing with dirty data, they lack robustness. Furthermore, low-quality data leads to low-quality analysis and mining results (garbage in, garbage out). Data preparation techniques, when applied before mining, can substantially improve the overall quality of the data and consequently improve the mining results and/or reduce the time required for the actual mining process. Thus, the development of data preparation techniques is both a challenging and a critical task. This special session on Data Preparation for Data Mining will address practical techniques and methodologies of data preparation for data-mining applications.



  • Data collection
  • Data integration
  • Data reduction
  • Data cleaning
  • Detection of outliers
  • Data/Information quality
  • Data profiling
  • Data enrichment
  • Feature selection and transformation
  • Data summarization
  • Data discretization
  • Data encoding
  • Sampling
  • Data preparation for regression/classification
  • Data preparation for segmentation/clustering
  • Data preparation for association rules
  • Data preparation for text mining
  • Data preparation for web mining
  • Data preparation for visual data mining
  • Data preparation for temporal and spatial data mining
  • Data preparation for multimedia mining (audio/video)























Submission and Important Dates

Deadlines and paper submission instructions are the same as on the main page. Please select the appropriate track (session developer) when submitting a paper. Please read the guides and look at the samples given in the right margin.


Conference Publication

The accepted papers from this track will be included in the IDEAS14 proceedings, which will be published by ACM; the ISBN assigned by ACM to IDEAS14 is 978-1-4503-2627-8.
A version of the proceedings to be distributed to the conference attendees will be prepared by BytePress.


Track Organizing Committee

  • Pedro Henriques, Universidade do Minho
  • Fátima Rodrigues, Instituto Superior de Engenharia do Porto
  • Paulo Oliveira, Instituto Superior de Engenharia do Porto
  • Alberto Freitas, Faculdade de Medicina da Universidade do Porto






Program Committee

  • Ismael Caballero, University of Castilla-La Mancha
  • Cinzia Cappiello, Politecnico di Milano
  • Maribel Yasmina Santos, Universidade do Minho
  • Pedro Henriques, Universidade do Minho
  • Fátima Rodrigues, Instituto Superior de Engenharia do Porto
  • Paulo Oliveira, Instituto Superior de Engenharia do Porto
  • Alberto Freitas, Universidade do Porto
  • Pedro Pereira Rodrigues, Universidade do Porto













Author/User FAQ
Author Help
Grace Period
Embedding Fonts


Copyright Form
Latex article.cls
Paper Review: tex
Paper Review: pdf
ACM class
ACM Latex file
ACM bib file
Paper Final: tex
Paper Final: A4
Paper Final: US Ltr


Keio University
The Database Society of Japan