A Brief Introduction to Data Mining Techniques

This publication aims to introduce students to basic tools for analyzing large amounts of data. The first section defines Data Mining and Knowledge Discovery in Databases, explains the most common techniques, and lists the main operational applications. The second section illustrates the three phases that precede the application of Data Mining techniques: Selection/Sampling, Pre-processing/Cleaning, and Transformation/Reduction of data. These preliminary data analysis techniques are essential because the results of Data Mining models depend on the correctness of the data. The third section presents some applications of these methodologies. Here the technical aspect is given less weight than the operational one, since the aim is to explain how the techniques are used; nevertheless, the most common Data Mining models are listed and explained. The fourth section is devoted to Text Mining and Web Mining, two methodologies used to analyze texts and websites. It presents the main problems related to textual analysis and the techniques that can be used to obtain effective searches. Finally, two appendices have been added: the Statistical Appendix reports some technical insights that may be useful for understanding Data Mining systems, and the Short Glossary collects the main Data Mining terms used in the text.


Greta Falavigna


CNR Ircres