Data Mining Definition
A definition of data mining that is generally accepted across the wide variety of big data communities is hard to find. One reason for the lack of a clear, precise definition is that its methods and techniques often overlap with those of other fields such as machine learning, signal processing, statistics, or, more generally, artificial intelligence. The application of machine learning methods to large datasets can be called data mining. The analogy to real mining is that a large volume of earth and raw material is extracted from a mine. This large quantity of raw material is then processed and yields a much smaller amount of precious material.
Similarly, data mining is defined as processing large quantities of data in order to create a simple yet valuable data model. In the context of machine learning applications, the value often lies in high predictive accuracy. This, however, points to a significant terminology issue: the result of data mining is a model rather than plain extracted data. In other words, the data often needs to be transformed into an understandable or more suitable data structure for further use, such as a specific machine learning model. Data extraction, on the other hand, is usually considered part of feature selection and feature extraction and does not yet yield a full model. These observations again show how tightly data mining is intertwined with other fields such as machine learning, which makes it hard to define.
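The mining analogy can be made concrete with a small sketch. The example below is only an illustration under stated assumptions: the synthetic dataset, the choice of a simple linear model, and the helper names `mine_linear_model` and `predict` are hypothetical and not taken from any particular data mining toolkit. It shows a large number of raw records being reduced to a model with only two parameters, which can then be used for prediction.

```python
import random


def mine_linear_model(points):
    """Reduce a list of (x, y) pairs to (slope, intercept) via least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the compact model to a new input value."""
    slope, intercept = model
    return slope * x + intercept


# Assumption: synthetic "raw material" that follows y = 3x + 2 plus noise.
rng = random.Random(0)
raw_data = []
for _ in range(100_000):
    x = rng.uniform(0.0, 10.0)
    y = 3.0 * x + 2.0 + rng.gauss(0.0, 1.0)
    raw_data.append((x, y))

# The "precious material": a model with just two numbers.
model = mine_linear_model(raw_data)

print(len(raw_data), "raw records reduced to", len(model), "model parameters")
print("prediction at x = 4.0:", predict(model, 4.0))
```

The raw records themselves are not part of the result; only the fitted parameters are kept, which mirrors the distinction between a model and plain extracted data.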
Data Mining Definition Details
The following video provides more insights into the topic: