Data Mining Tasks – Overview
In this article, we are going to learn about data mining tasks and their categories.
By Palkesh Jain Last updated : April 17, 2023
Overview
Data mining functionalities specify the kinds of patterns that a data mining task is meant to discover. Data mining is widely applied to characterizing and forecasting data in big-data settings.
Data Mining Tasks Categories
Data mining tasks fall into two broad categories: descriptive and predictive.
1. Descriptive Data Mining
Descriptive data mining summarizes the data and gives insight into what is going on inside it, without any prior hypothesis. It brings out the common characteristics of the data.
2. Predictive Data Mining
Predictive data mining lets users infer the values of features that are not directly available. For example, it can project market performance for the coming quarters from the results of previous quarters. In general, predictive analysis forecasts or infers unknown features from data that is already available. For instance, an illness can be diagnosed from the findings in a patient's medical records.
Key Data Mining Tasks
1. Characterization and Discrimination
- Data Characterization: Data characterization summarizes the general characteristics of the objects in a target class and produces what are called characteristic rules.
A database query usually retrieves the data for the user-specified class, and a summarization component then describes that data at various levels of abstraction.
E.g., the results can be presented as bar charts, curves, and pie charts.
- Data Discrimination: Data discrimination compares the general characteristics of objects in the target class with those of objects in one or more contrasting classes and produces what are called discriminant rules.
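As a rough illustration of the two ideas above, the following sketch summarizes one class and then compares it with a contrasting class. The customer records, the class names ("frequent", "occasional"), and the helper functions are hypothetical and chosen only for this example; real characterization would typically work on database query results at several abstraction levels.

```python
from statistics import mean

# Hypothetical customer records: (class label, age, yearly spend).
records = [
    ("frequent", 30, 1200.0),
    ("frequent", 40, 1500.0),
    ("frequent", 35, 900.0),
    ("occasional", 50, 300.0),
    ("occasional", 60, 200.0),
    ("occasional", 55, 400.0),
]

def characterize(rows, target):
    """Summarize the general characteristics of one class."""
    selected = [r for r in rows if r[0] == target]
    return {
        "count": len(selected),
        "mean_age": mean(r[1] for r in selected),
        "mean_spend": mean(r[2] for r in selected),
    }

def discriminate(rows, target, contrast):
    """Difference between the target class summary and a contrasting class."""
    t = characterize(rows, target)
    c = characterize(rows, contrast)
    return {key: t[key] - c[key] for key in ("mean_age", "mean_spend")}

target_summary = characterize(records, "frequent")
difference = discriminate(records, "frequent", "occasional")
print(target_summary)
print(difference)
```

The first result plays the role of a characterization of the target class, and the second shows how the target class differs from the contrasting class.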
2. Prediction
Prediction estimates values that are unavailable. Regression analysis is typically used to predict missing or unavailable numeric values, while classification is used when the class label is missing. Prediction is widely used because of its importance in business intelligence. There are thus two ways of predicting data: predicting the class label with a previously built class model, and predicting missing or unavailable data values with regression analysis.
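A minimal sketch of regression-based prediction, assuming a single numeric input and a straight-line relationship. The data points and the `fit_line` helper are hypothetical; ordinary least squares is used here as one common form of regression, not as the only option.

```python
# Hypothetical observations: (advertising spend, sales) per quarter.
known = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

def fit_line(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(known)

# Estimate the unavailable sales value for a spend of 5.0.
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```

The fitted line is then used to fill in a numeric value that was not observed.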
3. Classification
Classification builds a model of predefined classes, and the model is then used to classify new instances whose class is unknown. The instances used to build the model are known as training data. The classification process can produce a decision tree or a set of classification rules, which can then be used to classify future data, for example by predicting the likely salary band of an employee from the salaries of similar employees in the company.
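The idea can be sketched with the simplest possible decision tree, a single-threshold rule (often called a decision stump). The training data, the "low"/"high" salary bands, and the function names are assumptions made for this example only.

```python
# Hypothetical training data: (years of experience, salary band).
training = [(1, "low"), (2, "low"), (3, "low"),
            (6, "high"), (8, "high"), (10, "high")]

def classify(threshold, value):
    """Apply the learned rule: above the threshold means 'high'."""
    return "high" if value > threshold else "low"

def train_stump(rows):
    """Choose the threshold that misclassifies the fewest training rows."""
    values = sorted({v for v, _ in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum(classify(threshold, v) != label for v, label in rows)

    return min(candidates, key=errors)

threshold = train_stump(training)

# Classify new employees whose salary band is unknown.
print(threshold, classify(threshold, 7), classify(threshold, 2))
```

The model is learned from labeled training data and then applied to instances that carry no label.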
Read more: Classification and Prediction in Data Mining
4. Association Analysis
Association analysis discovers relationships among data and the rules that bind them, associating two or more attributes of the data. It links attributes that frequently occur together in transactions. The results are called association rules and are commonly used in market basket analysis. Two measures characterize an association rule. One is confidence, which estimates how likely the associated items are to occur together given that the first is present, and the other is support, which indicates how often the association has occurred in the past.
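The two measures can be computed directly from a list of transactions. The sketch below uses a hypothetical set of shopping baskets; the `support` and `confidence` helpers are illustrative names, not part of any particular library.

```python
# Hypothetical market baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Support of both sides divided by support of the antecedent."""
    both = support(set(antecedent) | set(consequent), baskets)
    return both / support(antecedent, baskets)

# Rule under test: bread => milk
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

Here bread and milk appear together in 2 of 4 baskets, and milk appears in 2 of the 3 baskets that contain bread.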
5. Outlier Analysis
Data objects that cannot be grouped into a given class or cluster are outliers. They are also referred to as anomalies or surprises, and they are important to identify.
Although outliers are treated as noise and discarded in some applications, in other domains they can reveal useful information, so analyzing them can be very valuable.
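As one simple way to flag such values, the sketch below uses a z-score rule rather than a cluster-based test: a value is reported when it lies more than a chosen number of standard deviations from the mean. The readings and the threshold of 2.0 are assumptions for illustration.

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 95]
outliers = zscore_outliers(readings)
print(outliers)
```

Whether the flagged value is discarded as noise or studied further depends on the application.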
Read more: Outlier Analysis
6. Cluster Analysis
Clustering is the organization of data into groups. Unlike classification, however, the class labels are unknown in clustering, and it is up to the clustering algorithm to find suitable classes. Clustering is often called unsupervised classification because the grouping is not guided by given class labels. Many clustering methods are based on the principle of maximizing the similarity between objects of the same class (intra-class similarity) and minimizing the similarity between objects of different classes (inter-class similarity).
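A minimal sketch of this principle, using k-means on one-dimensional data as an example of a clustering method. The values, the starting centers, and the fixed iteration count are assumptions; practical implementations choose initial centers more carefully and stop when the assignments no longer change.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest center, then recompute the centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled data with two visible groups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(values, centers=[1.0, 2.0])
print(centers)
print(clusters)
```

No class labels are supplied; the groups emerge from the distances between the values alone.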
7. Evolution & Deviation Analysis
Evolution and deviation analysis uncovers patterns and changes in behavior over time. With this kind of analysis, we can find features such as time-series trends, periodicity, and similarities between trends. It is applied in many fields, from space science to retail marketing.
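One basic way to expose a trend in a time series is a moving average, sketched below. The monthly figures and the window size of 3 are hypothetical, and this is only one of many techniques used in evolution analysis.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

# Hypothetical monthly sales with small fluctuations around a rising trend.
monthly_sales = [10, 12, 14, 13, 15, 17, 16, 18, 20]
trend = moving_average(monthly_sales, window=3)

# The raw series dips in places, but the smoothed trend rises steadily.
is_rising = all(later > earlier for earlier, later in zip(trend, trend[1:]))
print(trend, is_rising)
```

Smoothing removes short-term fluctuations so that the longer-term change in behavior becomes visible.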