What is data mining?

Irfanabdulla1111
Posts: 60
Joined: Mon Dec 23, 2024 3:45 am

Post by Irfanabdulla1111 »

Data mining is the science or methodology of exploiting data to build models that describe it, find patterns, establish groupings, and classify, segment or associate products, clients or any other entity of interest. The knowledge obtained is then applied to new cases, producing immediate, actionable responses.
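
As a rough illustration of "associating products", here is a minimal sketch that counts how often pairs of products appear together in the same purchase. The function name pair_support and the sample baskets are made up for this example; real association-rule mining uses more elaborate algorithms.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return, for each pair of products, the fraction of baskets containing both."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket,
        # always in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}


# Hypothetical purchase data, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["eggs"],
]
support = pair_support(baskets)
print(support[("bread", "milk")])
```

Pairs with high support are candidates for "products that go together"; what counts as high depends on the business and the data.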

Review of certain concepts
In the analytical world, several terms are often confused today and need to be distinguished.

Each of them offers and requires different things, or responds to different needs.

Business Intelligence
Business Intelligence can be understood as the set of methodologies, practices or capabilities aimed at the generation and management of information that enables decision-making by an organization's users.

The following quote captures the idea that business intelligence aims to promote:

“What is not defined cannot be measured. What cannot be measured cannot be improved. What is not improved always degrades.”

— William Thomson Kelvin

Some technologies that are part of business intelligence are:

Data warehouse.
Reporting.
OLAP analysis.
Dashboards.
Business intelligence lets us answer questions such as:

What happened?
How did it happen?
How often?
What's the problem?
What should we do?
Therefore, it allows us to understand the past and present situation of the organization.

Before it can be exploited, the data goes through prior treatment processes that give it more structure and make it suitable for analysis.

This is a difference from the databases of operational systems, where the data can be less structured.

This is why ETL (Extract, Transform and Load) processes are responsible for organizing the data on its way from the operational systems to its storage in the Data Warehouse.
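
To make the three ETL steps concrete, here is a minimal sketch using only Python's standard library. Everything in it is an assumption for illustration: the sample CSV export, the fact_orders table and the cleaning rules are invented, and an in-memory SQLite database stands in for a real Data Warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical export from an operational system (illustrative values only).
RAW_ORDERS = """order_id,customer,amount,date
1, alice ,10.50,2024-01-03
2,BOB,7.25,2024-01-03
3,alice,,2024-01-04
"""


def extract(text):
    """Extract: read the raw rows as dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise customer names, convert types, skip rows with no amount."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # incomplete row, not suitable for analysis
        clean.append((
            int(row["order_id"]),
            row["customer"].strip().lower(),
            float(row["amount"]),
            row["date"],
        ))
    return clean


def load(rows, conn):
    """Load: store the cleaned rows in a table of the (stand-in) warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl(text=RAW_ORDERS):
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn


conn = run_etl()
print(conn.execute("SELECT customer, amount FROM fact_orders ORDER BY order_id").fetchall())
```

Once the data is loaded in this cleaner form, reporting questions such as "how often?" become simple queries over the warehouse table.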

Today, there are still organizations that do not address this work and therefore lack the ability to analyze what is happening in their organization and, in turn, to decide what measures to implement.

Business Analytics
In this case, we can understand business analytics as the methodologies, techniques or systems that, based on past data, allow us to predict future behaviors or events by discovering patterns.

The data used may be structured or semi-structured and more complex, so the treatment prior to its exploitation requires greater attention.
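
As a very small example of "predicting from the past", the sketch below fits a straight line to past monthly sales by ordinary least squares and extrapolates one month ahead. The figures and function names are invented for illustration, and real business analytics would normally use richer models and a statistics library.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at a new point."""
    return slope * x + intercept


# Hypothetical sales for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_line(months, sales)
print(predict(slope, intercept, 6))  # forecast for month 6
```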

Big Data
Finally, Big Data provides an environment adapted to deal with the 3 "Vs":

Volume.
Velocity (speed).
Variety.

Let us look at each one of them.

1.- Volume

We find storage technologies with sharding or distributed storage management to accommodate a large volume, and file systems such as HDFS (Hadoop Distributed File System) that store data across a cluster and support processing with the MapReduce method.
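
The idea behind MapReduce can be sketched in a few lines on a single machine. This is only an illustration of the map, shuffle and reduce phases using a word count; it does not use Hadoop's API, and the function names are chosen for this example. In a real cluster each phase would run in parallel over blocks of data stored in HDFS.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


print(word_count(["big data", "Big volume"]))
```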

2.- Speed

As we have previously mentioned, data generation rates are increasingly high and, therefore, we must have an environment with high loading and processing speed.

3.- Variety

Finally, the data can be structured, with elementary atomic values; unstructured, such as text, images, audio or video; or semi-structured, falling somewhere in between.
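
To make the three forms more tangible, here is a small sketch with one made-up record of each kind and the sort of handling each tends to need. The field names and helper functions are assumptions for this example.

```python
import json

# Structured: fixed fields with atomic values, like a row in a table.
structured_row = {"customer_id": 42, "country": "ES", "amount": 19.99}

# Semi-structured: self-describing, but nested and with optional fields (JSON here).
semi_structured = '{"customer_id": 42, "tags": ["vip"], "address": {"city": "Madrid"}}'

# Unstructured: free text with no predefined fields.
unstructured = "The customer called to say the delivery arrived late but the product works fine."


def city_from_json(payload):
    """Parse the JSON and navigate it, tolerating a missing address."""
    record = json.loads(payload)
    return record.get("address", {}).get("city")


def mentions(text, keyword):
    """Crude keyword search, the simplest way to query free text."""
    return keyword.lower() in text.lower()


print(structured_row["amount"])         # direct access by field name
print(city_from_json(semi_structured))  # needs parsing and navigation
print(mentions(unstructured, "late"))   # needs text processing
```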

There is a certain consensus about the existence of a fourth V, Veracity, which consists of checking and ensuring the truthfulness and trustworthiness of the data we consume, and which is considered a requirement for obtaining adequate information.
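
In practice, part of veracity comes down to validating records before using them. A minimal sketch follows, assuming hypothetical fields customer_id and amount and two invented rules; real checks depend entirely on the data and the business.

```python
def validate(record):
    """Return a list of problems found in one record (an empty list means it passed)."""
    problems = []
    if not record.get("customer_id"):
        problems.append("missing customer_id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        problems.append("invalid amount")
    return problems


# Illustrative records only.
records = [
    {"customer_id": 42, "amount": 19.99},
    {"customer_id": None, "amount": -5},
]
for record in records:
    print(record, validate(record))
```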