
UNIT 3 Computer Applications

DATA MINING

Find the answers to these questions in the following text.
1 What tool is often used in data mining?
2 What AI method is used for the following processes?
a. Separate data into subsets and then analyse the subsets to divide them into further subsets for a number of levels.
b. Continually analyse and compare data until patterns emerge.
c. Divide data into groups based on similar features or limited data ranges.
3 What term is used for the patterns found by neural networks?
4 When are clusters used in data mining?
5 What types of data storage can be used in data mining?
6 What can an analyst do to improve the data mining results?
7 Name some of the ways in which data mining is currently used.

Data mining is simply filtering through large amounts of raw data for useful information that gives businesses a competitive edge. This information is made up of meaningful patterns and trends that are already in the data but were previously unseen.
The most popular tool used when mining is artificial intelligence (AI). AI technologies try to work the way the human brain works, by making intelligent guesses, learning by example, and using deductive reasoning. Some of the more popular AI methods used in data mining include neural networks, clustering, and decision trees.
Neural networks look at the rules of using data, which are based on the connections found or on a sample set of data. As a result, the software continually analyses value and compares it to the other factors, and it compares these factors repeatedly until it finds patterns emerging. These patterns are known as rules. The software then looks for other patterns based on these rules or sends out an alarm when a trigger value is hit.
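The idea of learning by example and raising an alarm at a trigger value can be sketched with a single perceptron, the simplest neural network unit. This is only an illustration: the text does not name a specific algorithm, and the features (large amount, new customer), the labels, and the trigger of zero are all hypothetical.

```python
def train_perceptron(examples, epochs=20, rate=1):
    """Learn weights from (features, label) examples by repeated comparison."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for features, label in examples:
            score = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if score > 0 else 0
            error = label - predicted  # 0 when the guess was right
            # Nudge the weights towards the correct answer.
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias


def alarm(features, weights, bias):
    """Return True when the score passes the trigger value (assumed to be 0)."""
    return sum(w * x for w, x in zip(weights, features)) + bias > 0


# Hypothetical examples: (large_amount, new_customer) -> suspicious (1) or not (0)
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(examples)
print(alarm((1, 1), weights, bias))
print(alarm((1, 0), weights, bias))
```

The learned weights play the role of the "rules" mentioned in the text; real neural networks combine many such units.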
Clustering divides data into groups based on
similar features or limited data ranges. Clusters
are used when data isn't labelled in a way that is
favourable to mining. For instance, an insurance
company that wants to find instances of fraud
wouldn't have its records labelled as fraudulent
or not fraudulent. But after analysing patterns
within clusters, the mining software can start to
figure out the rules that point to which claims are
likely to be false.
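Grouping unlabelled data by similar values can be sketched as follows. The text does not say which clustering algorithm is used, so this example assumes a basic one-dimensional k-means; the claim amounts are invented for illustration.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Group numbers into k clusters (k must be at least 2)."""
    ordered = sorted(values)
    # Start with k centres spread evenly across the sorted values.
    step = (len(ordered) - 1) / (k - 1)
    centres = [ordered[round(i * step)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Put each value in the cluster whose centre is nearest.
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return clusters


# Hypothetical insurance claim amounts, none labelled as fraudulent or not
claims = [120, 150, 130, 9800, 140, 10200, 125]
groups = kmeans_1d(claims, k=2)
print(groups)
```

Here the two unusually large claims end up in their own cluster, which an analyst could then examine more closely.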
Decision trees, like clusters, separate the data into
subsets and then analyse the subsets to divide
them into further subsets, and so on (for a few
more levels). The final subsets are then small
enough that the mining process can find
interesting patterns and relationships within the
data.
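The repeated splitting into smaller subsets can be sketched as a recursive function. This is a simplified assumption: it splits on one field per level at a median-like cut point, whereas real decision-tree tools choose each split with a statistical measure. The field names and records are hypothetical.

```python
def split_into_subsets(rows, fields, max_depth=3, min_size=2, depth=0):
    """Recursively split rows on one field per level; return the final subsets."""
    if depth >= max_depth or depth >= len(fields) or len(rows) <= min_size:
        return [rows]
    field = fields[depth]
    values = sorted(r[field] for r in rows)
    threshold = values[len(values) // 2]  # median-like cut point
    low = [r for r in rows if r[field] < threshold]
    high = [r for r in rows if r[field] >= threshold]
    if not low or not high:  # the split separated nothing, so stop here
        return [rows]
    return (split_into_subsets(low, fields, max_depth, min_size, depth + 1)
            + split_into_subsets(high, fields, max_depth, min_size, depth + 1))


# Hypothetical records
records = [
    {"age": 25, "claim": 100}, {"age": 30, "claim": 900},
    {"age": 45, "claim": 150}, {"age": 50, "claim": 950},
    {"age": 28, "claim": 120}, {"age": 48, "claim": 980},
]
leaves = split_into_subsets(records, fields=["age", "claim"])
for leaf in leaves:
    print(leaf)
```

The records are first divided by age and then each half is divided again by claim amount, giving four small subsets.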
Once the data to be mined is identified, it should
be cleansed. Cleansing data frees it from duplicate
information and erroneous data. Next, the data
should be stored in a uniform format within
relevant categories or fields. Mining tools can
work with all types of data storage, from large data
warehouses to smaller desktop databases to flat
files. Data warehouses and data
marts are storage methods that involve archiving large amounts of data in a way that makes it easy to access when necessary.
When the process is complete, the mining software generates a report. An analyst goes over the report to see if further work needs to be done, such as refining parameters, using other data analysis tools to examine the data, or even scrapping the data if it's unusable. If no further work is required, the report proceeds to the decision makers for appropriate action.
The power of data mining is being used for many purposes, such as analysing Supreme Court decisions, discovering patterns in health care, pulling stories about competitors from newswires, resolving bottlenecks in production processes, and analysing sequences in the human genetic makeup. There really is no limit to the type of business or area of study where data mining can be beneficial.

[Adapted from 'Data Mining for Golden Opportunities', Smart Computing Guide Series Volume 8 Issue 1, January 2000]

You must first have data to mine. Data stores include one or several databases or data warehouses.
Data must be stored in a consistent format and free from errors and redundancies.
Actual mining occurs when data is combed for patterns and trends. Rules for patterns are noted.
Someone must analyse mining results for validity and relevance.
The mining results can then be reviewed and interpreted, and a plan of action determined.

Re-read the text to find the answers to these questions.
1 Match the terms in Table A with the statements in Table B.
Table A
a. Data mining
b. AI
c. Cleansed data
d. Data warehouse
Table B
i. Storage method of archiving large amounts of data to make it easy to access
ii. Data free from duplicate and erroneous information
iii. A process of filtering through large amounts of raw data for useful information
iv. A computing tool that tries to operate in a way similar to the human brain
2 Mark the following as True or False:
a. Data mining is a process of analysing known patterns in data.
b. Artificial intelligence is commonly used in data mining.
c. In data mining, patterns found while analysing data are used for further analysing the data.
d. Data mining is used to detect false insurance claims.
e. Data mining is only useful for a limited range of problems.
3 Complete the following description of the data mining process using words from the text:
Large amounts of data stored in data ______ are often used for data ______. The data is first ______ to remove ______ information and errors. The ______ is then analysed using a tool such as ______. An analysis report is then analysed by an ______ who decides if the ______ need to be refined, other data ______ tools need to be used, or if the results need to be discarded because they are ______. The analyst passes the final results to the ______ makers who decide on the ______ action.

Answers

B1:
a- iii b- iv c- ii d- i
B2:
a- False
b- True
c- True
d- True
e- False
 
