
ST402 – Statistical Data Mining

Assignment 01

1. Define Data Mining.

Data mining is the process of turning raw data into useful information, typically with specialized
software. It depends on effective data collection, warehousing, and computer processing.

2. What are the different names used for Data Mining?

Data mining is also known as Knowledge Discovery in Databases (KDD). Other names used for it include:

• data retrieval
• data analytics
• data extraction
• data analysis

3. What kinds of job/research opportunities are available in the field of Data Mining, and what is the
salary range for these opportunities?

Job                  Salary
Data Manager         $111,250 – $186,000
Data Architect       $119,750 – $193,550
Big Data Engineer    $130,000 – $222,000
Data Scientist       $105,750 – $180,250
Data Analyst         $83,750 – $142,250

4. What are the skills expected from a competent person for a Data Mining job/research opportunity?
Separately list down knowledge expected in theoretical areas and practical knowledge in tools.

1. Computer Science Skills

• Programming/statistics languages: R, Python, C++, Java, Matlab, SQL, SAS, shell etc.
• Big data processing frameworks: Hadoop, Storm, Samza, Spark, Flink
• Operating System: Linux
• Database knowledge: Relational Databases & Non-Relational Databases

2. Statistics & Algorithm Skills

• Basic Statistics Knowledge: Probability, Probability Distribution, Correlation, Regression, Linear Algebra, Stochastic Process…
• Data Structures & Algorithms
• Machine Learning/Deep Learning Algorithms
• Natural Language Processing

3. Others

• Project Experience
• Communication & Presentation Skills

6. Give 3 examples of data mining applications.

1. Data Mining Examples in Finance

• Loan Payment Prediction
• Targeted Marketing
• Detect Financial Crimes

2. Applications of Data Mining in Marketing

• Forecasting Market
• Anomaly Detection
• System Security

3. Examples of Data Mining Applications in Healthcare

• Healthcare Management
• Effective Treatments
• Fraudulent and Abusive Data

7. Briefly explain two data mining processes and why they are important in data mining applications.

• Data Mining Process in Oracle DBMS

A relational database management system (RDBMS) represents data in tables with rows and columns. The
data is accessed by writing database queries.

Relational database management systems such as Oracle support data mining using CRISP-DM
(Cross-Industry Standard Process for Data Mining). The facilities of the Oracle database are useful for
data preparation and data understanding. Oracle supports data mining through a Java interface, a PL/SQL
interface, automated data mining, SQL functions, and graphical user interfaces.
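
The idea of storing data in tables and preparing it with queries can be sketched as follows. This is only an illustration: it uses Python's standard-library sqlite3 module as a stand-in for Oracle so that it runs anywhere, and it does not use Oracle's data mining interfaces. The loans table, its columns, and the sample rows are hypothetical.

```python
import sqlite3


def prepare_loan_summary(rows):
    """Load raw rows into a relational table and summarize them with SQL.

    Assumption: each row is (customer, amount, repaid), where repaid is
    1 if the loan was paid back and 0 otherwise. The schema is made up
    for this example.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE loans (customer TEXT, amount REAL, repaid INTEGER)"
    )
    conn.executemany("INSERT INTO loans VALUES (?, ?, ?)", rows)

    # Data preparation step: one row per customer with the number of
    # loans and the share of loans that were repaid.
    cursor = conn.execute(
        "SELECT customer, COUNT(*) AS n_loans, AVG(repaid) AS repay_rate "
        "FROM loans GROUP BY customer ORDER BY customer"
    )
    result = cursor.fetchall()
    conn.close()
    return result


raw = [("alice", 1000.0, 1), ("alice", 500.0, 0), ("bob", 2000.0, 1)]
summary = prepare_loan_summary(raw)
for customer, n_loans, repay_rate in summary:
    print(customer, n_loans, repay_rate)
```

A per-customer summary like this is the kind of prepared table that a later modelling step, such as loan payment prediction, would take as input.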
