
Archdiocese of Tuguegarao
LYCEUM OF APARRI
Aparri, Cagayan

The College of Information and Computing Sciences
BS Computer Science (Level 3 Accredited)
BS Information Technology

Pretest Examination
IT 414-Data Scalability and Analytics
Name: _____________________________________________ Score: __________________
Course/Year:______________________________________ Items: 60
1. What is true about Data Visualization?
A. Data Visualization is used to communicate information clearly and efficiently to users by the usage of
information graphics such as tables and charts.
B. Data Visualization helps users in analyzing a large amount of data in a simpler way.
C. Data Visualization makes complex data more accessible, understandable, and usable.
D. All of the above
2. Which of the following is an advantage of data visualization?
A. It can be accessed quickly by a wider audience.
B. It can misrepresent information.
C. It can be distracting.
D. None of the above
3. The goal of descriptive analysis is to describe or summarize a set of data.
4. Which of the following is a disadvantage of data visualization?
A. It conveys a lot of information in a small space.
B. It makes your report more visually appealing.
C. Visual data can be distorted or excessively used.
D. None of the above
5. Data visualization is also an element of the broader _____________.
A. deliver presentation architecture B. data presentation architecture
C. dataset presentation architecture D. data process architecture
6. Which of the following intricate techniques is not used for data visualization?
A. Bullet Graphs B. Bubble Clouds
C. Fever Maps D. Heat Maps
7. Which method shows hierarchical data in a nested format?
A. Treemaps B. Scatter plots
C. Population pyramids D. Area charts
8. Which one of the following is the most basic and commonly used technique?
A. Line charts B. Scatter plots
C. Population pyramids D. Area charts
9. Which function is used for inference on one proportion using the normal approximation?
A. fisher.test() B. chisq.test()
C. Lm.test() D. prop.test()
10. Which is used to query and edit graphical settings?
A. anova() B. par()
C. plot() D. cum()
11. Which is used to find the factor congruence coefficients?
A. factor.mosaicplot B. factor.xyplot
C. factor.congruence D. factor.cumsum
12. Which of the following functions makes a vector of repeated values?
A. rep() B. data()
C. view() D. read()
13. Which of the following is a tool for checking normality?
A. qqline() B. qline()
C. anova() D. lm()
14. Which of the following calls the lower-level functions?
A. lm() B. col.max
C. par D. histo
15. Which of the following is false?
A. Data visualization includes the ability to absorb information quickly.
B. Data visualization is another form of visual art.
C. Data visualization decreases insights and leads to slower decisions.
D. None of the above
16. Which of the following lists names of variables in a data.frame?
A. par() B. names() C. barchart() D. quantile()
17. Common use cases for data visualization include?
A. Politics B. Sales and marketing C. Healthcare D. All of them
18. Which of the following statements is true?
A. Scientific visualization is sometimes referred to in shorthand as SciVis.
B. Healthcare professionals frequently use choropleth maps to visualize important health data.
C. Candlestick charts are used as trading tools and help finance professionals analyze price movements over time
D. All of the above
19. Which of the following plots is often used for checking randomness in time series?
A. Autocausation B. Autorank C. Autocorrelation D. None of them
20. ________ is used for density plots.
A. par B. lm C. kde D. C
21. What is true about data mining?
A. Data Mining is defined as the procedure of extracting information from huge sets of data
B. Data mining also involves other processes such as Data Cleaning, Data Integration, Data Transformation
C. Data mining is the procedure of mining knowledge from data.
D. All of the above
22. Which of the following is a correct application of data mining?
A. Market Analysis and Management B. Corporate Analysis & Risk Management
C. Fraud Detection D. All of the above
23. How many categories of functions are involved in Data Mining?
A. 2 B. 3 C. 4 D. 5
24. In Data Characterization, the class under study is called the?
A. Study Class B. Initial Class C. Target Class D. Final Class
25. The mapping or classification of a class with some predefined group or class is known as?
A. Data Characterization B. Data Discrimination C. Data Set D. Data Sub Structure
26. A sequence of patterns that occur frequently is known as?
A. Frequent Item Set B. Frequent Subsequence
C. Frequent Sub Structure D. All of the above
27. The analysis performed to uncover interesting statistical correlations between associated-attribute-value pairs is known as?
A. Mining of Association B. Mining of Clusters
C. Mining of Correlations D. None of the above
28. __________ refers to the description and modeling of regularities or trends for objects whose behavior changes over time.
A. Outlier Analysis B. Evolution Analysis
C. Prediction D. Classification
29. __________ may be defined as the data objects that do not comply with the general behavior or model of the
data available.
A. Outlier Analysis B. Evolution Analysis
C. Prediction D. Classification
30. Pattern evaluation issue comes under?
A. Mining Methodology and User Interaction Issues B. Performance Issues
C. Diverse Data Types Issues D. None of the above
31. The "Efficiency and scalability of data mining algorithms" issue comes under?
A. Mining Methodology and User Interaction Issues B. Performance Issues
C. Diverse Data Types Issues D. None of the above
32. "Handling of relational and complex types of data" issue comes under?
A. Mining Methodology and User Interaction Issues B. Performance Issues
C. Diverse Data Types Issues D. None of the above
33. To integrate heterogeneous databases, how many approaches are there in Data Warehousing?
A. 2 B. 3 C. 4 D. 5
34. Which of the following is a correct disadvantage of the Query-Driven Approach in Data Warehousing?
A. The Query Driven Approach needs complex integration and filtering processes.
B. It is very inefficient and very expensive for frequent queries.
C. This approach is expensive for queries that require aggregations.
D. All of the above
35. Which of the following is a correct advantage of the Update-Driven Approach in Data Warehousing?
A. This approach provides high performance.
B. The data can be copied, processed, integrated, annotated, summarized and restructured in the semantic data
store in advance.
C. Both A and B
D. None of the above
36. The first step involved in knowledge discovery is?
A. Data Integration B. Data Selection
C. Data Transformation D. Data Cleaning
37. What is the use of data cleaning?
A. To remove the noisy data B. To correct the inconsistencies in data
C. Transformations to correct the wrong data D. All of the above
38. In which step of Knowledge Discovery are multiple data sources combined?
A. Data Cleaning B. Data Integration
C. Data Selection D. Data Transformation
39. Data Mining System Classification consists of?
A. Database Technology B. Machine Learning
C. Information Science D. All of them
40. DMQL stands for?
A. Data Mining Query Language B. Dataset Mining Query Language
C. DBMiner Query Language D. Data Marts Query Language
41. Data Analytics uses ___ to get insights from data.
A. Statistical figures B. Statistical Methods C. Miner D. None of them

42. The branch of statistics which deals with the development of statistical methods is classified as ___.
A. Industry statistics B. Applied statistics
C. Economic statistics D. None of the mentioned above
43. Linear Regression is the supervised machine learning model in which the model finds the best fit ___ between the
independent and dependent variable.
A. Linear line B. Nonlinear line C. Curved line D. All of the mentioned above
44. Which of the following is/are types of Linear Regression?
A. Simple Linear Regression B. Multiple Linear Regression
C. Both A and B D. None of the mentioned above
45. Which of the following is/are true about regression analysis?
A. Describes associations within the data B. Modeling relationships within the data
C. Answering yes/no questions about the data D. All of them
46. Linear regression analysis is used to predict the value of a variable based on the value of another variable.
A. True B. False C. Yes/No D. All of the mentioned above
47. A Linear Regression model's main aim is to find the best fit linear line and the ___ of intercept and coefficients
such that the error is minimized.
A. Optimal values B. Linear line C. Linear polynomial D. None of them
48. Error is the difference between the actual value and Predicted value and the goal is to reduce this difference?
A. True B. False C. Linear polynomial D. None of the mentioned above
49. The process of quantifying data is referred to as ___.
A. Decoding B. Structure C. Enumeration D. Coding
50. Text Analytics is also referred to as Text Mining.
A. True B. False C. Enumeration D. Coding
51. Which of the following refers to the problem of finding abstracted patterns (or structures) in the unlabeled data?
A. Supervised learning B. Unsupervised learning C. Hybrid learning D. Reinforcement learning
52. Which one of the following refers to querying the unstructured textual data?
A. Information access B. Information update C. Information retrieval D. Information manipulation
53. Which of the following can be considered as the correct process of Data Mining?
A. Infrastructure, Exploration, Analysis, Interpretation, Exploitation
B. Exploration, Infrastructure, Analysis, Interpretation, Exploitation
C. Exploration, Infrastructure, Interpretation, Analysis, Exploitation
D. Exploration, Infrastructure, Analysis, Exploitation, Interpretation
54. Which of the following is an essential process in which intelligent methods are applied to extract data patterns?
A. Warehousing B. Data Mining C. Text Mining D. Data Selection
55. What is KDD in data mining?
A. Knowledge Discovery Database B. Knowledge Discovery Data
C. Knowledge Data Definition D. Knowledge Data House
56. Adaptive system management refers to:
A. The science of making machines perform tasks that would require intelligence when performed by humans.
B. A computational procedure that takes some values as input and produces some values as the output.
C. The use of machine learning techniques, in which programs learn from their past experience and adapt themselves
to new conditions or situations.
D. All of the above
57. For what purpose do analysis tools pre-compute summaries of huge amounts of data?
A. In order to maintain consistency B. For authentication
C. For data access D. To obtain the query response
58. What are the functions of Data Mining?
A. Association and correlation analysis, classification B. Cluster analysis and evolution analysis
C. Prediction and characterization D. All of them
59. In the following diagram, which type of clustering is used?

A. Hierarchical B. Naive Bayes C. Partitional D. None of them

60. Which of the following statements is incorrect about hierarchical clustering?
A. The hierarchical type of clustering is also known as HCA
B. The choice of an appropriate metric can influence the shape of the cluster
C. In general, the splits and merges both are determined in a greedy manner
D. All of the above

Prepared by: Charyann A. Gumarang Checked by: JENEL M. SAMORTIN


Approved by: Dr. Evelyn Pascua, Ph.D., CESO III

President for Academics
