
Data analysis:

1. Q: What is the difference between univariate and multivariate analysis?

A: Univariate analysis focuses on analyzing a single variable at a time, while multivariate analysis
examines the relationships between multiple variables simultaneously to uncover more complex
patterns and insights.
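
As a minimal sketch of the distinction, the snippet below summarizes one variable on its own (univariate) and then measures how two variables move together (the simplest multivariate case). The data and the names `hours`, `scores`, and `pearson` are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

# Hypothetical data: hours studied and exam scores for five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 58.0, 65.0, 70.0, 80.0]

# Univariate analysis: describe a single variable by itself.
score_mean = mean(scores)
score_sd = stdev(scores)

# Multivariate (here bivariate) analysis: Pearson correlation between
# two variables, computed from its textbook definition.
def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

r = pearson(hours, scores)
print(f"mean score = {score_mean}, sd = {score_sd:.2f}, correlation = {r:.3f}")
```

The mean and standard deviation say nothing about `hours`; only the correlation captures the relationship between the two variables.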

2. Q: What is the purpose of data normalization in data analysis?

A: Data normalization is used to transform variables into a common scale, allowing for fair comparisons
and preventing variables with larger magnitudes from dominating the analysis. It is particularly useful
when variables have different units or scales.
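
Two common ways to put variables on a common scale are min-max scaling and z-score standardization. The sketch below assumes plain Python lists and uses hypothetical function names (`min_max_scale`, `z_score`) and made-up values.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Center values on 0 and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical variables measured in very different units.
incomes = [30000, 45000, 60000, 90000]
ages = [25, 35, 45, 55]

print(min_max_scale(incomes))
print(min_max_scale(ages))
print(z_score(ages))
```

After scaling, both `incomes` and `ages` lie in the same range, so neither dominates a distance or a weighted sum simply because of its units.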

3. Q: What is the role of data sampling in data analysis?

A: Data sampling involves selecting a subset of data from a larger population. It is often done to save
computational resources or reduce analysis time. When the sample is drawn so that it is representative
(for example, at random), conclusions from it can be generalized without analyzing the entire dataset.
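
A simple random sample can be sketched as follows. The population here is a hypothetical list of the integers 1 to 10,000, and the seed is fixed only to make the run reproducible.

```python
import random
from statistics import mean

# Hypothetical population: the integers 1..10000.
population = list(range(1, 10001))

# A seeded generator keeps the example reproducible.
rng = random.Random(42)

# Simple random sample without replacement.
sample = rng.sample(population, 500)

population_mean = mean(population)
sample_mean = mean(sample)
print(f"population mean = {population_mean}, sample mean = {sample_mean}")
```

The sample mean is an estimate of the population mean obtained from 5% of the data; how close it lands depends on the sample size and the spread of the population.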

4. Q: What is the difference between supervised and unsupervised learning in machine learning?

A: Supervised learning involves training a model using labeled data to make predictions or
classifications, while unsupervised learning aims to find patterns or structures in unlabeled data without
explicit target variables.
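
The contrast can be illustrated with two toy models on one-dimensional data: a nearest-centroid classifier that learns from labels, and a two-cluster k-means that sees only the values. Both are deliberately simplified sketches, and the data, labels, and function names are hypothetical.

```python
from statistics import mean

# Hypothetical labeled data: (value, label) pairs.
labeled = [(1.0, "small"), (1.5, "small"), (2.0, "small"),
           (8.0, "large"), (9.0, "large"), (10.0, "large")]

# Supervised: use the labels to learn one centroid per class.
def fit_centroids(pairs):
    groups = {}
    for x, label in pairs:
        groups.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in groups.items()}

def predict(centroids, x):
    # Assign the label whose centroid is closest to x.
    return min(centroids, key=lambda label: abs(x - centroids[label]))

centroids = fit_centroids(labeled)
prediction = predict(centroids, 7.0)

# Unsupervised: no labels, only values; find two clusters (1-D k-means, k=2).
def two_means(xs, iterations=10):
    c0, c1 = min(xs), max(xs)  # simple initialization at the extremes
    for _ in range(iterations):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        if not g0 or not g1:
            break
        c0, c1 = mean(g0), mean(g1)
    return sorted([c0, c1])

unlabeled = [x for x, _ in labeled]
clusters = two_means(unlabeled)
print(prediction, clusters)
```

The supervised model returns a label it was taught; the unsupervised one returns cluster centers that still need to be interpreted by the analyst.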

5. Q: What is the concept of overfitting in machine learning?

A: Overfitting occurs when a machine learning model is too complex for the amount of training data and
fits the noise in that data instead of learning generalizable patterns. The model then scores well on the
training set but performs poorly when it is applied to new, unseen data.
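
One way to see overfitting is to compare training error with error on held-out data. In the sketch below, a model that memorizes the training set (a 1-nearest-neighbor lookup) is compared with a least-squares line. The data are hypothetical: the true relation is y = 2x, and hand-picked noise is added to the training targets only.

```python
# Hypothetical data: true relation y = 2x, noisy training targets.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [1.5, 0.5, 5.5, 4.5, 9.5, 8.5]
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [2 * x for x in test_x]

def memorizer(x):
    """Overly flexible model: return the target of the nearest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

slope, intercept = fit_line(train_x, train_y)

def line(x):
    return slope * x + intercept

def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

for name, model in [("memorizer", memorizer), ("line", line)]:
    print(name, "train:", mse(model, train_x, train_y),
          "test:", mse(model, test_x, test_y))
```

The memorizer reproduces the training targets exactly, noise included, yet does worse than the simpler line on the held-out points. That gap between training and test error is the usual symptom of overfitting.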

6. Q: What is the importance of feature selection in machine learning?

A: Feature selection involves identifying the most relevant and informative variables in a dataset for
building accurate machine learning models. It helps reduce dimensionality, improve model
interpretability, and prevent overfitting.
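
Feature selection can be done in many ways. The sketch below shows one simple filter approach, assumed here only for illustration: rank features by the absolute Pearson correlation with the target and keep the top k. The feature names and values are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical housing data: three candidate features and a target.
features = {
    "size": [50, 60, 80, 100, 120],
    "rooms": [2, 2, 3, 4, 4],
    "noise": [3, 1, 4, 1, 5],
}
price = [150, 180, 240, 310, 360]

def select_top_k(features, target, k):
    """Keep the k features most strongly correlated with the target."""
    ranked = sorted(features,
                    key=lambda name: abs(pearson(features[name], target)),
                    reverse=True)
    return ranked[:k]

selected = select_top_k(features, price, k=2)
print(selected)
```

A correlation filter only captures linear, one-feature-at-a-time relationships, so it is a starting point rather than a complete selection procedure.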

7. Q: What is the difference between data mining and data analysis?

A: Data mining refers to the process of discovering patterns, relationships, or insights from large
datasets using automated methods, while data analysis encompasses a broader range of techniques and
approaches for examining and interpreting data.

8. Q: What is time series analysis?

A: Time series analysis involves studying data collected over time to understand patterns, trends, and
dependencies. It is commonly used in forecasting future values based on historical patterns or detecting
anomalies in temporal data.
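
Two of the simplest time series tools are a moving average, which smooths the series and can serve as a naive forecast, and a threshold check against the recent average, which flags anomalies. The sketch below uses hypothetical series, window size, and threshold.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

def flag_anomalies(series, window, threshold):
    """Indices whose value differs from the mean of the previous
    `window` values by more than `threshold`."""
    flagged = []
    for i in range(window, len(series)):
        recent_mean = sum(series[i - window:i]) / window
        if abs(series[i] - recent_mean) > threshold:
            flagged.append(i)
    return flagged

# Hypothetical series with a gentle upward trend.
series = [10, 12, 11, 13, 12, 14, 13, 15]
smoothed = moving_average(series, window=3)

# Naive forecast: the next value is assumed to equal the latest moving average.
forecast = smoothed[-1]

# Hypothetical series containing one spike at index 4.
spiky = [10, 12, 11, 13, 40, 12, 14]
anomalies = flag_anomalies(spiky, window=3, threshold=10)

print(smoothed, forecast, anomalies)
```

Real forecasting methods also model trend and seasonality explicitly; the moving average is only the baseline they are usually compared against.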

9. Q: What are some common data preprocessing techniques in data analysis?

A: Common data preprocessing techniques include handling missing data, handling outliers,
transforming variables, encoding categorical variables, and scaling or standardizing variables to ensure
compatibility and improve analysis results.
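
Two of these steps, filling in missing values and encoding a categorical variable, can be sketched with plain dictionaries. The records, column names, and helper functions below are hypothetical, and median imputation is just one of several possible strategies.

```python
from statistics import median

# Hypothetical records with a missing age and a categorical city column.
rows = [
    {"age": 25, "city": "Paris"},
    {"age": None, "city": "Lima"},
    {"age": 40, "city": "Paris"},
    {"age": 31, "city": "Oslo"},
]

def impute_median(rows, column):
    """Replace missing (None) values with the median of the observed ones."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]}
            for r in rows]

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new_row[f"{column}={c}"] = 1 if r[column] == c else 0
        encoded.append(new_row)
    return encoded

cleaned = one_hot(impute_median(rows, "age"), "city")
for row in cleaned:
    print(row)
```

Scaling or standardizing the numeric columns would typically follow as a further step once missing values and categories have been handled.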

10. Q: What is data-driven decision-making?

A: Data-driven decision-making is an approach that relies on analyzing and interpreting data to guide
and support the decision-making process. It emphasizes the use of evidence-based insights rather than
solely relying on intuition or subjective judgment.
