EDA Module 1

ENGINEERING DATA ANALYSIS
Data Collection is a systematic way of gathering and measuring information on different groups of people. The
data collected can be used for research, hypothesis testing, and other intended purposes.
Definition of Terms:
Data - collection of facts.
Population - the entire pool from which a statistical sample is drawn
TYPES OF DATA
1. Quantitative - sets of data in numerical form, can be either counted or measured.
Discrete Data - data that can be "counted" (e.g. No. of Pencils, No. of People)
Continuous Data - data that can be "measured" (e.g. Height, Weight, Temperature)
2. Qualitative - sets of data that describe characteristics and classifications
Binary Data - falls under two mutually exclusive categories (e.g. right/wrong, true/false)
Nominal Data - named categories with no specific rank or order (e.g. blue/red/green)
Ordinal Data - categories with specific rank or natural order (e.g. short, medium, tall)
Sampling is the selection of a subset of individuals from within a statistical population to estimate characteristics of the
whole population.
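As a minimal sketch of sampling, the Python snippet below draws a simple random sample and uses it to estimate the population mean. The population here is hypothetical (simulated heights in cm), and the names population, sample, and the seed value are assumptions made only for illustration.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated heights (cm) of 1000 individuals
population = [random.gauss(170, 10) for _ in range(1000)]

# Simple random sample of 50 individuals, drawn without replacement
sample = random.sample(population, 50)

# The sample mean is used as an estimate of the population mean
population_mean = mean(population)
sample_mean = mean(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```

The two means will usually be close but not equal; that difference is the sampling variability that statistics helps us reason about.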
Experimentation is the collection of data in a more controlled manner. One example is the data you collect as a result
of your laboratory experiments. Note that experimentation is not limited to the laboratory. Many
companies use experimentation to test their hypotheses.
Statistics
The science of gathering, analyzing, interpreting, and presenting data. It helps us make decisions and draw conclusions in the
presence of variability.
Statistical analysis is used in order to gain an understanding of a larger population by analyzing the information of a
sample. Statistical analysis allows inferences to be drawn about target markets, consumer cohorts and the general
population by expanding findings appropriately to predict the behavior and characteristics of the many based on the few.
Data analysis is the process of inspecting, presenting and reporting data in a way that is useful to non-technical people.
Because data is next to useless if it can’t be understood by the decision-makers who need to use it, data analysts act as
translators between the numbers and figures and the people who need to know about them.
Data is the raw material of statistical analysis. It can be combined from various sources to support the
analysis.
Data Analysis Process Steps
•Data Collection - is the process of gathering and measuring information on targeted variables in an established system
•Data Preparation - is the act of manipulating raw data into a form that can readily and accurately be analyzed
•Data Exploration - is the first step of data analysis, used to explore and visualize data to uncover early insights
or identify areas or patterns to examine further.
•Data Modelling - is the process of creating a data model for the data to be stored in a database. This data model is a
conceptual representation of data objects, the associations between different data objects, and the rules that apply to them.
•Result Interpretation - is the process of reviewing data through some predefined processes which will help assign some
meaning to the data and arrive at a relevant conclusion.
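The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw readings are made-up temperature values, and the names prepare, explore, and Summary are hypothetical helpers, not part of any standard procedure. The Summary class stands in for a very small data model.

```python
from dataclasses import dataclass
from statistics import mean

# Data collection: hypothetical raw temperature readings (deg C) stored as text
raw_readings = ["21.5", "22.0", "", "23.1", "n/a", "22.4"]

# Data preparation: keep only the entries that can be converted to numbers
def prepare(raw):
    cleaned = []
    for item in raw:
        try:
            cleaned.append(float(item))
        except ValueError:
            continue  # skip blank or non-numeric entries
    return cleaned

# Data modelling (very simplified): a structure describing what we store
@dataclass
class Summary:
    count: int
    minimum: float
    maximum: float
    average: float

# Data exploration: compute basic descriptive values
def explore(values):
    return Summary(len(values), min(values), max(values), mean(values))

cleaned = prepare(raw_readings)
summary = explore(cleaned)

# Result interpretation: report the findings in plain language
print(f"{summary.count} valid readings, ranging from {summary.minimum} "
      f"to {summary.maximum}, with an average of {summary.average:.2f}")
```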
Population — the whole: a collection of persons, objects, or items under study

Sample — a portion of the whole: a subset of the population
Census — process of gathering data from the entire population for a measurement of interest.
3 BASIC METHODS OF COLLECTING DATA
Retrospective Study – uses historical data. It uses either all or a sample of the historical
process data archived over some period of time.
Observational Study - observes the process or population, disturbing it as little as possible, and
records the quantities of interest.
•In this type of study, the researcher does not assign choices; he/she simply observes them
•Applying and manipulating conditions to see their effect on the outcome is either not possible or not ethical
Designed Experiment - makes deliberate or purposeful changes in the controllable variables of
the system or process, observes the resulting system output data, and then makes an inference or decision about which
variables are responsible for the observed changes in output performance.
An experiment that can result in different outcomes, even though it is repeated in the same manner every time, is called
a random experiment.
To model and analyze a random experiment, we must understand the set of possible outcomes from the experiment.
The set of all possible outcomes of a random experiment is called the sample space of the experiment. The sample space
is denoted as S.
A sample space is discrete if it consists of a finite or countably infinite set of outcomes.
A sample space is continuous if it contains an interval (either finite or infinite) of real numbers.
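A short Python sketch may help tell the two kinds of sample space apart. The coin-toss experiment and the component-lifetime experiment below are assumed examples, and in_lifetime_sample_space is a hypothetical helper name. A discrete sample space can be listed outcome by outcome; a continuous one cannot, so it is described by a membership rule instead.

```python
from itertools import product

# Discrete sample space: tossing a coin twice (H = heads, T = tails)
coin = ["H", "T"]
S = set(product(coin, repeat=2))
print(sorted(S))  # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]
print(len(S))     # 4

# Continuous sample space: lifetime of a component in hours.
# S = {x : x >= 0} is an interval of real numbers, so we test membership
# with a rule rather than listing the outcomes.
def in_lifetime_sample_space(x):
    return x >= 0

print(in_lifetime_sample_space(1250.75))  # True
print(in_lifetime_sample_space(-3.0))     # False
```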
Union of two events is the event that consists of all outcomes that are contained in either of the two events.
Intersection of two events is the event that consists of all outcomes that are contained in both of the two events.
Complement of an event is the set of outcomes in the sample space that are not in the event.
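These three operations map directly onto Python's built-in set type. The sketch below assumes a single roll of a six-sided die as the random experiment; the events A and B are chosen only for illustration.

```python
# Sample space for one roll of a six-sided die
S = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}  # event: the roll is even
B = {4, 5, 6}  # event: the roll is greater than 3

union = A | B          # outcomes in A or B (or both)
intersection = A & B   # outcomes in both A and B
complement_A = S - A   # outcomes in S that are not in A

print(sorted(union))         # [2, 4, 5, 6]
print(sorted(intersection))  # [4, 6]
print(sorted(complement_A))  # [1, 3, 5]
```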
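A permutation of the elements is an ordered sequence of elements. The number of permutations of r elements taken from n distinct elements is n! / (n - r)!.
Circular Permutation is an arrangement of n distinct objects around a fixed circle. The total number of such arrangements is (n - 1)!.
A combination is a subset of r elements selected from a set of n elements, without regard to order. The number of combinations is n! / (r! (n - r)!).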
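The counting rules above can be checked with Python's standard library (math.perm and math.comb require Python 3.8 or later). The four items and the choice of r = 2 are assumed values for illustration only.

```python
import math
from itertools import combinations, permutations

items = ["A", "B", "C", "D"]
n = len(items)  # 4 distinct objects
r = 2           # number of objects selected each time

# Permutations: ordered selections of r objects, n! / (n - r)!
num_permutations = math.perm(n, r)
print(num_permutations)                 # 12
print(len(list(permutations(items, r))))  # 12, confirmed by enumeration

# Circular permutations: arrangements of n objects around a circle, (n - 1)!
num_circular = math.factorial(n - 1)
print(num_circular)                     # 6

# Combinations: unordered selections of r objects, n! / (r! (n - r)!)
num_combinations = math.comb(n, r)
print(num_combinations)                 # 6
print(len(list(combinations(items, r))))  # 6, confirmed by enumeration
```

Order matters for permutations, so ("A", "B") and ("B", "A") count as two outcomes; for combinations they count as one, which is why 12 drops to 6.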
