Math 212 Module 1


Anastacio G. Pantaleon, Jr.

– Engineering Department, Surigao del Sur State University – Bislig Campus

Module 1: What is data?


The objective of this module is to understand the following:

1. Definition of Data
2. Kinds of Data
3. Variability
4. Populations and Samples
5. Importance of Reliability
6. Metrology
7. Computer-Assisted Statistical Analyses

Definition of Data

The term "data" refers to items that are known or presumed facts and numbers that may be used to
draw conclusions. Data, in general, is unprocessed information and can be either qualitative or
quantitative. Its source might range from gossip to the culmination of detailed inquiry and
investigation. The reported items can be descriptive, numerical, or a mix of the two. The move from
data to knowledge may be described by the hierarchical sequence in the figure below.

In most cases, data must be converted into information by some analysis. Typically, a model is needed to
analyze numerical data and yield knowledge about a certain topic of interest. Data may also be collected,
evaluated, and used to put a model of an issue to the test. Data is frequently collected to serve as a
foundation for decision-making or to back up a choice that has already been made. Unbiased evidence is
required for an objective judgment, but unbiasedness should never be assumed. A method employed for the
latter purpose may be more biased than one used for the former, in the sense that the collecting,
accumulation, or production process may be prejudiced, ignoring other alternative bits of information.
Bias can be unintentional or deliberate. Unintentional bias can arise from presumptions and even from
prior false information believed to be justified.

To the degree feasible, data providers must offer all important information that may influence the data's
use. They are frequently in the best position to give such background knowledge, and they may be its
sole source. No data should be used for any purpose until its trustworthiness has been established.
Unevaluated data is virtually worthless, no matter how appealing it may appear, and the urge to use it
should be resisted. Users of data must be able to assess any data they consume, or rely on reputable
sources to provide such an assessment for them.

Kinds of Data

Some data is classed as "soft": it is generally qualitative and frequently uses words as the primary
way of transmitting information, in the form of labels, descriptors, or category assignments. The
findings of opinion surveys, although they can be expressed quantitatively, give soft data. Numerical
data can be categorized as "hard" data, but it should be noted that, as previously said, it can also
have a soft underbelly.

Module 1 of Math 212 - Engineering Data Analysis Page 1 of 5



Natural Data
Natural data is defined as data that describes natural events, as opposed to data obtained by
experimentation. Observations of natural phenomena have formed the foundation for scientific theories
and principles, and the desire to collect better and more precise observations has spurred advancements
in scientific apparatus and methods. Physical science owes its origins to natural science, which sparked
the creation of statistics to better comprehend nature's variability. Experimental investigations of
natural processes sparked the creation of the discipline of experimental design and planning. The
distinction between physical and natural science is no longer sharp: the latter increasingly employs a
wide range of physical measurement techniques, which are amenable to the data-assessment methodologies
outlined previously.

Studies of environmental problems can be compared to studies of natural occurrences, since the observer
is largely a passive participant. On the other hand, the observer has control over the sampling aspects
and should use it wisely to acquire useful data.

Experimental Data
Experimental data result from a measurement process in which some property is measured for
characterization purposes. The data obtained consist of numbers that often provide a basis for a decision.
This can range from discarding the data, modifying it by excluding some point or points, or using it alone
or in connection with other data in a decision process.
1. Counting Data and Enumeration. Some data are the results of counting. The number obtained is
   exact if no errors are made; as a consequence, multiple observers are likely to arrive at the
   same result. Exceptions occur when a decision must be made about what to count and what
   constitutes a legitimate event or object to be counted. Any counting procedure should be
   referred to as enumeration rather than measurement. Counting radioactive disintegrations is a
   unique and frequently used counting technique. The tallied events (for example, disintegrations)
   follow statistical principles that are well known to practitioners and therefore will not be
   addressed here. Experimental factors such as the geometric relationship between sample and
   counter and the efficiency of the detector can affect the findings. These, in combination
   with sampling, bring variability and sources of bias into the data in the same way that other
   forms of measurement do, and may be assessed accordingly.
2. Discrete Data. Discrete data consist of numbers that have a limited range of potential values, with
   just a few specific values occurring inside that range. For example, a die's faces are numbered
   from one to six, and no other value can be recorded when a certain face occurs. Numerical
   quantities can be produced by mathematical processes or by measurements. The former are subject
   to the rules of significant figures, whereas the latter are subject to statistical significance.
   For computational or tabulation purposes, discrete quantities such as trigonometric functions,
   logarithms, and the value of π can be rounded off to any number of figures. The uncertainty in
   such figures is attributable only to rounding, and it is not the same as measurement uncertainty.
   In computations, discrete numbers should be used rounded consistently with the experimental data
   to which they relate, so that rounding does not add substantial inaccuracy to the computed result.
3. Continuous Data. Continuous data are generally produced by measurement. The final digit recorded
   reflects observational limitations rather than rounding in the usual sense of the word. It is
   theoretically conceivable to observe a weight of exactly 1.000050...0 grams, although it is
   unlikely. Due to measurement error and rounding, a reported result of 1.000050 may be questionable
   in the final place. By definition, the value of the kilogram (the world's standard of mass) kept at
   the International Bureau in Paris is exactly 1.000...0 kg; all other mass standards are assigned
   values that carry some uncertainty.
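
The rounding rule for discrete constants can be illustrated with a short sketch. The diameter value and the number of figures retained are hypothetical, chosen only to show that rounding π consistently with the data adds negligible error:

```python
import math

# A measured diameter known to 4 significant figures (hypothetical value).
diameter = 2.543

# Circumference computed with full-precision pi, and with pi rounded to a
# number of figures comparable to the measured data.
circumference_full = math.pi * diameter
circumference_rounded = round(math.pi, 5) * diameter

# The discrepancy introduced by rounding pi is far below a measurement
# uncertainty of about 0.001 in the diameter.
difference = abs(circumference_full - circumference_rounded)
print(difference < 1e-4)  # True
```

Rounding the constant to far fewer figures than the data would, by contrast, inject an error comparable to or larger than the measurement uncertainty.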

Variability
Variability is unavoidable in a measuring procedure. Each time a procedure is applied to a measuring
scenario, it can be expected to produce a slightly different number or collection of numbers. The means
of groups of numbers will also differ, but to a lesser extent than individual values.
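
A quick simulation makes this concrete; the true value, scatter, and group size below are hypothetical, chosen only to show that means of groups vary less than individual measurements:

```python
import random
import statistics

random.seed(0)

# Simulate 1000 individual measurements of a quantity whose true value
# is 10.0, with random measurement scatter (hypothetical parameters).
values = [random.gauss(10.0, 0.5) for _ in range(1000)]

# Group the measurements into sets of 10 and take each group's mean.
group_means = [statistics.mean(values[i:i + 10]) for i in range(0, 1000, 10)]

# The spread of the group means is much smaller than that of the
# individuals; theory predicts a reduction by a factor of sqrt(10).
print(statistics.stdev(values))       # roughly 0.5
print(statistics.stdev(group_means))  # roughly 0.16
```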

Natural variability and instability must be distinguished. Gross instability can result from a variety of
factors, including a lack of process control. Variability can be introduced by failing to regulate
operations that generate bias. Thus, any variation in calibration, which is done to reduce bias, might
result in variation in measured outcomes. The deliberate effort to control sources of bias and
variability is what makes a measuring procedure successful. Measurement procedures have been shown to
improve substantially with attentive and systematic effort; negligence and intermittent attention to
detail, on the other hand, can lead to a loss of precision and accuracy. Measurement must also consider
practical concerns, settling for precision and accuracy that are just "good enough" owing to cost-benefit
considerations in all but the most extreme circumstances. Chemical analysis has advanced to a high level
of precision and accuracy and in the related performance characteristics of selectivity, sensitivity,
and detection.

The inevitability of variability makes data interpretation and use more difficult. Many applications
demand a level of data quality that can be challenging to attain. Every measuring scenario necessitates
minimal quality criteria (sometimes called data quality objectives). These criteria should be set in
advance, and both the producer and the consumer should be able to tell whether they have been satisfied.
This can be done by achieving statistical control of the measuring process and employing correct
statistical techniques in data processing.
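
One common way to check statistical control is a simple control-limit test. The control-sample values and 3-sigma limits below are hypothetical, used only to sketch the idea:

```python
import statistics

# Control-sample results accumulated while the process was known to be
# in control (hypothetical values).
control_data = [5.02, 4.98, 5.01, 4.99, 5.00, 5.03, 4.97, 5.01]

mean = statistics.mean(control_data)
sigma = statistics.stdev(control_data)

# A new control result is judged against 3-sigma control limits:
# inside the limits, the process is presumed still in control.
def in_control(result):
    return abs(result - mean) <= 3 * sigma

print(in_control(5.02))  # within the limits
print(in_control(5.40))  # far outside: the process warrants investigation
```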

Populations and Samples


When looking at measurement data, it is important to understand the concepts of, and the difference
between, (1) a population and (2) a sample.
➢ The term "population" refers to the entire item, material, or region under examination or whose
characteristics must be established.
➢ A sample is a subset of a population.
It may not be feasible to study the entire population unless it is simple and small. In that situation,
measurements are frequently taken on samples that are thought to represent the target population.
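
The distinction can be sketched in a few lines; the "population" here is a simulated lot of 10,000 items with hypothetical values:

```python
import random
import statistics

random.seed(1)

# The population: every item in a hypothetical production lot.
population = [random.gauss(50.0, 2.0) for _ in range(10_000)]

# A sample: a random subset measured to represent the population.
sample = random.sample(population, 100)

# The sample mean estimates the (usually unknowable) population mean.
print(statistics.mean(population))  # the true parameter
print(statistics.mean(sample))      # the estimate made from the sample
```

In practice only the sample is observed; how closely its mean tracks the population mean depends on the sample size and on how the sample was drawn.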

Measurement results might vary due to the diversity of the population and due to every part of the
process of taking a sample from it. Biases can arise for the same reasons. In addition to the
uncertainty of the measurement method itself, both types of sample-related uncertainty, variability and
bias, can be present in measurement data. Each type of uncertainty must be handled appropriately, but
this may not be achievable unless the measurement program has a correct statistical design. In reality,
a poorly planned (or absent) measuring program might make rational data interpretation nearly impossible.

Importance of Reliability
The term "reliability" refers to a level of quality that can be documented, assessed, and trusted. If
any one of these characteristics is lacking, the data's dependability is diminished. As a result, the
trust that may be placed in any decisions based on the data is lowered. Reliability concerns are critical
in almost any data scenario, but they are even more critical when data compilations are generated and
data from multiple sources must be combined. As a result of the latter circumstance, the idea of data
compatibility has emerged as a critical criterion for environmental data. Data compatibility is a
complicated notion involving both
statistical quality definition and the sufficiency of all measurement system components, including the
model, the measurement plan, calibration, sampling, and the quality assurance processes used.

Peer assessment of all parts of the system is an important technique for ensuring the reliability of
measurement data. In complicated, frequently encountered scenarios, no single individual can conceive of
everything that may produce measurement issues. In most situations, peer review during the planning
stage will widen the planning base and reduce issues. Critical evaluation at various phases of big
measurement projects can confirm control or indicate impending issues. The selection of appropriate
reviewers is a crucial part of a measurement program's operation. Good reviewers should have both an
in-depth and a broad understanding of the subject area for which they are engaged.

Reviews should be carried out with the utmost objectivity possible. Furthermore, reviewers should treat
the material under review as privileged information. Conflicts of interest might emerge when a
reviewer's present work is too similar to that of the subject under evaluation; it may be better to
decline in such situations. In small projects or tasks, supervisory control is a complementary activity
to peer review. Peer review of data, and of the conclusions derived from it, can improve program
dependability and should be done. Supervisory control over data release is required for reliable
individual measurement findings. Reviews of all sorts and at all levels rely heavily on statistics and
statistically based assessments.

Metrology
Metrology is the science of measurement, and it is quickly becoming a recognized discipline in its own
right. Engineering metrology, physical metrology, chemical metrology, and biometrology are all special
areas of metrology. Those who have studied and practice metrology are called metrologists, usually
qualified by the name of their specialty. Thus, physical metrologists are becoming increasingly common.
Analytical chemists are better known by that title, although they can also be referred to as chemical
metrologists. What distinguishes all metrologists is the pursuit of excellence in measurement as a
profession.

Metrologists do research in a variety of ways to enhance the science of measurement. They create
measuring systems, assess their performance, and verify their application in a variety of scenarios.
Metrologists create cost-effective measurement strategies that include methods for evaluating and
assessing data quality. Because metrologists must deal with and comprehend variability, statistics plays
an important role in all parts of metrology. As seen in the diagram below, statistics plays a critical
role in practical measuring scenarios.

The picture depicts the key role of statistical analysis in data analysis, which should be a prerequisite
for data release in every laboratory. When the appropriate types of control data are acquired, statistical analysis
may be used to monitor the measuring system's performance, as shown in the figure by the feedback
loop. Statistical approaches are used to design measurement programs, including the number of samples,
the calibration processes, the frequency with which they are applied, and the frequency with which
control samples are measured. All of this is covered in quality-assurance books such as the one written
by the present author.

Computer-Assisted Statistical Analyses


It should be obvious from the preceding discussion that a working knowledge of statistical techniques,
and the ability to use them, is almost a must for the modern metrologist. Modern computers can make
statistics easier to use, but a solid grasp of the concepts is required for their sensible use. When
contemporary computers are available, they should be utilized to the fullest extent possible.
Furthermore, when large amounts of data are collected quickly, computer-assisted data analysis may be
the only way to evaluate a measuring system's performance in real time and interpret its outputs.
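
Real-time evaluation of a stream of measurements can be sketched with a running-statistics routine (Welford's algorithm, a standard technique, shown here with a simulated data stream):

```python
import math

class RunningStats:
    """Track the mean and standard deviation of a data stream one value at
    a time (Welford's algorithm), so a measuring system's performance can
    be monitored as the data arrive rather than after the fact."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stdev(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

# Feed a simulated stream of measurements (hypothetical values).
stats = RunningStats()
for x in [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]:
    stats.add(x)

print(round(stats.mean, 2))    # 10.0
print(round(stats.stdev(), 2)) # 0.14
```

Because the mean and spread are updated with each incoming value, out-of-control behavior can be flagged while measurements are still being taken.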

MODULE ACTIVITY EXERCISES


1. Discuss the hierarchy: Data → information → knowledge.
2. Compare “hard” and “soft” data.
