
CAPSTONE

STEM-based Research
Assessing the Validity of Data
DATA VALIDATION is important because the succeeding discussions of the synthesized information, including the formulation of conclusions and recommendations, heavily depend on it.
Data Analysis
• The systematic process of extracting information from raw data to assist in the interpretation and discussion of that information.
• Performed so that the data are organized for the effective derivation of explanations of the information being presented.
Data Analysis
Generally, the analysis of data entails the following:
 Comparing data with existing information derived from previous studies.
 Testing the consistency of data via repeated measurements to establish accuracy and precision.
 Assessing the robustness (reproducibility) of the methodologies adopted for data collection.
 Determining the bias or error of the methods used.
 Evaluating the choice of method (selectivity and sensitivity) on the basis of its potential contribution to the results or data generated.
 Examining the limitations and the various possibilities that the extracted information could offer.
 Correlating results representing the variables tested.
Data Analysis
The validity of data may be gauged by the extent of the following:
 Scope and coverage of the data-collection process.
 Similarities and inconsistencies of data with results generated from previous studies.
 Environmental and circumstantial concerns.
 Compliance with ethical standards.
 Relevance and originality of data.
 Usability and accessibility of data.
STATISTICAL TOOLS VALUABLE TO
DATA EXAMINATION
• Any data will not have significance unless the uncertainty associated with each measurement from which the data were obtained is established (Miller, 1988). Statistical methods enable the verification of the uncertainty accompanying each measurement made during data collection.
STATISTICAL TOOLS VALUABLE TO
DATA EXAMINATION
Replicates
- the number of individual samples taken for analysis that are of the same size and treated in the same manner.
Mean
• The average of the measurements made; it is calculated by dividing the sum of all replicate measurements by the number of replicates:
Sample mean x̄ = (x₁ + x₂ + … + xₙ) / n or Sample mean x̄ = (Σxᵢ) / n
• The symbol Σ represents the summation of the results, n is the number of replicates, and xᵢ is an individual datum of the data set derived from the several replicates.
Median
• The middle value in a group of values when all the data are arranged in either decreasing or increasing order.
• Usually reported instead of the mean when the data set contains an outlier.
STATISTICAL TOOLS VALUABLE TO
DATA EXAMINATION
• There is also a need to detect and quantify the errors that accompanied the measurements, as these establish precision and accuracy.
• Precision
- characterized by the standard deviation, variance, and coefficient of variation (also called the percent relative standard deviation).
- describes the closeness of analytical data to one another.
Standard Deviation
• s, in a small data set, can be calculated using the equation

s = √[ Σ(xᵢ − x̄)² / (n − 1) ]

where:
(n − 1) is the number of degrees of freedom
(xᵢ − x̄) signifies the deviation of each datum from the mean
Variance
• s², is the square of the standard deviation, and the relative standard deviation (RSD), or coefficient of variation (CV) when expressed as a percentage, is calculated as follows:
RSD = s / x̄   and   CV (%RSD) = (s / x̄) × 100%
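The descriptive statistics above can be computed directly; the following is a minimal Python sketch using the standard statistics module, with hypothetical replicate values used only for illustration.

```python
# Minimal sketch of the mean, median, standard deviation, variance, and %RSD
# defined above; the replicate measurements are made-up values.
import statistics

replicates = [10.1, 10.3, 9.8, 10.0, 10.2]   # hypothetical replicate measurements

mean = statistics.mean(replicates)            # sample mean, x̄ = Σxᵢ / n
median = statistics.median(replicates)        # middle value of the sorted data
s = statistics.stdev(replicates)              # sample standard deviation (n − 1 in the denominator)
variance = statistics.variance(replicates)    # s², the square of the standard deviation
rsd_percent = (s / mean) * 100                # percent relative standard deviation (CV)

print(f"mean = {mean:.3f}, median = {median:.3f}")
print(f"s = {s:.3f}, s² = {variance:.3f}, %RSD = {rsd_percent:.2f}%")
```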
STATISTICAL TOOLS VALUABLE TO
DATA EXAMINATION
• TRUE VALUE
- Can come from measurements involving a certified reference material (CRM), as its composition has been certified and is indicated in a certificate of analysis.
- To quantify accuracy, the relative error is computed as follows:
Relative error = │experimental value − true value│ / true value × 100%
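A small sketch of the relative-error formula above, assuming hypothetical experimental and certified (true) values:

```python
# Relative error relative to a certified (true) value; the numbers are illustrative.
experimental_value = 4.87   # result obtained from the analysis (hypothetical)
true_value = 5.00           # certified value from the CRM's certificate of analysis (hypothetical)

relative_error = abs(experimental_value - true_value) / true_value * 100
print(f"relative error = {relative_error:.1f}%")   # 2.6%
```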

• Applied to help researchers with decision making when a suspected outlier is encountered.
• An outlier is a value that appears to be excessively different from the rest of the data set.
Dixon’s Q Test
• GAP
- the difference between the suspected outlier and the data point nearest to it in value; the Q test compares the ratio of the gap to the range (Q = gap / range).
• RANGE
-Is the difference between the highest and the lowest data in the data set.
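A minimal sketch of how the Q statistic could be computed from the gap and range; the data set and the critical value are assumptions for illustration, and in practice the critical value is read from a published Q table for the chosen confidence level and number of observations.

```python
# Dixon's Q test sketch: Q = gap / range, compared against a tabulated critical value.
data = [10.1, 10.3, 9.8, 10.0, 12.4]    # hypothetical data; 12.4 is the suspected outlier

values = sorted(data)
suspect = values[-1]                     # suspected outlier (largest value here)
nearest = values[-2]                     # data point nearest in value to the suspect

gap = abs(suspect - nearest)             # GAP: suspect minus its nearest neighbour
data_range = values[-1] - values[0]      # RANGE: highest minus lowest value

q_stat = gap / data_range
q_crit = 0.710                           # assumed critical value for n = 5; check a Q table

print(f"Q = {q_stat:.3f}")
if q_stat > q_crit:
    print("Suspected value may be rejected as an outlier.")
else:
    print("Suspected value is retained.")
```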
Statistical Methods
• CONFIDENCE LEVEL
- indicates the probability at which the calculated mean really lies within a specific interval.
CL for the population mean: μ = x̄ ± t·s/√n
where t is the Student's t.
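A minimal sketch of the confidence interval μ = x̄ ± t·s/√n, assuming hypothetical replicate data and a 95% confidence level, with the Student's t value taken from scipy.stats:

```python
# Confidence interval for the population mean using Student's t.
from math import sqrt
import statistics
from scipy import stats

replicates = [10.1, 10.3, 9.8, 10.0, 10.2]   # hypothetical replicate measurements
n = len(replicates)
x_bar = statistics.mean(replicates)
s = statistics.stdev(replicates)

t = stats.t.ppf(0.975, df=n - 1)             # two-sided 95% confidence, n − 1 degrees of freedom
half_width = t * s / sqrt(n)

print(f"μ lies in {x_bar:.3f} ± {half_width:.3f} at the 95% confidence level")
```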
Statistical Methods
• HYPOTHESIS
- formulated to warrant a deeper investigation of a particular subject matter.

Data are gathered to find pieces of evidence that will support the hypothesis.
Testing starts with the assumption that two data sets are the same, which is stated as the null hypothesis.
Testing includes the following:
• Comparing the mean of the data set to a known or true value.
• Comparing two means via a t-test or paired t-test (see the sketch after this list) for the purpose of:
a. Determining whether the difference in the two means is due to the presence of random errors or not.
b. Determining whether two analytical methods give the same results or not.
c. Determining whether two researchers employing the same data collection strategies obtain the same mean or not.
• Comparing the respective standard deviations of two populations via an F-test.
• Analysis of variance (ANOVA)
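The mean comparisons listed above can be carried out with standard statistical libraries; the following is a minimal Python sketch using scipy.stats, with hypothetical data sets standing in for results from two analytical methods and a known true value.

```python
# One-sample and two-sample t-tests for the comparisons listed above.
from scipy import stats

method_a = [10.1, 10.3, 9.8, 10.0, 10.2]   # hypothetical results from method A
method_b = [10.4, 10.6, 10.1, 10.3, 10.5]  # hypothetical results from method B
true_value = 10.0                           # hypothetical known or true value

# Compare the mean of one data set with a known or true value (one-sample t-test).
t1, p1 = stats.ttest_1samp(method_a, true_value)

# Compare two means (independent t-test); stats.ttest_rel gives the paired t-test
# when the same samples are measured by both methods.
t2, p2 = stats.ttest_ind(method_a, method_b)

print(f"one-sample t-test p = {p1:.3f}")
print(f"two-sample t-test p = {p2:.3f}")
# A p-value below the chosen significance level (commonly 0.05) suggests rejecting
# the null hypothesis that the data sets are the same.
```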
ANOVA
The ANOVA can be applied to test the following:
• There is no significant difference in the results of the water samples when two
different analytical methods (e.g., Atomic Absorption Spectroscopy or AAS and
Inductively Coupled Plasma or ICP) are employed.
• There is no significant difference in the results of the water samples produced by
three different researchers using the same analytical techniques.
An ANOVA table
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | Mean Square Estimates | F-value
Within group | Sum of squares due to error | Degrees of freedom of error | Sum of squares due to error / degrees of freedom of error | Estimate of the variance due to error | -
Between group | Sum of squares due to treatment | Degrees of freedom of treatment | Sum of squares due to treatment / degrees of freedom of treatment | Estimate of the variance due to treatment | Estimate of the variance due to treatment / estimate of the variance due to error
Total | Sum of squares due to treatment + sum of squares due to error | Total degrees of freedom | - | - | -
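A minimal one-way ANOVA sketch for the second example above (three researchers using the same analytical technique), using scipy.stats.f_oneway with hypothetical measurement values:

```python
# One-way ANOVA: do three researchers produce significantly different results?
from scipy import stats

researcher_1 = [10.1, 10.3, 9.8, 10.0]   # hypothetical water-sample results
researcher_2 = [10.2, 10.4, 10.0, 10.1]
researcher_3 = [9.9, 10.2, 9.7, 10.0]

f_value, p_value = stats.f_oneway(researcher_1, researcher_2, researcher_3)
print(f"F = {f_value:.3f}, p = {p_value:.3f}")
# If p is greater than the chosen significance level (e.g., 0.05), there is no
# significant difference among the three researchers' results.
```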
Presentation of Results
• DATA VISUALIZATION TECHNIQUE
- employs graphics to examine the relationships that could possibly exist between and within the variables tested.

• INFERENTIAL STATISTICS
- alongside inferential statistics, graphical formats are prepared for the purpose of gaining a deeper understanding of the causal relationships and associations within variables in an illustrative manner.
GRAPHS
- strongly support or negate the claims you make in the discussion.
• A data display technique may illustrate the following specific relationships:
 Dependence of some factors on time (temporal variation) or rate of occurrence of a phenomenon.
 Ranking among variables
 Distribution or homogeneity
 Extent of deviation from a known standard
 Correlations
 Spatial relationship
 Frequency of occurrence
Graphs
• Your choice of graphics depends on the nature of the relationship or
association you would like to communicate.
• Use computer programs such as Microsoft Word, Microsoft Excel, and Origin.
Line Graph
• This illustrates the changes in a variable as a function of time, or any relationship that would indicate trends or patterns.
Bar Graph
• This consists of rectangular blocks, the height of each being reflective of the frequency it represents.
Histogram
• This is a variation of the bar graph in which the rectangular blocks are drawn next to one another without spaces in between.
Pie Chart
• This resembles a round-shaped cake, each slice of which corresponds to the fraction or distribution of the data it represents.
Scattergram or Scatterplot
• This visually demonstrates the changes in one variable as a function of the
change in the other variable.
• Data points are usually represented with dots, and the dots are not connected with one another.
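The graph types above can be produced with common plotting programs; the following is a minimal matplotlib sketch of a line graph, a bar graph, and a scatterplot, using made-up data for illustration only.

```python
# Minimal plotting sketch for three of the graph types described above.
import matplotlib.pyplot as plt

time = [1, 2, 3, 4, 5]                    # hypothetical time points
concentration = [2.1, 3.4, 4.0, 5.2, 6.1] # hypothetical measured values
categories = ["A", "B", "C"]
frequency = [12, 7, 15]                   # hypothetical frequencies

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 3))

ax1.plot(time, concentration)             # line graph: trend as a function of time
ax1.set_title("Line graph")

ax2.bar(categories, frequency)            # bar graph: block height reflects frequency
ax2.set_title("Bar graph")

ax3.scatter(time, concentration)          # scatterplot: unconnected data points
ax3.set_title("Scatterplot")

plt.tight_layout()
plt.show()
```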
• Aside from graphs, tables are also a valuable tool for presenting data.
• Data are written in rows and columns, forming a tabular format.

Tables can be classified as:


1. Univariate
- the table contains only one kind of information, pertinent to only one variable.

2. Bivariate
- the table gives information about two variables.

3. Multivariate
- the table contains information about more than two variables.
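A minimal pandas sketch of the three table classifications, using hypothetical water-quality variables and values:

```python
# Univariate, bivariate, and multivariate tables built with pandas; all data are made up.
import pandas as pd

# Univariate: information on only one variable.
univariate = pd.DataFrame({"pH": [6.8, 7.1, 7.0]})

# Bivariate: information on two variables.
bivariate = pd.DataFrame({"pH": [6.8, 7.1, 7.0],
                          "temperature_C": [25.0, 26.5, 24.8]})

# Multivariate: information on more than two variables.
multivariate = pd.DataFrame({"pH": [6.8, 7.1, 7.0],
                             "temperature_C": [25.0, 26.5, 24.8],
                             "dissolved_O2_mg_L": [8.1, 7.9, 8.3]})

print(multivariate)
```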
