Evaluating Scedasticity Using H-Values

Vol.
8, 2023-16
Evaluating Scedasticity using H-values
Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org
doi: 10.13140/RG.2.2.19965.95200
Abstract
A statistical test of scedasticity indicates, with a given confidence, whether a set of
observations has a constant (homoscedastic) or a variable (heteroscedastic) standard
deviation with respect to any associated reference variable. Many different tests of scedasticity
are available, in part due to the difficulty for unequivocally determining the scedasticity of a
data set, particularly for non-normal and for small samples. In addition, the lack of an objective
criterion for decision (significance level) increases the uncertainty involved in the evaluation. In
this report, a new test of scedasticity is proposed based on the statistical distribution of the R 2
coefficient describing the behavior of the standard deviation of the data, and considering an
optimal significance level that minimizes the total test error. The decision of the test is
determined by a proposed H-value, resulting from the logarithm of the ratio between the P-
value of the test and the optimal significance level. If H>0 then the data is homoscedastic. If
H<0 then the data is heteroscedastic. The performance of the proposed test was found
satisfactory and competitive compared to established tests of scedasticity.
Keywords
H-value, Heteroscedasticity, Homoscedasticity, Monte Carlo, Normal Distribution, Optimization,
R2 Coefficient, Regression Models, Scedasticity, Statistical Tests, Test Error, Variance
1. Introduction
The term homoscedasticity1 was introduced by Karl Pearson in . Pearson defined a

homoscedastic system as a system where “all arrays are equally scattered about their means” [1].
The term derives from the Ancient Greek words homos (same, equal, like) and skedastikos
(dispersed, able to disperse), and is commonly interpreted as “the property of having equal
statistical variances” [2].
1
The word homoskedasticity can be alternatively employed.
Cite as: Hernandez, H. (2023). Evaluating Scedasticity using H-values. ForsChem Research Reports, 8,
2023-16, 1 - 40. Publication Date: 30/11/2023.
Evaluating Scedasticity
using H-values
Hugo Hernandez
ForsChem Research
The opposite term, heteroscedasticity2, refers to systems having an unequal dispersion of data
about their means (unequal variances).
Although not clearly stated in the previous definitions, the scedastic character of a system or
data set depends on the reference variable (or variables) used to classify the data. Thus, we
may say that the system is homoscedastic with respect to a certain reference variable for a
certain range of values, but at the same time it can be heteroscedastic with respect to another
reference variable (or to another range of values of the same reference variable).
In particular, the homoscedasticity of model residuals is an important assumption in most types

of regression analysis [3-5]. Thus, the conclusions obtained by regression analysis will only be
reliable if the homoscedastic assumption (along with all other assumptions [3,6]) has been
validated. The most common approach consists in performing a statistical test of hypotheses
[7] for scedasticity, which can be generalized as follows:
(1.1)
If the confidence in rejecting the null hypothesis (H0: Data is homoscedastic) surpasses the
minimum confidence level ( ) for acceptance of the null hypothesis, then we can confidently
conclude that the data set is homoscedastic. In mathematical terms, this condition is expressed
as follows:
(1.2)
where is the probability value of the test, and , is known as the significance level.
On the other hand, if
(1.3)
we may reject the null hypothesis and conclude that the data is heteroscedastic.
There are, however, two difficulties with this approach:

i. Typically, there are no clear, objective rules for defining the significance level (or the
minimum confidence level) of the test. It has to be arbitrarily determined by the user of
the test.
ii. An additional model is needed for calculating the -value. Such model contains an
assumption about the probability distribution of residuals, and may also require a
function describing the heteroscedasticity of residuals.
2
The term heteroskedasticity can be alternatively employed.
30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

10.13140/RG.2.2.19965.95200 www.forschem.org (2 / 40)
using H-values
Hugo Hernandez
ForsChem Research
The fact that test methods depend on specific assumptions about the data, limits their use as
general, universal methods for testing scedasticity.
In Section 2, the most common methods for testing scedasticity will be briefly reviewed.
Section 3 will introduce a new empirical approach for testing scedasticity (denoted as H-value
method) intended to be valid for any data distribution. Finally in Section 4, a comparative
analysis of the total test error for the proposed method and other established tests is
performed.
2. Tests for Scedasticity
There are several methods available for testing scedasticity [8,9]. Most of them are designed to
test differences in variances amongst different groups of data (heteroscedasticity), but some
are also suitable for testing for homoscedasticity in regression analysis (not necessarily
requiring repeated observations). A non-exhaustive summary of the different types of
scedasticity tests are briefly described in this Section.
2.1. Testing Data Classified in Multiple Groups
The most basic approach for comparing the variance of two groups of measurements ( and
) normally distributed is the -test [10,11]. This test calculates the -statistic as the ratio
between the variances of the groups, as follows [12]:
(2.1)
where and are estimates of the variance of each group, obtained from the observed data.
Then, the probability ( -value) of erroneously rejecting the null hypothesis ( ) is
calculated using Fisher’s distribution, considering the degrees of freedom available for the
estimation of variance of each group. Of course, if the data cannot be grouped into two
different sets, or the data in each set is not normally distributed [13], then the -test cannot be
reliably used.
If the data is grouped in more than two sets, the overall scedasticity can be evaluated by
considering the sets with extreme variances (highest and lowest variance estimations). If no
significant differences are found between extreme variances by means of an F-test, then the
whole set can be considered homoscedastic. This approach is commonly known as Hartley’s
test [14].

using H-values
Hugo Hernandez
ForsChem Research
Since the groups are not always clearly defined, Goldfeld and Quandt [15] proposed splitting
the full set of observations in three groups according to the value of the reference variable
used to evaluate scedasticity: A central group containing an arbitrary number ( ) of
observations, a lower group containing the observations with smallest values of the
reference variable, and an upper group containing the observations with largest values of
the reference variable. Then, an F-test is performed between the lower and upper groups to
evaluate scedasticity. Unfortunately, this method is only able to detect monotonic changes in
variance, and can only be used when the observations in each group are normally distributed.
Levene [16] proposed a robust version of the F-test, estimating the statistic from the analysis
of variance [17] of the following variable transformation:
(| ̅ |)
(2.2)
where represents the -th replicate value (out of replicates) of the -th observation
condition. In addition,
̅ ∑
(2.3)
and ( ) is any function monotonically increasing on ( ). For simplicity, Levene suggested
using either the simplest case ( ) , or the natural logarithm transformation ( ) .
Then, the following statistic is obtained:
∑ ( ∑ ∑ ∑ )
∑ ∑ ( ∑ )
(2.4)
where in this case is the number of groups of observations, and ∑ .
Brown and Forsythe [18] proposed a slight modification of the simplest Levene’s test, where the
transformation considers the median ( ̃ ) instead of the mean ( ̅ ) of the group, as follows:
| ̃|
(2.5)

using H-values
Hugo Hernandez
ForsChem Research
One of the most common methods used nowadays to test scedasticity in groups of data is
Bartlett’s Test [19]. Instead of using the distribution (which is the ratio between two
distributions), Bartlett proposed a single statistic as follows:
( ) ∑ ( )
∑
( )
(2.6)
where is the variance estimated for group , and is a pooled variance determined as
follows:
∑( )
(2.7)
Unfortunately, like all other parametric tests mentioned so far, Bartlett’s test is not reliable for
non-normal data. In addition, with the sole exception of the Goldfend-Quandt test, these
methods can only be used with grouped data, as is the case of most experimental designs.
2.2. Testing Data within a Single Group
This type of tests represents a more general approach for testing scedasticity as the data set
must not necessarily be grouped. However, a model of the heteroscedastic behavior of the
variance is required.
In general, we may state that the variance of a variable can be described with respect to a
certain vector of reference variables by the following equation:
( ) ( )
(2.8)
where ( ) represents any arbitrary function of .
Eq. (2.8) can be alternatively expressed in terms of standard deviation ( ) as follows:
( ) ( ) ( )
(2.9)
Eq. (2.8) can also be expressed in terms of individual observations ( ) considering the
regression model:
̅ ( )
(2.10)

using H-values
Hugo Hernandez
ForsChem Research
where is a random residual error.
On the other hand, Eq. (2.9) can be expressed by the regression model:
| ̅| ( )
(2.11)
If a non-constant model is found statistically significant, then we may conclude that the data is
heteroscedastic with respect to the reference variable(s).
Park [20], for example, considered the following general model for the variance:
( )
(2.12)
where and are model parameters. If a statistical test shows that , then we may
conclude that the data is heteroscedastic.
Glejser [21], considered the general model structure for the standard deviation:
( ) ( )
(2.13)
In this case, the system can be considered heteroscedastic if statistical tests show that ,
, and , simultaneously.
White [22] considered the following multivariate function for the variance:
( ) ∑∑
(2.14)
where is the number of variables in vector . Again, if any is significantly different from
zero, the system is heteroscedastic. Nevertheless, White associated the statistical significance
of the model with the coefficient for the multiple linear regression, indicating that it follows
a distribution:
( )
(2.15)
However, notice that ( ) is a semi-bounded positive distribution [ ), whereas is

limited to the range [ ] in the case of multiple linear regression.
Alternatively, Cook and Weisberg [23] considered the multivariate function:

using H-values
Hugo Hernandez
ForsChem Research
( ) ∏
(2.16)
In this case, heteroscedasticity is obtained when any is significantly different from zero. For
that purpose, Cook and Weisberg also proposed a score test statistic following a
distribution.
Breusch and Pagan [24] also discussed model (2.16) along with the following general structure:
( ) ∏
(2.17)
and concluded that heteroscedasticity can be evaluated from the explained sum of squares of
the following simplified linear model:
̅
∑
(2.18)
where represents the terms remaining in model (2.18), and ̂ is an estimation of
the variance.
The corresponding statistic ( ) behaves as a distribution with degrees of freedom:
̅
( ) ∑
(2.19)
An alternative method for evaluating scedasticity from a general model of standard deviation
( ) was proposed in a previous report [4]. In this case, the coefficient of variation of the
model is evaluated for the range of values considered, and the result is compared to the
coefficient of variation of a Helmert ( ) distribution with identical degrees of freedom
(assuming a normal error). The coefficient of variation of the model is determined as follows:
∫ ( ( ) ∫ ( ) )
√
∫
( )
∫ ( )
∫
(2.20)

using H-values
Hugo Hernandez
ForsChem Research
A simple, approximated empirical expression for calculating the coefficient of variation of

Helmert’s distribution is:
( )
√
(2.21)
The system can be considered heteroscedastic if ( ) ( ). Of course, this is not a

rigorous statistical test, but rather a quick rule of thumb.
3. H-value Test for Scedasticity
It is highly desirable that a method for testing scedasticity:

i. Can be used independently of the number of groups of data considered, even if it is
only a single group.
ii. Can be used for any number of observations.
iii. Can be used for any arbitrary distribution of error, not only for the normal distribution.
iv. Can be used for any arbitrary functionality of the variance (heteroscedastic model).
v. Can be used with respect to any arbitrary reference variable employed to classify the
observations.
vi. Can be used without a subjective error tolerance criterion provided by the analyst (e.g.
significance level).
Considering the first condition, almost all methods described in Section 2.1 must be abandoned,
as they are only valid for multiple groups of observations. Only the parametric Goldfeld-Quandt
test remains, but it is only valid for normal distributions. A non-parametric test was also
proposed by Goldfeld and Quandt [15] based on the distribution of peaks for the absolute
deviation of the variable ordered with respect to the reference variable. Using permutation
theory, it is then possible to estimate the probability of reaching a certain number of peaks for
a homoscedastic system.
Now, most methods described in Section 2.2 are limited to some specific heteroscedastic
model structures. Thus, those tests can determine if a system is heteroscedastic, but they are
not able to guarantee that the system is truly homoscedastic, as a different heteroscedastic
structure might be present in the data. The last method described in Section 2.2 is valid for any
arbitrary heteroscedastic structure, but it is not a rigorous statistical test and only applies to a
normal distribution of the observed variable.
With those six conditions for a general test for scedasticity in mind, a new evaluation method
will be developed in this Section.

using H-values
Hugo Hernandez
ForsChem Research
3.1. Rationale of the Method
First, let us consider that the variance and standard deviation of the observed variable are
described by the general expressions given in Eq. (2.8) and (2.9), respectively.
Then, variable can be denoted as homoscedastic with respect to a certain reference variable
in the range [ ] (included in vector ), when:
( ) ( )
(3.1)
In terms of the general functions ( ) and ( ) we have:
( ) ( )
(3.2)
Eq. (3.1) and (3.2) represent possible mathematical formulations of the null hypothesis
(homoscedasticity).
Now, the condition for heteroscedasticity of with respect to in the range [ ]

will be:
( ) ( )
(3.3)
or equivalently,
( ) ( )
(3.4)
Eq. (3.3) and (3.4) represent possible mathematical formulations of the alternative hypothesis
(heteroscedasticity).
Of course, from a statistical point of view the alternative hypothesis can be accepted only
when the derivative is significantly different from zero for at least a value in the range
considered for the reference variable.
Also, we can see that the same results are obtained considering either the variance function
( ( )) or the standard deviation function ( ( ) ). Thus, let us then continue the derivation of the
test using the standard deviation function only.

using H-values
Hugo Hernandez
ForsChem Research
Any arbitrary function ( ) can be expressed as an infinite power series using the following
Taylor series expansion about any arbitrary reference value [25]:
( ) ( )
( ) ( ) ∑ ∑( )
( )
(3.5)
Of course, an infinite series has no practical value, but it can be approximated by truncating the
series after a certain number of terms ( ):
( ) ( )
( ) ( ) ∑ ∑( )
( )
(3.6)
Expanding the powers and rearranging terms we obtain:
( ) ∑∑
(3.7)
where
( )
( ) ∑ ∑( )
( )
(3.8)
denotes an independent coefficient, and
( )
∑( )
( )
( )
(3.9)
denotes a linear coefficient associated to the term .
Replacing Eq. (3.7) in Eq. (3.2) we obtain the following expression for the null hypothesis
(homoscedasticity with respect to in the range [ ]):
(3.10)
which necessarily implies:
(3.11)

using H-values
Hugo Hernandez
ForsChem Research
In this sense, the alternative hypothesis (heteroscedasticity with respect to in the range
[ ]) can be represented by:
[ ]
(3.12)
Since the function ( ) is unknown, we may estimate the parameter values using the
following multiple linear regression model (from Eq. 2.11 and 3.7):
| ̅| ∑∑
(3.13)
Now, the homoscedastic model would be (from Eq. 3.11 and 3.13):
| ̅|
(3.14)
representing a constant model of the absolute deviations observed in variable .
Let us recall that a direct comparison between the goodness of fit of any model (as in Eq. 3.13)
and the goodness of fit of the corresponding constant model (Eq. 3.14) is given by the
coefficient [26]. For this reason, we will consider the coefficient of the multiple linear
regression model (3.13) as the statistic for testing scedasticity:
∑
∑
(3.15)
On the other hand, notice that the condition for homoscedasticity of with respect to each
reference variable is independent of any other reference variable considered in . For that
reason, Eq. (3.13) will be considered for a single reference variable , thus simplifying the
evaluation procedure:
| ̅| ∑
(3.16)
The value of parameter can be arbitrarily chosen. A large value provides more flexibility in
the description of the heteroscedastic model, but it also reduces the degrees of freedom for
evaluating the residual error, and therefore, limiting the sample size. In this work, a value of
will be arbitrarily considered.

using H-values
Hugo Hernandez
ForsChem Research
3.2. R2 Distribution and P-value Calculation
Even if the behavior of the observed variable is homoscedastic with respect to reference
variable , sampling error will result in a non-zero coefficient. Particularly for small
samples, the probability of obtaining data from a homoscedastic variable resembling a
heteroscedastic pattern is relatively high. In fact, it is always possible to obtain for a
homoscedastic variable, simply as a result of sampling error. Of course, the probability of
reaching decreases as the sample size increases. For example, if we consider the series
expansion truncated after , any data set with different observations of (not
considering repeated observations) will always fit the heteroscedastic model (Eq. 3.16) with
.
For this reason, it is necessary to know the typical behavior of the statistic due to sampling
error of a homoscedastic variable. It is possible to derive a probability distribution model for
from Eq. (3.15) after assuming the distribution of residual errors and (e.g. assuming
normal distributions). Particularly, will follow the distribution of the observed variable .
Since we are interested in a general test method, it is not advisable to exclusively assume
normality. Also, a general analytical distribution might be difficult to obtain. Thus, an empirical
approach will be used, where different types of homoscedastic variables are independently
sampled and the behavior of the statistic is observed.
The Monte Carlo method [27] used to obtain the data was implemented in R language
(https://cran.r-project.org/) and is presented in Appendix 1. The procedure is the following:
 Different sample sizes are considered between and .

 For each sample size, independent pairs of standard uniform random samples
are obtained. Each pair of samples represents a uniform reference variable ( ), and a
uniform homoscedastic variable ( ).
 From each sample of the uniform reference variable ( ), three additional samples of
other distributions are obtained:
o Standard Normal distribution:
√ ( )
(3.17)
o Standard Exponential distribution:
( )
(3.18)
o Non-normal, symmetric distribution [28]:
(3.19)
 Four multiple regression models are fitted corresponding to:

using H-values
Hugo Hernandez
ForsChem Research
| ̅ | ∑
(3.20)
| ̅ | ∑
(3.21)
| ̅ | ∑
(3.22)
| ̅ | ∑
(3.23)
 The coefficient of each regression model is determined using Eq. (3.15).
 Different quantile values are obtained from each set of values, describing the
cumulative probability function of the statistic.
The cumulative probability functions obtained for the statistic are summarized in Figure 1.
Figure 1. Cumulative probability functions of the statistic for different normal and non-normal
homoscedastic distributions and different sample sizes, obtained by Monte Carlo simulation.

using H-values
Hugo Hernandez
ForsChem Research
From these results we may conclude the following:
 Sample size has a strong effect on the distribution.

 The type of distribution being sampled has only a minor effect on the distribution.
This is good, as it shows that a general test method is possible.
 The effect of the type of the original distribution is mainly observed for small sample
sizes and seems negligible for large sample sizes.
Figure 2 illustrates the behavior of selected percentiles as functions of sample size.
Figure 2. Effect of sample size on selected percentile values for the statistic of different
homoscedastic distributions.
The percentile values obtained were relatively similar for the different distributions, except for
higher quantiles (greater than ). Particularly for the median ( ), an empirical function can
be obtained for describing the effect of sample size, independently of the original distribution:

using H-values
Hugo Hernandez
ForsChem Research
( )
(3.24)
The performance of this empirical function is illustrated in Figure 3.
Figure 3. Plot of empirical equation (3.24) for describing the median of the statistic for any arbitrary
homoscedastic distribution as a function of sample size. Blue line: Eq. (3.24). Black dots: Monte Carlo
simulation data for four different distributions.
The percentiles of the statistic, and particularly the deciles, can be used to obtain
approximate probability density functions by means of cubic splines [29]. Some examples of
the probability density functions obtained for different sample sizes, considering a normal
homoscedastic distribution, are shown in Figure 4.
Figure 4. Probability density functions for the statistic of a normal homoscedastic variable, for
selected sample sizes, approximated by cubic splines.

using H-values
Hugo Hernandez
ForsChem Research
Since very different shapes of the probability density functions are obtained, somehow
resembling the different shapes observed in the binomial distribution, a variable
transformation is proposed. Such transformation is precisely inspired by the standardization of
the binomial distribution, but using the median value determined by Eq. (3.24). The original
bounded statistic then becomes the following unbounded variable :
√ ( )
(3.25)
Figure 5 illustrates the cumulative probability obtained for the statistic considering different
homoscedastic distributions and different sample sizes.
Figure 5. Cumulative probability functions of the statistic (Eq. 3.25) for different normal and non-
normal homoscedastic distributions and different sample sizes, obtained by Monte Carlo simulation.
Notice that all curves now resemble the cumulative sigmoid function typically found in normal
distributions. Also, with the sole exception of small sample sizes, we may also conclude that
the behavior of the statistic is independent of the original homoscedastic distribution.
Thus, the following normal probability density function is suggested to describe in general the
Monte Carlo simulation data:

using H-values
Hugo Hernandez
ForsChem Research
( )
( )
√ ( )
(3.26)
where ( ) is a sample size-dependent dispersion parameter. The values of the dispersion

parameter can be estimated from the corresponding cumulative probability function:
( )
√ ( )
( )
(3.27)
The values of ( ) obtained after fitting to Eq. (3.27) simultaneously all four homoscedastic
distributions considered in the simulation are summarized in Table 1.
Table 1. Estimated dispersion parameter values of the statistic obtained for different sample sizes,
considering all homoscedastic distributions
Fitness
6 0.89250 0.85089 0.72401 99.15%
7 0.69595 0.78654 0.61865 99.67%
8 0.56902 0.69714 0.48600 99.80%
9 0.48103 0.62492 0.39053 99.85%
10 0.41670 0.56331 0.31732 99.90%
15 0.25107 0.39714 0.15772 99.95%
20 0.18069 0.32530 0.10582 99.95%
25 0.14151 0.28353 0.08039 99.95%
30 0.11646 0.25347 0.06425 99.93%
35 0.09901 0.23224 0.05394 99.94%
40 0.08614 0.21562 0.04649 99.93%
45 0.07626 0.20234 0.04094 99.93%
50 0.06842 0.19143 0.03664 99.93%
60 0.05676 0.17349 0.03010 99.92%
70 0.04851 0.16060 0.02579 99.91%
80 0.04236 0.14963 0.02239 99.92%
90 0.03760 0.14078 0.01982 99.91%
100 0.03380 0.13339 0.01779 99.92%
125 0.02698 0.11957 0.01430 99.91%
150 0.02246 0.10917 0.01192 99.91%
200 0.01682 0.09420 0.00887 99.91%
250 0.01344 0.08432 0.00711 99.91%
300 0.01120 0.07705 0.00594 99.91%
500 0.00671 0.05973 0.00357 99.90%
1000 0.00335 0.04236 0.00179 99.91%

using H-values
Hugo Hernandez
ForsChem Research
In all cases, the fitness of the normal distribution was greater than , indicating a suitable
model for describing the statistic. Now, the variance of the statistic can be empirically
fitted in terms of ( ) by the following polynomial model:
( ) ( ) ( ( )) ( ( ))
(3.28)
The performance of Eq. (3.28) is depicted in Figure 6.
Figure 6. Plot of empirical equation (3.28) for describing the variance of the statistic for any arbitrary
homoscedastic distribution as a function of sample size. Green line: Eq. (3.28). Blue dots: Values fitted
from Monte Carlo data (Table 1).
The probability density function of becomes (from the change of variable theorem [30]):
( ( ))
( ( ) ( )) ( )
)| | ( )
( ) (
( ( )) √ ( )
(3.29)
and the corresponding cumulative probability function is:
( )
√ ( ) ( )
( )
( )
(3.30)
where ( ) is approximately given by Eq. (3.24) and ( ) is approximately given by Eq.

(3.28).

using H-values
Hugo Hernandez
ForsChem Research
Now, let us consider a sample of size obtained from an arbitrary homoscedastic distribution
having a value given by Eq. (3.15), representing the coefficient of model (3.13). The
probability of falsely concluding that the data is heteroscedastic (type I error) is given by:
( )
√ ( ) ( )
( )
( )
(3.31)
Using this -value and a suitable pre-defined significance level ( ), a decision about the
homoscedasticity of the data can be made (Eq. 1.2 or 1.3).
The implementation of the proposed method for testing homoscedasticity in R language is

included in Appendix 2.
3.3. Power Assessment
The power of a test reflects the success in rejecting the null hypothesis when it is actually false.
In our case, the power of a scedasticity test reflects the rate of success for correctly
determining heteroscedasticity. Unfortunately, while there is a single model for
homoscedasticity, there are an unlimited number of heteroscedastic models. For that reason,
an exact determination of the power of this test is difficult if not impossible.
A common approach for assessing the power of statistical tests is Monte Carlo simulation [27].
In particular the following strategy is used:
 For each sample size, independent pairs of samples ( ) following a standard

uniform random variable are generated.
 From each pair of samples, one uniform random sample ( ) is used to generate the
corresponding standard normal ( ), exponential and distribution ( ). The bias of
the standard uniform ( ) and exponential ( ) distributions is removed by
subtracting the corresponding mean value.
 The other uniform sample ( ) is used to describe the non-constant behavior of the
standard deviation. different heteroscedastic models ( ( ) ) are considered.
 The homoscedasticity test is performed between each combination ( ) and .
 The -value obtained for each test is stored.
 For a given significance level, the power of the test for each sample size can be
determined as the percentage of true positives obtained in the simulation data.

using H-values
Hugo Hernandez
ForsChem Research
The R language implementation of the Monte Carlo procedure for assessing power is shown in
Appendix 3.
The 8 different heteroscedastic models considered (arbitrarily chosen) are the following:
 Linear model:
( )
(3.32)
 Parabolic “hourglass” model:
( ) ( )
(3.33)
 Parabolic “football” model:
( ) ( )
(3.34)
 Square root model:
( ) √
(3.35)
 Reciprocal model:
( )
(3.36)
 Logarithm model:
( ) ( )
(3.37)
 Sinusoidal model:
( ) ( )
(3.38)
 Gaussian model:
( )
( )
√
(3.39)
The graphical behavior of these models is depicted in Figure 7.
Notice that the power ( ) of the test depends on the significance level ( ) used in the test,
and the sample size ( ). Thus, the total test error ( ( ) ) will be:
( ) ( )
(3.40)

using H-values
Hugo Hernandez
ForsChem Research
Figure 7. Graphical representation of the heteroscedastic models used to assess the power of the test
3.4. Optimal Significance Level and H-value
We may now define an optimal significance level ( ( )) for testing homoscedasticity in samples
of size , as the significance level value that minimizes the total test error probability
( ( ) ) [31].
The optimal significance level can be obtained by solving the following optimization problem:
( ) ( ( ))
(3.41)
where ( ) is the proportion of false negatives, that is, the proportion of tests showing
homoscedasticity knowing that the original distribution is heteroscedastic. It is simply obtained
by counting results with for samples of size obtained from heteroscedastic data.
The optimization procedure implemented in R is shown in Appendix 4. A gradient-based

optimization method is employed, always using the classical significance level as starting
point.
Optimal significance levels, power, and total error found for different sample sizes using the
Monte Carlo assessment proposed in the previous Section are shown in Table 2. These results
are also graphically summarized in Figure 8.

using H-values
Hugo Hernandez
ForsChem Research
Table 2. Optimal significance level and test error for different sample sizes
Sample Size Optimal Significance Power Type II Total Error
( ) Level ( ) ( ) Error ( ) ( )
6 2.24% 12.87% 87.13% 89.37%
7 4.14% 13.53% 86.47% 90.60%
8 5.95% 14.65% 85.35% 91.30%
9 7.65% 15.54% 84.46% 92.11%
10 22.13% 30.38% 69.62% 91.75%
15 37.44% 54.02% 45.98% 83.41%
20 30.07% 53.36% 46.64% 76.70%
25 25.36% 54.77% 45.23% 70.60%
30 23.06% 57.81% 42.19% 65.25%
35 20.21% 59.19% 40.81% 61.02%
40 17.28% 60.43% 39.57% 56.85%
45 15.36% 61.81% 38.19% 53.54%
50 15.02% 64.33% 35.67% 50.70%
60 11.95% 66.23% 33.77% 45.72%
70 10.19% 68.08% 31.92% 42.12%
80 8.71% 69.79% 30.21% 38.92%
90 7.23% 70.72% 29.28% 36.51%
100 6.29% 71.85% 28.15% 34.44%
125 4.48% 73.50% 26.50% 30.98%
150 3.33% 74.59% 25.41% 28.74%
200 1.95% 75.68% 24.32% 26.27%
250 1.43% 76.36% 23.64% 25.06%
300 1.22% 76.71% 23.29% 24.52%
500 2.37% 79.58% 20.42% 22.79%
1000 3.74% 86.02% 13.98% 17.72%
Figure 8. Effect of sample size on homoscedasticity test errors
In these results we observe the following behavior:
 For small sample sizes ( ) the risk of false negatives is larger than . This
means that most heteroscedastic data will not be detected by the test. Also, due to the
large type II error, the optimal error is obtained for relatively low significance level
values, which are increasing with sample size.

using H-values
Hugo Hernandez
ForsChem Research
 As sample size increases (for ), the total error probability (as well as type I and
type II error) decreases almost exponentially.
 For very large sample sizes ( ) a slight increase in optimal significance level is
observed, compensated by an additional decrease in type II error and total error.
Neglecting the results obtained for small sample sizes ( ), the optimal significance levels
obtained can be approximately fitted by the following empirical function:
( ) ( )
(3.41)
graphically represented in Figure 9.
Figure 9. Optimal significance level as a function of sample size. Blue dots: Obtained from Monte Carlo
simulation data. Green line: Approximation given by Eq. (3.41).
For small sample sizes ( ), the use of Eq. (3.41) implies a higher risk of false positives, and
of course, a higher total error. However, since total error was already above , the
difference is practically irrelevant. Basically, we cannot confidently detect heteroscedasticity in
data having small sample sizes. Thus, the proposed scedasticity test should only be used for
larger sample sizes ( ).
The -value of the test can be directly compared to the optimal significance level using a new
metric denoted as -value (homoscedasticity value), analogous to the -value (normality
value) proposed in a previous report [32]. The -value is then defined as follows:
( ) ( )
( )
(3.42)
If then the data is homoscedastic. If then the data is heteroscedastic. If , the

test is inconclusive and more data is needed.

using H-values
Hugo Hernandez
ForsChem Research
It is also possible to determine critical values for the optimal significance levels given in Eq.
(3.41). The critical values ( ( ) ) are implicitly given by the following expression:
( ) ( )
√ ( ) ( ( ))
√ ( )( ( ))
(3.43)
where ( ) is given by Eq. (3.24) and ( ) is given by Eq. (3.28).
The resulting values of ( ) obtained by numerical solution of Eq. (3.43) are shown in Figure
10 along with the following empirical approximation for ( ):
( )
(3.44)
Figure 10. Critical R2 values determined at the optimal significance level for different sample sizes.
3.5. Testing for Homoscedasticity vs. Testing for Heteroscedasticity
The optimal significance levels found in the previous Section minimize the total test error
resulting from adding type I and type II errors. Thus, they can be used to evaluate the scedastic
character of the data, minimizing the risk of both false positives and false negatives.
However, we can also be particularly interested in testing either for homoscedasticity or

heteroscedasticity of the data. If we test for homoscedasticity3, then the relative weight of
false positives should be greater than that of false negatives. On the contrary, if we test for
heteroscedasticity, then the relative weight of false negatives should be greater than that of
false positives.
3
For example, when checking the homoscedasticity assumption in regression analysis.

using H-values
Hugo Hernandez
ForsChem Research
If for example we want to minimize the risk of false positives we might simply set ,
completely suppressing all positive results, either false or true. But this is not the purpose of a
statistical test. We must always take into account both types of error.
Thus, let us define the following weighted test errors:
( ) ( )
( )
( )
(3.45)
( ) ( )
( )
( )
(3.46)
where is a relative weight factor, and ( ) and ( ) are optimal significance levels
for the homoscedasticity and heteroscedasticity tests, respectively, described by the following
empirical expressions:
( ) ( )
(3.47)
( )
( )
(3.48)
is a correction factor, and ( ) is the optimal significance level found for the general scedastic
test.
Considering a relative weight , the sum of errors for the homoscedastic and
heteroscedastic tests are minimized with an optimal correction factor .
The corresponding critical R2 values for the homoscedastic and heteroscedastic tests can be
described by the following empirical expressions4:
( ) ( )
(3.49)
( ) ( )
(3.50)
The final scedasticity test function implemented in R, determining the optimal significance
level, critical R2 value, and corresponding H-value is presented in Appendix 5.
4
Limited to the range [ ]

using H-values
Hugo Hernandez
ForsChem Research
4. Comparative Evaluation with different Tests
In this Section, a comparative Monte Carlo assessment of different scedasticity evaluation

methods is performed. The tests considered include the method proposed in Section 3
(denoted as ) and different tests included in the skedastic package in R and described in
detail by Farrar [33]. The list of scedasticity evaluation methods considered, available in
skedastic, is summarized in Table 3 5.
Table 3. List of scedasticity tests available in the skedastic R package [33] used as comparative reference
Heteroscedasticity Test R function Acronym
Anscombe anscombe ANS
Bickel bickel BIC
Breusch-Pagan breusch_pagan B-P
Carapeto-Holt carapeto_holt C-P
Cook-Weisberg cook_weisberg C-W
Diblasi-Bowman diblasi_bowman D-B
Dufour, Khalaf, Bernard, and Genest dufour_etal DUF
Evans-King evans_king E-K
Glejser glejser GLE
Godfrey-Orme godfrey_orme G-O
Goldfeld-Quandt goldfeld_quandt
GQN
(Nonparametric) (method="nonparametric")
Goldfeld-Quandt goldfeld_quandt
GQP
(Parametric) (method="parametric")
Harrison-McCabe harrison_mccabe HMC
Harvey harvey HAR
Honda honda HON
Li-Yao li_yao L-Y
Račkauskas-Zuokas rackauskas_zuokas R-Z
Ramsey (Bartlett's MSET) bamset RBM
Simonoff-Tsai simonoff_tsai S-T
Szroeter szroeter SZR
Verbyla verbyla VER
White white WHI
Wilcox-Keselman wilcox_keselman W-K
Yüce yuce YUC
Zhou, Song, and Thompson zhou_etal ZST
According to Farrar the best performers are the Evans and King test (best deflator-based test)
and the Verbyla test (best omnibus test) [33], considering sample sizes between and .
Nevertheless, all methods are considered again in the current comparative evaluation.
5
Due to high computational load, the Horn test, originally included in the skedastic package, was
replaced by the nonparametric Goldfeld-Quandt test.

using H-values
Hugo Hernandez
ForsChem Research
4.1. Normal and Uniform Data
The evaluation data is obtained by Monte Carlo simulation as follows:
 independent pairs of homoscedastic samples are randomly obtained for each

sample size considered (between and ). pairs correspond to a uniform
random response variable and a uniform random reference variable, and pairs
correspond to a standard normal random response variable and a uniform random
reference variable. No other types of distributions were considered at this stage6.
 All scedasticity tests are used to evaluate this homoscedastic data, and the -values
are recorded. This data will be used to assess type I error (false positives), as the
fraction of tests with .
 Then, the heteroscedastic models considered in Figure 7 are used to transform the
response variable, testing again for scedasticity using the same methods. The -
values obtained will be used to assess type II error (false negatives), as the fraction of
tests with .
 Two scenarios are considered for the significance level of the tests:
o Conventional constant significance level ( ).
o Optimal significance levels for scedasticity test ( ( ) ( ) ).
 The criterion for test performance is the total test error resulting from the addition of
type I and type II errors.
All test functions were executed using their default parameters values 7. The R code employed
for this evaluation is shown in Appendix 6.
The performance results obtained are graphically summarized in Figure 11 (for ) and
Figure 12 (for ( ) ( ) ).
Three different groups of tests can be observed:
 Tests with total error greater than for all sample sizes: Bickel test.
 Tests with total error between and about for all sample sizes: Carapeto-Holt,
Diblasi-Bowman, Evans-King, Goldfeld-Quandt (parametric and nonparametric),
Harrison-McCabe, Honda, Li-Yao, Račkauskas-Zuokas, Ramsey (Bartlett's MSET),
Szroeter, Wilcox-Keselman, and Yüce tests.
6
Only the most common types of distributions were considered (uniform and normal), also taking into
account that the behavior of the uniform distribution is not so different to that of the normal
distribution, particularly for small samples.
7
For the non-parametric Goldfeld-Quandt test, the method was changed from default (“parametric”) to
“nonparametric”. For the Monte Carlo methods of Dufour et al. and Godrey-Orme, the input argument
hettest was set to “breusch_pagan”.

using H-values
Hugo Hernandez
ForsChem Research
 Tests with total error below for large sample sizes: Anscombe, Breusch-Pagan,
Cook-Weisberg, Dufour-Khalaf-Bernard-Genest, Glejser, Godfrey-Orme, Harvey, H-value
(proposed in Section 3), Simonoff-Tsai, Verbyla, White, and Zhou-Song-Thompson tests.
Figure 11. Comparative total test error obtained for different heteroscedasticity tests and different
sample sizes obtained from normal or uniform distributions. Constant significance level: .
sample sizes obtained from normal or uniform distributions. Optimal significance level: ( )
( ) .

using H-values
Hugo Hernandez
ForsChem Research
Particularly the Bickel test resulted in total test error values greater than for all sample
sizes. Notice that the total test error, defined as the sum of type I and type II errors, can be
greater than , but not greater than . While the type II error for the Bickel test was
within typical levels, the type I error was almost for all sample sizes. For some unknown
reason, the Bickel test R function was not able to correctly identify homoscedastic data.
Clearly, the first two groups were unable to provide a satisfactory performance in the
evaluation of homoscedasticity or heteroscedasticity of normal and uniform data, even for
relatively large samples. The third group, on the contrary, not only had lower total error values,
but error also decreased with sample size, as expected from a statistical test. Despite the
similar results between the tests of the third group, the best overall performance was achieved
by the Zhou-Song-Thompson test [34], closely followed by the Simonoff-Tsai [35] and the
Verbyla tests [36].
We may also conclude that no great differences in tests performance were observed between
the constant and the optimal significance level. Nevertheless, both scenarios will be
considered in the next evaluation stage.
4.2. Non-Normal / Non-Uniform Data
This next stage consists in evaluating non-normal, non-uniform data. The tests under evaluation
will be limited to the scedasticity tests of the third group, which provided a satisfactory
performance for normal and uniform data. The evaluation procedure was similar to the one
used in the previous stage, but only considering independent sample pairs per distribution
model ( sample pairs in total), with sample sizes between and . The non-normal,
non-uniform distribution models considered for evaluating type I error were the following:
 Standard Exponential distribution (Eq. 3.18).

 Non-normal, symmetric distribution (Eq. 3.19).
 Non-normal, symmetric distribution:
(4.1)
 Log-normal distribution:
( )
(4.2)
All heteroscedastic models considered in Figure 7 were used again to transform the response
variable for evaluating type II error. The R code employed for this evaluation is shown in
Appendix 7.

using H-values
Hugo Hernandez
ForsChem Research
The results obtained are graphically summarized in Figure 13 (constant significance level,
) and Figure 14 (optimal significance level, ( ) ( ) ).
sample sizes obtained from non-normal, non-uniform distributions. Constant significance level: .
sample sizes obtained from non-normal, non-uniform distributions. Optimal significance level:
( ) ( ) .

using H-values
Hugo Hernandez
ForsChem Research
In general, we observe a deteriorated performance compared to the normal or uniform

distributions. We may also notice that larger sample sizes are required to significantly reduce
the tests error. For example, the Cook-Weisberg, Harvey, Simonoff-Tsai, Verbyla and White
tests were not able to show total errors below in samples up to .
For non-normal, non-uniform distributions, the best performance was again achieved by the
Zhou-Song-Thompson test [34], but his time followed by the Glejser test [21] and the H-value
test proposed in Section 3.
5. Conclusion
Evaluating the scedastic behavior of any arbitrary data set remains challenging. While different
tests provide a reliable evaluation of scedasticity in large samples obtained from normal
distributions, in the case of non-normal distributions and small samples, total test errors
beyond are commonly found.
In general, the best three methods found to provide the lowest error in the evaluation of
scedasticity for arbitrary data sets were:
1. Zhou-Song-Thompson test [34]
2. Glejser test [21]
3. H-value test (Section 3)
Of course, these conclusions are not absolute, as they were obtained from random samples of
selected probability distributions, and considering some arbitrary heteroscedasticity models.
Different conclusions might be obtained using different data sets.
The comparative analysis has shown that the H-value scedasticity test proposed in this report
provided a competitive performance for both normal and non-normal distributions, with the
advantage of providing decisions independently of some confidence criterion provided by the
analyst. Also, the test decision is easily interpreted as follows: If then the data is
homoscedastic. If then the data is heteroscedastic. If , the test is inconclusive, and
more data is needed.
The -value test can be used for evaluating the homoscedasticity assumption of residuals,
required in regression analysis. In this case, the optimal significance levels change (decrease) as
the risk of false positives is further reduced.
Also, the -value test can be used for evaluating heteroscedasticity, and in this case the
optimal significance levels increase, reducing the risk of false negatives.

using H-values
Hugo Hernandez
ForsChem Research
Acknowledgment and Disclaimer

This report provides data, information and conclusions obtained by the author(s) as a result of original
scientific research, based on the best scientific knowledge available to the author(s). The main purpose
of this publication is the open sharing of scientific knowledge. Any mistake, omission, error or inaccuracy
published, if any, is completely unintentional.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-
for-profit sectors.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC
4.0). Anyone is free to share (copy and redistribute the material in any medium or format) or adapt
(remix, transform, and build upon the material) this work under the following terms:
 Attribution: Appropriate credit must be given, providing a link to the license, and indicating if
changes were made. This can be done in any reasonable manner, but not in any way that
suggests endorsement by the licensor.
 NonCommercial: This material may not be used for commercial purposes.
References
[1] Pearson, K. (1905). Mathematical Contributions to the Theory of Evolution. XIV. On the General
Theory of Skew Correlation and Non-Linear Regression. Dulau and Co., London.
http://www.archive.org/details/cu31924003092917.
[2] Merriam-Webster. (n.d.). Homoscedasticity. In Merriam-Webster.com dictionary. Retrieved
November 14, 2023, from https://www.merriam-webster.com/dictionary/homoscedasticity.
[3] Frost, J. (2019). Regression Analysis. An Intuitive Guide for Using and Interpreting Linear Models.
Statistics by Jim Publishing. https://statisticsbyjim.com/regression/regression-analysis-intuitive-
guide/.
[4] Hernandez, H. (2023). Heteroscedastic Regression Models. ForsChem Research Reports, 8, 2023-
08, 1 - 29. doi: 10.13140/RG.2.2.31538.58562.
[5] Hernandez, H. (2023). Optimal Model Structure Identification. 1. Multiple Linear Regression.
ForsChem Research Reports, 8, 2023-13, 1 - 53. doi: 10.13140/RG.2.2.31051.57121.
[6] Hernandez, H. (2021). Optimal Significance Level and Sample Size in Hypothesis Testing. 6. Testing
Regression. ForsChem Research Reports, 6, 2021-11, 1-37. doi: 10.13140/RG.2.2.21739.46888.
[7] Hernandez, H. (2020). Formulation and Testing of Scientific Hypotheses in the presence of
Uncertainty. ForsChem Research Reports, 5, 2020-01, 1-16. doi: 10.13140/RG.2.2.36317.97767.
[8] Ali, M. M., & Giaccotto, C. (1984). A study of several new and existing tests for heteroscedasticity
in the general linear model. Journal of Econometrics, 26(3), 355-373. doi: 10.1016/0304-
4076(84)90026-5.
[9] Lyon, J. D., & Tsai, C. L. (1996). A comparison of tests for heteroscedasticity. Journal of the Royal
Statistical Society: Series D, 45(3), 337-349. http://www.jstor.com/stable/2988471.
[10] Walpole, R. E., Myers, R. H., Myers, S. L., & Ye, K. (2012). Probability & Statistics for Engineers &
Scientists. 9th Edition. Prentice Hall, Boston. Section 10.10 One- and Two-Sample Tests Concerning
Variances. pp. 366-370. ISBN 978-0-321-62911-1.
[11] Hernandez, H. (2021). Optimal Significance Level and Sample Size in Hypothesis Testing. 2. Tests of
Variances. ForsChem Research Reports, 6, 2021-07, 1-34. doi: 10.13140/RG.2.2.11266.20161.

using H-values
Hugo Hernandez
ForsChem Research
[12] Fisher, R. A. (1924). On a Distribution Yielding the Error Functions of Several Well Known Statistics.
Proceedings of the International Congress of Mathematics (Toronto), 2, 805-813.
https://hekyll.services.adelaide.edu.au/dspace/bitstream/2440/15183/1/36.pdf.
[13] Hosken, D. J., Buss, D. L., & Hodgson, D. J. (2018). Beware the F test (or, how to compare
variances). Animal Behaviour, 136, 119-126. doi: 10.1016/j.anbehav.2017.12.014.
[14] Hartley, H. O. (1950). The maximum F-ratio as a short-cut test for heterogeneity of variance.
Biometrika, 37 (3/4), 308-312. doi: 10.2307/2332383.
[15] Goldfeld, S. M., & Quandt, R. E. (1965). Some tests for homoscedasticity. Journal of the American
Statistical Association, 60 (310), 539-547. doi: 10.1080/01621459.1965.10480811.
[16] Levene, H. (1960). Robust Tests for Equality of Variances. In: Olkin, I., Ghurye, S. G., Hoeffding, W.,
Madow, W. G., & Mann, H. B. Contributions to Probability and Statistics. Essays in honor of Harold
Hotelling. Stanford University Press, Stanford, California. pp. 278-292.
https://archive.org/details/contributionstop0000unse_d2c5.
[17] Hernandez, H. (2021). Variance Decomposition in Unbalanced Data. ForsChem Research Reports,
6, 2021-01, 1-35. doi: 10.13140/RG.2.2.16789.35043.
[18] Brown, M. B., & Forsythe, A. B. (1974). Robust tests for the equality of variances. Journal of the
American Statistical Association, 69 (346), 364-367. doi: 10.1080/01621459.1974.10482955.
[19] Bartlett, M. S. (1937). Properties of sufficiency and statistical tests. Proceedings of the Royal
Society of London. Series A - Mathematical and Physical Sciences, 160 (901), 268-282. doi:
10.1007/978-1-4612-0919-5_8.
[20] Park, R. E. (1966). Estimation with heteroscedastic error terms. Econometrica, 34 (4), 888.
https://www.proquest.com/openview/765be3b63acac473ac71d548606dc880/1.
[21] Glejser, H. (1969). A new test for heteroskedasticity. Journal of the American Statistical
Association, 64 (325), 316-323. doi: 10.1080/01621459.1969.10500976.
[22] White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for
heteroskedasticity. Econometrica: Journal of the Econometric Society, 48 (4), 817-838.
https://www.jstor.org/stable/1912934.
[23] Cook, R. D., & Weisberg, S. (1983). Diagnostics for heteroscedasticity in regression. Biometrika,
70(1), 1-10. https://www.jstor.org/stable/2335938.
[24] Breusch, T. S., & Pagan, A. R. (1979). A simple test for heteroscedasticity and random coefficient
variation. Econometrica: Journal of the Econometric Society, 47 (5), 1287-1294.
https://www.jstor.org/stable/1911963.
[25] Hernandez, H. (2022). Normal Distribution and Transcendental Functions: Mathematical and
Historical Relations. ForsChem Research Reports, 7, 2022-04, 1 - 47. doi:
10.13140/RG.2.2.21535.23203.
[26] Hernandez, H. (2023). Replacing the R² Coefficient in Model Analysis. ForsChem Research Reports,
8, 2023-10, 1 - 40. doi: 10.13140/RG.2.2.26570.13769.
[27] Thomopoulos, N. T. (2013). Essentials of Monte Carlo simulation: Statistical Methods for Building
Simulation Models. Springer Science+Business Media, New York. doi: 10.1007/978-1-4614-6022-0.
[28] Hernandez, H. (2023). Representative Functions of the Standard Normal Distribution. ForsChem
Research Reports, 8, 2023-01, 1 - 29. doi: 10.13140/RG.2.2.29607.83362.
[29] Hernandez, H. (2020). Reconstructing Probability Distributions using Quantile-based Splines.
ForsChem Research Reports, 5, 2020-21, 1-23. doi: 10.13140/RG.2.2.14827.36645.
[30] Hernandez, H. (2017). Multivariate Probability Theory: Determination of Probability Density
Functions. ForsChem Research Reports, 2, 2017-13, 1-13. doi: 10.13140/RG.2.2.28214.60481.

using H-values
Hugo Hernandez
ForsChem Research
[31] Hernandez, H. (2021). Optimal Significance Level and Sample Size in Hypothesis Testing. 7.
Implementation Remarks. ForsChem Research Reports, 6, 2021-12, 1-27. doi:
10.13140/RG.2.2.23632.64000.
[32] Hernandez, H. (2021). Testing for Normality: What is the Best Method? ForsChem Research
Reports, 6, 2021-05, 1-38. doi: 10.13140/RG.2.2.13926.14406.
[33] Farrar, T. (2022). Handling Heteroskedasticity in the Linear Regression Model. Doctoral
Dissertation. University of the Western Cape, South Africa. http://hdl.handle.net/11394/9532.
[34] Zhou, Q. M., Song, P. X. K., & Thompson, M. E. (2015). Profiling Heteroscedasticity in Linear
Regression Models. Canadian Journal of Statistics, 43 (3), 358-377. doi: 10.1002/cjs.11252.
[35] Simonoff, J. S., & Tsai, C.-L. (1994). Use of modified profile likelihood for improved tests of
constancy of variance in regression. Journal of the Royal Statistical Society: Series C (Applied
Statistics), 43 (2), 357-370. doi: 10.2307/2986026.
[36] Verbyla, A. P. (1993). Modelling variance heterogeneity: residual maximum likelihood and
diagnostics. Journal of the Royal Statistical Society: Series B (Methodological), 55 (2), 493-508.
doi: 10.1111/j.2517-6161.1993.tb01918.x.
Appendix. Implementation in R language
A.1. Monte Carlo Method for the Determination of R2 distributions

N=100000
n=c(6:10,15,20,25,30,35,40,45,50,60,70,80,90,100,125,150,200,250,300,500,1000)
L=length(n)
alpha=c(0,1e-4,1e-3,(1:40)/200,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.95,0.99,0.999,0.9999,1)
qN=matrix(NA,nrow=length(alpha),ncol=L)
qU=matrix(NA,nrow=length(alpha),ncol=L)
qN5=matrix(NA,nrow=length(alpha),ncol=L)
qE=matrix(NA,nrow=length(alpha),ncol=L)
for (j in 1:L){
R2U=rep(NA,N)
R2N=rep(NA,N)
R2N5=rep(NA,N)
R2E=rep(NA,N)
for (i in 1:N){
yU=runif(n[j])
yN=qnorm(yU)
yN5=yN^5
yE=-log(1-yU)
x=runif(n[j])
x2=x^2
x3=x^3
x4=x^4
yU=abs(yU-mean(yU))
yN=abs(yN-mean(yN))
yN5=abs(yN5-mean(yN5))
yE=abs(yE-mean(yE))
R2U[i]=summary(lm(yU~x+x2+x3+x4))$r.squared
R2N[i]=summary(lm(yN~x+x2+x3+x4))$r.squared
R2N5[i]=summary(lm(yN5~x+x2+x3+x4))$r.squared
R2E[i]=summary(lm(yE~x+x2+x3+x4))$r.squared
}
qU[,j]=quantile(R2U,1-alpha)
qN[,j]=quantile(R2N,1-alpha)
qN5[,j]=quantile(R2N5,1-alpha)
qE[,j]=quantile(R2E,1-alpha)
}

using H-values
Hugo Hernandez
ForsChem Research
A.2. H-value Scedasticity Test: Basic Function
H.test<-function(y,x){
n=length(y)
if (n>5){
x=x[1:n]
x2=x^2
x3=x^3
x4=x^4
y=abs(y-mean(y))
R2=summary(lm(y~x+x2+x3+x4))$r.squared
R2m=3.35/n+2.38/n^2+57.9/n^3
S2=0.37*R2m+1.46*R2m^2-1.07*R2m^3
Z=(R2-R2m)/sqrt(S2*R2*(1-R2))
P=pnorm(Z,lower=FALSE)
} else {
P=1
}
return(P)
}
A.3. Monte Carlo Method for the Evaluation of Test Power
N=10000
n=c(6:10,15,20,25,30,35,40,45,50,60,70,80,90,100,125,150,200,250,300,500,1000)
alpha=0.05
L=length(n)
HPset=matrix(NA,nrow=32*N,ncol=L)
Tpower=rep(NA,L)
for (j in 1:L){
HP=rep(NA,32*N)
for (i in 1:N){
x=runif(n[j])
yU=runif(n[j])
yN=qnorm(yU)
for (k in 1:8){
if (k==1) hf=1+3*x
if (k==2) hf=1+10*(x-0.5)^2
if (k==3) hf=3.5-10*(x-0.5)^2
if (k==4) hf=1+3*sqrt(2*x)
if (k==5) hf=3/(1+3*x)
if (k==6) hf=1+5*log(1+x)
if (k==7) hf=2+sin(3*pi*x)
if (k==8) hf=1+10*dnorm(10*(x-0.5))
y1=hf*(yU-0.5)
y2=hf*yN
y3=hf*(-log(1-yU)-1)
y4=hf*yN^5
HP[(32*(i-1)+4*(k-1)+1):(32*(i-1)+4*(k-1)+4)]=
c(H.test(y1,x),H.test(y2,x),H.test(y3,x),H.test(y4,x))
}
}
HPset[,j]=HP
Tpower[j]=length(which(HP)<=alpha)/length(HP)
}

using H-values
Hugo Hernandez
ForsChem Research
A.4. Optimization of Significance Level
alphaopt=rep(NA,L)
errTot=rep(NA,L)
for (j in 1:L){
HP=HPset[,j]
save(HP,file="HPtemp.R")
errT<-function(alpha){
load("HPtemp.R")
beta=length(which(HP>alpha))/length(HP)
return(alpha+beta)
}
out=optim(0.05,errT,lower=0,upper=1,method="L-BFGS-B")
alphaopt[j]=out$par
errTot[j]=out$value
}
Notice that variables L and HPset were obtained in the previous code.
A.5. H-value Scedasticity Test: Complete Function
#This function evaluates the scedasticity of a variable y with respect to a set of reference
variables (x).
#Usage: H.sked.test(mainlm)
#Arguments:
#mainlm - Either an object of class "lm" (e.g., generated by lm), a data frame, or a list of
two objects: a response vector (y) and a matrix of reference values (x). These objects must
be given in that order.
#ttype - Type of test performed. Types available: General scedastic ("scedastic"), test for
homoscedasticity ("homoscedastic"), and test for heteroscedasticity ("heteroscedastic").
#Output: Data frame containing the reference variables with their corresponding R2 statistic
value, R2 critical value, P value, H value and test decision.
H.sked.test<-
function(mainlm,ttype=c("scedastic","homoscedastic","heteroscedastic"),display=TRUE){
ttype=ttype[1]
if (display==TRUE){
main="H-value Test of Scedasticity"
if (ttype=="homoscedastic") main="H-value Test of Homoscedasticity"
if (ttype=="heteroscedastic") main="H-value Test of Heteroscedasticity"
cat(main,"\n")
cat("\n")
}
H.test<-function(y,x,ttype=ttype){
n=length(y)
alpha=min(1,max(0,40*(1-exp(-n/30))*n^(-1.4)))
if (ttype=="homoscedastic") alpha=alpha*0.045
if (ttype=="heteroscedastic") alpha=alpha/0.045
if (n>5){
x=x[1:n]
x2=x^2
x3=x^3
x4=x^4
y=abs(y-mean(y))
R2=summary(lm(y~x+x2+x3+x4))$r.squared
R2m=3.35/n+2.38/n^2+57.9/n^3
S2=0.37*R2m+1.46*R2m^2-1.07*R2m^3
Z=(R2-R2m)/sqrt(S2*R2*(1-R2))
P=pnorm(Z,lower=FALSE)
H=log(P/alpha)
} else {

using H-values
Hugo Hernandez
ForsChem Research
P=1
R2=1
H=0
}
if (min(H)<0){
decision="Heteroscedastic"
} else if (min(H)>0) {
decision="Homoscedastic"
} else {
decision="Inconclusive"
}
return(list(p.value=P,r.squared=R2,H.value=H,decision=decision))
}
if (class(mainlm)=="lm"){
y=mainlm$residuals
x=mainlm$model[-1]
if (ncol(x)>1){
x=cbind(mainlm$fitted.values,x)
colnames(x)[1]="ypred"
}
} else if (is.list(mainlm)){
y=mainlm[[1]]
x=mainlm[[2]]
} else if (is.data.frame(mainlm)){
y=mainlm[,1]
x=mainlm[-1]
} else {
stop("Error: Unrecognizable data format")
}
N=ncol(x)
if (is.null(N)) N=1
n=nrow(y)
if (is.null(n)) n=length(y)
if (n<15) warning("The sample size is too small (n<15). Conclusions may be
unreliable",call.=FALSE)
R2cr=min(1,max(0,3.4*n^(-0.84)))
if (ttype=="homoscedastic") R2cr=min(1,max(0,6.4*(1-exp(-n/5.3))*n^(-0.81)))
if (ttype=="heteroscedastic") R2cr=min(1,max(0,(1-exp(-n/58))*n^(-0.62)))
if (N==1){
if (is.data.frame(x)){
name=colnames(x)[1]
x=x[,1]
} else {
name="x"
}
test=H.test(y,x,ttype=ttype)
P=test$p.value
R2=test$r.squared
H=test$H.value
decision=test$decision
} else {
name=colnames(x)
P=rep(NA,N)
R2=rep(NA,N)
R2cr=rep(R2cr,N)
H=rep(NA,N)
decision=rep(NA,N)
for (i in 1:N){
test=H.test(y,x[,i],ttype=ttype)
P[i]=test$p.value
R2[i]=test$r.squared
H[i]=test$H.value
decision[i]=test$decision
}
}
out=data.frame(name,R2,R2cr,P,H,decision)
names(out)=c("ref.var","statistic","crit.value","p.value","H.value","decision")

using H-values
Hugo Hernandez
ForsChem Research
if (display==TRUE){
#Decision
if (min(H)>0){
cat("The data is Homoscedastic","\n")
} else if (min(H)<0){
cat("The data is Heteroscedastic","\n")
} else {
cat("The test is inconclusive","\n")
}
cat("\n")
}
return(out)
}
A.6. Comparative Evaluation: First Stage
library(skedastic)
N=50
n=c(10,15,20,25,30,35,40,45,50,60,70,80,90,100,125)
L=length(n)
HomP=matrix(NA,nrow=2*L*N,ncol=26)
HetP=matrix(NA,nrow=16*L*N,ncol=26)
counter1=0
counter2=0
for (j in 5:L){
for (i in 1:N){
set.seed(NULL)
x=runif(n[j])
yU=runif(n[j])
yN=qnorm(yU)
for (l in 1:2){
print(paste("l =",l))
print(paste("k =",0))
if (l==1) y=(yU-0.5)
if (l==2) y=yN
counter1=counter1+1
HomP[counter1,1]=H.test(y,x)
HomP[counter1,2]=try(suppressMessages(anscombe(list(y,x))$p.value))
HomP[counter1,3]=try(suppressMessages(bamset(list(y,x))$p.value))
HomP[counter1,4]=try(suppressMessages(bickel(list(y,x))$p.value))
HomP[counter1,5]=try(suppressMessages(breusch_pagan(list(y,x))$p.value))
HomP[counter1,6]=try(suppressMessages(carapeto_holt(list(y,x))$p.value))
HomP[counter1,7]=try(suppressMessages(cook_weisberg(list(y,x))$p.value))
HomP[counter1,8]=try(suppressMessages(diblasi_bowman(list(y,x))$p.value))
HomP[counter1,9]=try(suppressMessages(dufour_etal(list(y,x),
hettest="breusch_pagan")$p.value))
HomP[counter1,10]=try(suppressMessages(evans_king(list(y,x))$p.value))
HomP[counter1,11]=try(suppressMessages(glejser(list(y,x))$p.value))
HomP[counter1,12]=try(suppressMessages(godfrey_orme(list(y,x),
HomP[counter1,13]=try(suppressMessages(goldfeld_quandt(list(y,x))$p.value))
HomP[counter1,14]=try(suppressMessages(goldfeld_quandt(list(y,x),
method="nonparametric")$p.value))
HomP[counter1,15]=try(suppressMessages(harrison_mccabe(list(y,x))$p.value))
HomP[counter1,16]=try(suppressMessages(harvey(list(y,x))$p.value))
HomP[counter1,17]=try(suppressMessages(honda(list(y,x))$p.value))
HomP[counter1,18]=try(suppressMessages(li_yao(list(y,x))$p.value))
HomP[counter1,19]=try(suppressMessages(rackauskas_zuokas(list(y,x))$p.value))
HomP[counter1,20]=try(suppressMessages(simonoff_tsai(list(y,x))$p.value))
HomP[counter1,21]=try(suppressMessages(szroeter(list(y,x))$p.value))
HomP[counter1,22]=try(suppressMessages(verbyla(list(y,x))$p.value))
HomP[counter1,23]=try(suppressMessages(white(list(y,x))$p.value))
HomP[counter1,24]=try(suppressMessages(wilcox_keselman(list(y,x))$p.value))
HomP[counter1,25]=try(suppressMessages(yuce(list(y,x))$p.value))
HomP[counter1,26]=try(suppressMessages(zhou_etal(list(y,x))$p.value))

using H-values
Hugo Hernandez
ForsChem Research
for (k in 1:8){
if (k==1) hf=1+3*x
if (k==2) hf=1+10*(x-0.5)^2
if (k==3) hf=3.5-10*(x-0.5)^2
if (k==4) hf=1+3*sqrt(2*x)
if (k==5) hf=3/(1+3*x)
if (k==6) hf=1+5*log(1+x)
if (k==8) hf=1+10*dnorm(10*(x-0.5))
counter2=counter2+1
y=hf*y
HetP[counter2,1]=H.test(y,x)
HetP[counter2,2]=try(suppressMessages(anscombe(list(y,x))$p.value))
HetP[counter2,3]=try(suppressMessages(bamset(list(y,x))$p.value))
HetP[counter2,4]=try(suppressMessages(bickel(list(y,x))$p.value))
HetP[counter2,5]=try(suppressMessages(breusch_pagan(list(y,x))$p.value))
HetP[counter2,6]=try(suppressMessages(carapeto_holt(list(y,x))$p.value))
HetP[counter2,7]=try(suppressMessages(cook_weisberg(list(y,x))$p.value))
HetP[counter2,8]=try(suppressMessages(diblasi_bowman(list(y,x))$p.value))
HetP[counter2,9]=try(suppressMessages(dufour_etal(list(y,x),
HetP[counter2,10]=try(suppressMessages(evans_king(list(y,x))$p.value))
HetP[counter2,11]=try(suppressMessages(glejser(list(y,x))$p.value))
HetP[counter2,12]=try(suppressMessages(godfrey_orme(list(y,x),
HetP[counter2,13]=try(suppressMessages(goldfeld_quandt(list(y,x))$p.value))
HetP[counter2,14]=try(suppressMessages(goldfeld_quandt(list(y,x),
method="nonparametric")$p.value))
HetP[counter2,15]=try(suppressMessages(harrison_mccabe(list(y,x))$p.value))
HetP[counter2,16]=try(suppressMessages(harvey(list(y,x))$p.value))
HetP[counter2,17]=try(suppressMessages(honda(list(y,x))$p.value))
HetP[counter2,18]=try(suppressMessages(li_yao(list(y,x))$p.value))
HetP[counter2,19]=try(suppressMessages(rackauskas_zuokas(list(y,x))$p.value))
HetP[counter2,20]=try(suppressMessages(simonoff_tsai(list(y,x))$p.value))
HetP[counter2,21]=try(suppressMessages(szroeter(list(y,x))$p.value))
HetP[counter2,22]=try(suppressMessages(verbyla(list(y,x))$p.value))
HetP[counter2,23]=try(suppressMessages(white(list(y,x))$p.value))
HetP[counter2,24]=try(suppressMessages(wilcox_keselman(list(y,x))$p.value))
HetP[counter2,25]=try(suppressMessages(yuce(list(y,x))$p.value))
HetP[counter2,26]=try(suppressMessages(zhou_etal(list(y,x))$p.value))
}
}
save(counter1,counter2,HomP,HetP,file="HComp.R")
}
}
A.7. Comparative Evaluation: Second Stage
library(skedastic)
N=25
n=c(15,20,25,30,35,40,45,50,60,70,80,90,100,125,150,200,250,300,500,1000)
L=length(n)
HomP=matrix(NA,nrow=4*L*N,ncol=12)
HetP=matrix(NA,nrow=32*L*N,ncol=12)
counter1=0
counter2=0
for (j in 3:L){
for (i in 1:N){
set.seed(NULL)
x=runif(n[j])
yU=runif(n[j])
yN=qnorm(yU)
for (l in 1:4){
if (l==1) y=-log(1-yU)-1
if (l==2) y=yN^5

using H-values
Hugo Hernandez
ForsChem Research
if (l==3) y=exp(yN)-1
if (l==4) y=yN^7
counter1=counter1+1
HomP[counter1,1]=H.test(y,x)
HomP[counter1,2]=try(suppressMessages(anscombe(list(y,x))$p.value))
HomP[counter1,3]=try(suppressMessages(breusch_pagan(list(y,x))$p.value))
HomP[counter1,4]=try(suppressMessages(cook_weisberg(list(y,x))$p.value))
HomP[counter1,5]=try(suppressMessages(dufour_etal(list(y,x),
HomP[counter1,6]=try(suppressMessages(glejser(list(y,x))$p.value))
HomP[counter1,7]=try(suppressMessages(godfrey_orme(list(y,x),
HomP[counter1,8]=try(suppressMessages(harvey(list(y,x))$p.value))
HomP[counter1,9]=try(suppressMessages(simonoff_tsai(list(y,x))$p.value))
HomP[counter1,10]=try(suppressMessages(verbyla(list(y,x))$p.value))
HomP[counter1,11]=try(suppressMessages(white(list(y,x))$p.value))
HomP[counter1,12]=try(suppressMessages(zhou_etal(list(y,x))$p.value))
for (k in 1:8){
if (k==1) hf=1+3*x
if (k==2) hf=1+10*(x-0.5)^2
if (k==3) hf=3.5-10*(x-0.5)^2
if (k==4) hf=1+3*sqrt(2*x)
if (k==5) hf=3/(1+3*x)
if (k==6) hf=1+5*log(1+x)
if (k==8) hf=1+10*dnorm(10*(x-0.5))
counter2=counter2+1
y=hf*y
HetP[counter2,1]=H.test(y,x)
HetP[counter2,2]=try(suppressMessages(anscombe(list(y,x))$p.value))
HetP[counter2,3]=try(suppressMessages(breusch_pagan(list(y,x))$p.value))
HetP[counter2,4]=try(suppressMessages(cook_weisberg(list(y,x))$p.value))
HetP[counter2,5]=try(suppressMessages(dufour_etal(list(y,x),
HetP[counter2,6]=try(suppressMessages(glejser(list(y,x))$p.value))
HetP[counter2,7]=try(suppressMessages(godfrey_orme(list(y,x),
HetP[counter2,8]=try(suppressMessages(harvey(list(y,x))$p.value))
HetP[counter2,9]=try(suppressMessages(simonoff_tsai(list(y,x))$p.value))
HetP[counter2,10]=try(suppressMessages(verbyla(list(y,x))$p.value))
HetP[counter2,11]=try(suppressMessages(white(list(y,x))$p.value))
HetP[counter2,12]=try(suppressMessages(zhou_etal(list(y,x))$p.value))
}
}
save(counter1,counter2,HomP,HetP,file="HComp2.R")
}
}


Evaluating Scedasticity Using H-Values

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Evaluating Scedasticity Using H-Values

Uploaded by

Copyright:

Available Formats

Vol.

Evaluating Scedasticity using H-values

The term homoscedasticity1 was introduced by Karl Pearson in . Pearson defined a

In particular, the homoscedasticity of model residuals is an important assumption in most types

On the other hand, if

There are, however, two difficulties with this approach:

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

2. Tests for Scedasticity

2.1. Testing Data Classified in Multiple Groups

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

Then, the following statistic is obtained:

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

2.2. Testing Data within a Single Group

Eq. (2.8) can be alternatively expressed in terms of standard deviation ( ) as follows:

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

where is a random residual error.

However, notice that ( ) is a semi-bounded positive distribution [ ), whereas is

Alternatively, Cook and Weisberg [23] considered the multivariate function:

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

The corresponding statistic ( ) behaves as a distribution with degrees of freedom:

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

A simple, approximated empirical expression for calculating the coefficient of variation of

The system can be considered heteroscedastic if ( ) ( ). Of course, this is not a

3. H-value Test for Scedasticity

It is highly desirable that a method for testing scedasticity:

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

3.1. Rationale of the Method

Now, the condition for heteroscedasticity of with respect to in the range [ ]

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

3.2. R2 Distribution and P-value Calculation

 Different sample sizes are considered between and .

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

From these results we may conclude the following:

 Sample size has a strong effect on the distribution.

Figure 2 illustrates the behavior of selected percentiles as functions of sample size.

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

where ( ) is a sample size-dependent dispersion parameter. The values of the dispersion

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

where ( ) is approximately given by Eq. (3.24) and ( ) is approximately given by Eq.

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

The implementation of the proposed method for testing homoscedasticity in R language is

3.3. Power Assessment

 For each sample size, independent pairs of samples ( ) following a standard

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

The graphical behavior of these models is depicted in Figure 7.

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

3.4. Optimal Significance Level and H-value

The optimization procedure implemented in R is shown in Appendix 4. A gradient-based

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

Figure 8. Effect of sample size on homoscedasticity test errors

In these results we observe the following behavior:

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

If then the data is homoscedastic. If then the data is heteroscedastic. If , the

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

3.5. Testing for Homoscedasticity vs. Testing for Heteroscedasticity

However, we can also be particularly interested in testing either for homoscedasticity or

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16

Thus, let us define the following weighted test errors:

30/11/2023 ForsChem Research Reports Vol. 8, 2023-16