
Task 3

Binomial distribution
Introduction
The binomial distribution is a probability distribution that summarizes the
likelihood that a variable takes one of two possible values under a given set of
parameters or assumptions. Its assumptions are that each trial has only two possible
outcomes, that the two outcomes are mutually exclusive, that the probability of
success is the same on every trial, and that the trials are independent of one
another. In statistics, the binomial distribution is one of the most common discrete
distributions, in contrast to continuous distributions such as the normal
distribution. It differs from the normal distribution because each trial has only two
states, coded as 1 (for success) or 0 (for failure). The binomial distribution gives
the probability of observing x successes in n trials, given a success probability p
on each trial (Barone, 2021).
Symmetry Property of Binomial distribution
According to Stattrek (2022), when the binomial distribution is expressed as a
proportion of successes, its mean is $p$ and its standard deviation is
$\sqrt{p(1-p)/n}$. The binomial distribution is symmetrical when p = 0.5, and it
becomes approximately symmetrical when n is large.
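The symmetry property can be checked numerically. The following is a minimal sketch in Python, not part of the cited article; the function names `binomial_pmf` and `is_symmetric` and the values n = 10, p = 0.5, and p = 0.2 are chosen here only for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def is_symmetric(n: int, p: float, tol: float = 1e-12) -> bool:
    """True when P(X = k) equals P(X = n - k) for every k from 0 to n."""
    return all(
        abs(binomial_pmf(k, n, p) - binomial_pmf(n - k, n, p)) <= tol
        for k in range(n + 1)
    )


# Illustrative values (assumptions, not taken from the source).
print(is_symmetric(10, 0.5))  # p = 0.5: mirror-image probabilities match
print(is_symmetric(10, 0.2))  # p = 0.2: the distribution is skewed
```

With p = 0.5 every probability equals its mirror image, while with p = 0.2 and a small n the two tails differ noticeably.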
Finding the Mean, the Variance, and the Standard Deviation of this Distribution
For a binomial distribution with n trials and success probability P, the mean is
$\mu_x = nP$, the variance is $\sigma_x^2 = nP(1-P)$, and the standard deviation is
$\sigma_x = \sqrt{nP(1-P)}$ (StatsDirect, 2021).
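These three formulas translate directly into code. The sketch below is illustrative only: the helper name `binomial_moments` is hypothetical, and n = 50 with P = 0.5 is used as an example of a fair coin tossed 50 times.

```python
from math import sqrt


def binomial_moments(n: int, p: float) -> tuple[float, float, float]:
    """Return (mean, variance, standard deviation) of a binomial count."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)


# Example values: 50 tosses of a fair coin.
mean, variance, sd = binomial_moments(50, 0.5)
print(f"mean={mean}, variance={variance}, sd={sd:.4f}")
```

For these example values the mean is 25 heads, the variance is 12.5, and the standard deviation is about 3.54.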
Simple Example of Binomial Distribution
According to the Corporate Finance Institute (2020), a binomial distribution
must meet several criteria. The first is a fixed number of trials: the number of
trials is set before the experiment begins. If a coin is flipped ten times, each flip
of the coin is one trial. The second is independent trials, such as tossing a coin or
rolling a die, where the outcome of one trial does not affect the outcome of
subsequent trials. The third is a fixed probability of success: when a person tosses
a fair coin, the probability of a head is 0.5 on every toss, so if 50 trials are
conducted, the expected number of heads is 50 x 0.5 = 25.

Exercise 66
Introduction
Many firms use acceptance sampling as a quality control method to monitor
incoming shipments of parts, raw materials, and other items. In the electronics
industry, suppliers frequently ship parts in large lots. Inspecting a sample of n
components can be viewed as the n trials of a binomial experiment. The outcome for
each component tested (trial) is a determination of whether the component is good or
defective. Reynolds Electronics accepts a lot from a specific supplier if the
defective components in the lot do not exceed 1%. Suppose a random sample of five
items from a recent shipment is tested.
Problem
a. Assume that 1% of the shipment is defective. Compute the probability that no
items in the sample are defective.
(Probability of Defective) P(D) = 1% = 0.01
(Probability of Non-Defective) P(D') = 1 - 0.01 = 0.99
$P(x = 0) = \binom{5}{0}(0.01)^0(0.99)^5$
(0 defective)

$P(x = 0) \approx 0.951 = 95.1\%$


b. Assume that 1% of the shipment is defective. Compute the probability that
exactly one item in the sample is defective.
(Probability of Defective) P(D) = 1% = 0.01
(Probability of Non-Defective) P(D') = 1 - 0.01 = 0.99
$P(x = 1) = \binom{5}{1}(0.01)^1(0.99)^4$
(1 defective)

$P(x = 1) \approx 0.048 = 4.8\%$


c. What is the probability of observing one or more defective items in the sample if
1% of the shipment is defective?
(Probability of Defective) P(D) = 1% = 0.01
(Probability of Non-Defective) P(D') = 1 - 0.01 = 0.99
$P(x \ge 1) = 1 - P(x = 0) = 1 - \binom{5}{0}(0.01)^0(0.99)^5$
(one or more defective is the complement of 0 defective)

$P(x \ge 1) \approx 1 - 0.951 = 0.049 = 4.9\%$
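The probabilities asked for in parts a to c can be checked with a short script. This is a sketch under the problem's stated assumptions (a sample of n = 5 items and a defect rate of p = 0.01); the function and variable names are illustrative.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k defective items in a sample of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 5, 0.01  # sample size and assumed defect rate from the exercise

p_none = binomial_pmf(0, n, p)         # part a: no defective items
p_exactly_one = binomial_pmf(1, n, p)  # part b: exactly one defective item
p_one_or_more = 1 - p_none             # part c: complement of "none"

print(f"P(x = 0)  = {p_none:.4f}")
print(f"P(x = 1)  = {p_exactly_one:.4f}")
print(f"P(x >= 1) = {p_one_or_more:.4f}")
```

Part c uses the complement rule because "one or more defective" covers every outcome except "no defective items". A probability above 1 can never be a valid answer, which makes the complement a useful sanity check.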
d. Would you feel comfortable accepting the shipment if one item was found to be
defective? Why or why not?
No. If only 1% of the shipment were defective, the chance of finding one or
more defective items in a sample of five would be small, so a defective item in the
sample suggests that the defect rate of the lot may be higher than 1%. Beyond the
statistics, a buyer who pays for a product also pays for its packaging and protective
covering, such as bubble wrap, during shipment and delivery, and it is the
responsibility of the seller to check that each product works before packaging,
shipment, and delivery. Returning a defective item is also a hassle for the
recipient, who must find time to ship it back to the seller and may end up paying
more than the product was worth.

Sign Nonparametric Test


Introduction
The sign test is a non-parametric test used to determine whether the median of
a population equals a hypothesized value, or whether two paired sets of data differ
consistently in one direction. It is employed with dependent samples that are ordered
in pairs, where the pairs of bivariate random variables are mutually independent. The
sign test is based on the direction of the plus and minus signs rather than the
numerical magnitude of the observations. It is also known as the binomial sign test
with p = 0.5. Statisticians consider the sign test weaker than other tests because it
records only whether each value falls below or above the median and ignores the size
of the difference (Statistics Solutions, 2021).
The objective of the Sign Nonparametric Test and Forming of Hypothesis
In the first type of sign test, the one-sample test, the hypothesis is formed
from the sample data: each observation is marked + or - according to whether it lies
above or below the hypothesized median. The second type, the paired-sample sign test,
is an alternative to the paired t-test and uses the + and - signs of the differences
within each pair, as in a before-and-after study. In both cases the null hypothesis
states that the + and - signs occur in equal proportion, which is equivalent to the
population median being equal to the hypothesized value (Statistics Solutions, 2021).
Testing of the Hypothesis using Binomial Distribution for both Non-Directional
and Directional Cases
Zach (2021) explains that a directional hypothesis is an alternative
hypothesis that contains a less than sign ("<") or a greater than sign (">"). A
non-directional hypothesis is an alternative hypothesis that contains the not equal
sign ("≠").
1. A survey of 40 home prices in a metropolitan area has the following results:
Price              Less than $200,000   Equal to $200,000   More than $200,000
Number of Homes    13                   1                   27

a. Test the hypothesis that the median price in the metropolitan area is $200,000.
Homes priced below $200,000 (minus signs) = 13
Homes priced at $200,000 (ties) = 1
Homes priced above $200,000 (plus signs) = 27

The values above are counts of homes in each price category, not means. In
the sign test the tied observation is dropped, which leaves n = 13 + 27 = 40
signs.
a. Ho = (the median price in the metropolitan area is $200,000)
b. Ha = (the median price in the metropolitan area is not $200,000)
Test Statistic: number of plus signs, x = 27
Under Ho, plus and minus signs are equally likely, so x follows a binomial
distribution with n = 40 and p = 0.5, and the expected number of plus signs is
40 x 0.5 = 20. Because Ha is non-directional, the test is two-tailed.
Conclusion
The probability of observing 27 or more plus signs under Ho is
$P(x \ge 27) \approx 0.0192$, so the two-tailed p-value is
$2 \times 0.0192 \approx 0.0385$. Since 0.0385 is less than α = 0.05, Ho is rejected:
the sample gives evidence that the median price in the metropolitan area is not
$200,000.

b. Test the hypothesis that the median price in the metropolitan area is more than
$200,000
Homes priced below $200,000 (minus signs) = 13
Homes priced at $200,000 (ties) = 1
Homes priced above $200,000 (plus signs) = 27

The values above are counts of homes in each price category, not means. In
the sign test the tied observation is dropped, which leaves n = 13 + 27 = 40
signs.
a. Ho = (the median price in the metropolitan area is $200,000 or less)
b. Ha = (the median price in the metropolitan area is more than $200,000)
Test Statistic: number of plus signs, x = 27
Under Ho, x follows a binomial distribution with n = 40 and p = 0.5. Because Ha
is directional (">"), the test is one-tailed and uses only the upper tail.
Conclusion
The one-tailed p-value is the probability of observing 27 or more plus signs
under Ho, $P(x \ge 27) \approx 0.0192$. Since 0.0192 is less than α = 0.05, Ho is
rejected, and it can be concluded that the median price in the metropolitan area is
more than $200,000.

α = 0.05 in both cases. Include hypothesis formulation, use the binomial
approach, and explain your calculations in detail. Include p-values in both cases.
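The binomial approach to the sign test can be sketched in a few lines of Python. The counts are taken from the table as printed. The sketch assumes the usual sign-test convention of dropping the tied observation, and the helper name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(x: int, n: int, p: float = 0.5) -> float:
    """P(X >= x) for a binomial count with n trials and success probability p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


below, ties, above = 13, 1, 27  # counts from the home price table
n = below + above               # assumption: the tie is dropped

# Directional case (Ha: median > $200,000): upper tail only.
p_one_tailed = binomial_upper_tail(above, n)

# Non-directional case (Ha: median != $200,000): double the smaller tail.
p_two_tailed = min(1.0, 2 * p_one_tailed)

alpha = 0.05
print(f"one-tailed p = {p_one_tailed:.4f}, reject Ho: {p_one_tailed < alpha}")
print(f"two-tailed p = {p_two_tailed:.4f}, reject Ho: {p_two_tailed < alpha}")
```

Doubling the upper tail is valid here because the binomial distribution with p = 0.5 is symmetric, so the lower tail of 13 or fewer plus signs has the same probability as the upper tail of 27 or more.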
Regression Analysis 
Introduction

This graph depicts the historical performance of the S&P 500, an index
maintained by S&P Dow Jones Indices, a division of S&P Global, from the year 2012
through the most recent data reported this year, 2022. The S&P 500 Index, or
Standard & Poor's 500 Index, is a capitalization-weighted index of 500 leading
publicly traded companies in the United States. Because the index applies additional
inclusion criteria, it is not an exact list of the 500 largest U.S. companies by
market capitalization. Nonetheless, the S&P 500 index is widely regarded as one of
the most reliable gauges of the performance of large American stocks, and
consequently of the stock market in general. S&P considers only free-floating
shares, those that can be traded by the general public, when calculating market
capitalization, and it adjusts each company's market cap to account for new share
offerings or mergers. The value of the index is calculated by summing the adjusted
market capitalizations of the firms and dividing the total by a divisor. The S&P
divisor is proprietary and is not available to the public (Kenton, 2022).

This table shows the changes and consistencies in the yearly performance of
the S&P. The total return is a measure of all profits from an investment that takes
into account both price appreciation and income over a specific period, usually a
year (Lavrakas, 2021). The total return, sometimes called the total rate of return,
is usually expressed as a percentage, which makes it possible to compare the
performance of different assets. In its most basic form, a return, also known as a
financial return, is the amount of money gained or lost on an investment over time
(Investopedia, 2021). A return can be calculated from the change in the monetary
value of an investment over time, or expressed as a percentage using the ratio of
profit to investment. The table also includes the Net Total Return, which Law Insider
(n.d.) defines as the performance of an index's underlying portfolio with dividends,
interest, and other income reinvested after any tax has been deducted.
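The difference between a price return and a total return can be illustrated with a small calculation. This is a minimal sketch: the function names and the dollar amounts are hypothetical and are not taken from the S&P data.

```python
def price_return(start_value: float, end_value: float) -> float:
    """Percentage return from price appreciation alone."""
    return 100.0 * (end_value - start_value) / start_value


def total_return(start_value: float, end_value: float, income: float) -> float:
    """Percentage return including income (e.g. dividends) over the period."""
    return 100.0 * (end_value - start_value + income) / start_value


# Hypothetical investment: bought at 100, worth 110 a year later, paid 2 in dividends.
print(price_return(100.0, 110.0))       # price appreciation only
print(total_return(100.0, 110.0, 2.0))  # appreciation plus income
```

In this hypothetical case the price return is 10% and the total return is 12%, because the dividend income is added to the gain before dividing by the initial investment.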
[Figure: line chart of Total Return, Price Return, and Net Total Return by year;
horizontal axis 2010 to 2022, vertical axis -20 to 40]

The scatter plot shows the relationships among the yearly performance
figures collected from 2012 to 2020.
[Figure: scatter plot of Total Return, Price Return, and Net Total Return by year;
horizontal axis 2010 to 2022, vertical axis -20 to 40]

This type of linear graph emphasizes the increases and decreases in the
plotted data. For the regression analysis, the alternative hypothesis is that there
is a significant change in the collected yearly data of S&P, and the null hypothesis
is that there is no significant change in the collected yearly data of S&P.
The regression analysis was computed in Google Sheets with add-ons for
statistical computation. Below are summaries of the computed regression analysis,
scatterplot, residuals, and normality of residuals.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9999999449
R Square             0.9999998898
Adjusted R Square    0.9999998531
Standard Error       0.006577924769
Observations         9

ANOVA
             df   SS                MS                 F             Significance F
Regression   2    2356.232629       1178.116315        27227662.94   0
Residual     6    0.0002596145656   0.00004326909427
Total        8    2356.232889

             Coefficients     Standard Error   t Stat         P-value               Lower 95%        Upper 95%
Intercept    -0.02779486104   0.01873830464    -1.483317812   0.1885145325          -0.07364584065   0.01805611858
32.55        -0.4492053359    0.01287646466    -34.88576622   0.00000003696640574   -0.4807129098    -0.417697762
34.29        1.44923675       0.01272170934    113.9183982    0                     1.418107849      1.480365651

RESIDUAL OUTPUT                                     PROBABILITY OUTPUT
Observation   Predicted 35.04   Residuals           Percentile    35.04
1             -1.693205919      0.003205919328      5.555555556   -13.03
2             32.1285492        -0.008549199257     16.66666667   -1.69
3             -13.03163345      0.001633445999      27.77777778   -1.53
4             22.17668949       0.003310507481      38.88888889   15.2
5             22.79392581       0.006074189344      50            22.18
6             -1.522449835      -0.007550164976     61.11111111   22.8
7             15.20299469       -0.002994688966     72.22222222   28.82
8             35.63245852       -0.002458515431     83.33333333   32.12
9             28.81267149       0.007328506386      94.44444444   35.63
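The figures in the ANOVA table are linked by simple arithmetic, which gives a quick way to check that the output is internally consistent. The sketch below uses the sums of squares and degrees of freedom reported in the summary output. The variable names are illustrative, and the formulas are the standard ones for a regression with an intercept.

```python
# Values copied from the summary output.
ss_regression = 2356.232629
ss_residual = 0.0002596145656
df_regression, df_residual = 2, 6
n_observations = 9

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F compares explained to unexplained variation.
f_statistic = ms_regression / ms_residual

# R Square is the explained share of the total sum of squares.
r_squared = ss_regression / (ss_regression + ss_residual)
adjusted_r_squared = 1 - (1 - r_squared) * (n_observations - 1) / df_residual

print(f"MS regression = {ms_regression:.6f}")
print(f"MS residual   = {ms_residual:.14f}")
print(f"F             = {f_statistic:.2f}")
print(f"R Square      = {r_squared:.10f}")
print(f"Adj. R Square = {adjusted_r_squared:.10f}")
```

The recomputed values agree with the reported MS, F, R Square, and Adjusted R Square up to rounding of the inputs.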


The output above presents the main results of the regression analysis: the
regression statistics, the ANOVA table, the coefficients with their standard errors,
t statistics, and P-values, the residual output, and the probability output. The
Significance F value and the P-values of the two slope coefficients are below the
standard alpha of 0.05, so the null hypothesis of no significant change is rejected.
This supports the hypothesis that there is a significant change in the collected
yearly data of S&P, that is, that the data is changing over time (Analytics Vidhya,
2021). Only the intercept, with a P-value of 0.1885, is not significant at this
level.

The residual plot is a graph with the residuals on the vertical axis and the
independent variable on the horizontal axis. If the dots in a residual plot are randomly
distributed across the horizontal axis, a linear regression model is appropriate for this
study; otherwise, a nonlinear model appears to be more appropriate (Stattrek, n.d.).
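Before plotting, the residuals themselves can be given two quick numerical checks. The sketch below uses the nine residuals listed in the residual output. It assumes a model with an intercept and two predictors, so the residual degrees of freedom are 9 - 3 = 6, as in the ANOVA table.

```python
from math import sqrt

# Residuals copied from the residual output, observations 1 to 9.
residuals = [
    0.003205919328, -0.008549199257, 0.001633445999,
    0.003310507481, 0.006074189344, -0.007550164976,
    -0.002994688966, -0.002458515431, 0.007328506386,
]

n_parameters = 3  # assumption: intercept plus two predictors
df_residual = len(residuals) - n_parameters

# With an intercept in the model, least-squares residuals sum to about zero.
residual_sum = sum(residuals)

# The sum of squared residuals is the Residual SS in the ANOVA table.
sse = sum(r * r for r in residuals)

# The regression standard error is the square root of the residual mean square.
standard_error = sqrt(sse / df_residual)

print(f"sum of residuals = {residual_sum:.2e}")
print(f"SSE              = {sse:.13f}")
print(f"standard error   = {standard_error:.9f}")
```

A residual sum near zero and a standard error that matches the regression statistics indicate that the residual column belongs to the reported fit. Whether the residuals are randomly scattered still has to be judged from the plot.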

A normal probability plot is a graphical tool for determining whether a data
set is approximately normally distributed. The points form an approximately straight
line when the data are plotted against a theoretical normal distribution. Departures
from normality are indicated by deviations from the straight line (National Institute
of Standards & Technology, n.d.).
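The points of a probability plot are built by sorting the observed values and pairing each with a sample percentile. The sketch below assumes the percentile convention 100(i - 0.5)/n, which reproduces the Percentile column of the probability output. The function name is hypothetical, and the observed values are those listed in that output.

```python
def probability_plot_points(values: list[float]) -> list[tuple[float, float]]:
    """Pair each sorted value with its sample percentile, 100 * (i - 0.5) / n."""
    n = len(values)
    ordered = sorted(values)
    return [(100 * (i - 0.5) / n, v) for i, v in enumerate(ordered, start=1)]


# Observed values of the response, in observation order.
observed = [-1.69, 32.12, -13.03, 22.18, 22.8, -1.53, 15.2, 35.63, 28.82]

for percentile, value in probability_plot_points(observed):
    print(f"{percentile:12.8f}  {value:7.2f}")
```

Plotting the sorted values against these percentiles gives the probability plot. With only nine observations, the judgment of whether the points follow a straight line is necessarily rough.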
Regression analysis helps with the computation of values such as the p-value
and the coefficients used in plotting the data; however, another method, such as a
t-test, might also be suitable. The data has a positive influence in that it is
publicly available to almost anyone who searches for it on the web. However, the low
returns computed for some years are an adverse situation for the people involved,
and this is one of the challenges of trading stocks and other financial assets.
Conclusion
Since the line in the graph rises and falls instead of following a straight
line, the data is not consistent from year to year, which means that the first
requirement of the regression analysis, the adequacy check, is not fully met by the
given data. However, the analysis does confirm one thing: there is a significant
change in the yearly performance figures of the provided data set.
References
Barone, A. (2021, October 9). How binomial distribution works. Investopedia.
https://www.investopedia.com/terms/b/binomialdistribution.asp
Binomial distribution. (2022). Statistics and Probability.
https://stattrek.com/probability-distributions/binomial.aspx
Binomial distribution - StatsDirect. (2021). StatsDirect Statistical Analysis Software.
https://www.statsdirect.com/help/distributions/binomial.htm
Corporate Finance Institute. (2020, May 14). Binomial distribution - Definition,
criteria, and example.
https://corporatefinanceinstitute.com/resources/knowledge/other/binomial-distribution/
Zach. (2021, April 29). What is a directional hypothesis? (Definition & examples).
Statology. https://www.statology.org/directional-hypothesis/
Sign test. (2021, August 2). Statistics Solutions.
https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/sign-test/
Kenton, W. (2022, February 15). S&P 500 index. Investopedia.
https://www.investopedia.com/terms/s/sp500.asp
Lavrakas, T. (2021, July 16). Understand the total value of your investments with a
total return. Forbes Advisor.
https://www.forbes.com/advisor/investing/what-is-total-return/
Return. (2003, November 25). Investopedia.
https://www.investopedia.com/terms/r/return.asp
Net total return definition. (n.d.). Law Insider.
https://www.lawinsider.com/dictionary/net-total-return
Everything you need to know about hypothesis testing in machine learning. (2021,
September 9). Analytics Vidhya.
https://www.analyticsvidhya.com/blog/2021/09/hypothesis-testing-in-machine-learning-everything-you-need-to-know/
Residual analysis in regression. (n.d.). Statistics and Probability.
https://stattrek.com/regression/residual-analysis.aspx
National Institute of Standards & Technology. (n.d.). Normal probability plot.
https://www.itl.nist.gov/div898/handbook/eda/section3/normprpl.
