
Ans 1.

Introduction:
Bayes' theorem is a statistical rule that lets us update our prior beliefs in light of new
evidence. It is a useful tool that is widely used in fields such as research, medicine,
engineering, and finance. In this answer, Bayes' theorem is used to solve a problem relating
periodontal disease to mood.

Problem statement: According to the problem description, poor gum health might be linked to
a bad mood. Researchers found that 85% of people in a bad mood had periodontal disease (gum
inflammation), whereas only 29% of people who were not in a bad mood had this condition. Bad
moods are uncommon in the community in question, occurring with a probability of 10%. What is
the probability that a person is in a bad mood given that they have periodontal disease?

Solution:

To solve this problem, we will use Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)

Where:

P(A|B) is the probability of event A given that event B has occurred.
P(B|A) is the probability of event B given that event A has occurred.
P(A) is the prior probability of event A.
P(B) is the overall (marginal) probability of event B.

In this scenario, we want to calculate the probability that someone is in a bad mood given that
they have periodontal disease. This can be written as:

P(bad mood | periodontal disease)

We know that the prior probability of a bad mood is 0.1 (10%). We also know from the problem
statement that the probability of periodontal disease given a bad mood is 0.85 (85%), whereas
the probability of periodontal disease given no bad mood is 0.29 (29%). This data can be
represented using a tree diagram, as illustrated below:

            Bad mood (0.1)                        No bad mood (0.9)
            /            \                        /              \
   Periodontal      No periodontal       Periodontal      No periodontal
      0.85               0.15               0.29               0.71

The first level of the tree diagram shows the prior probability of being in a bad mood or not.
In this situation, bad moods are infrequent, occurring only 10% of the time. As a result, the
probability of a bad mood is 0.1, while the probability of no bad mood is 0.9.

The second level of the tree diagram shows the probability of periodontal disease conditional
on being in a bad mood or not. Given a bad mood, the probability of periodontal disease is
0.85. In contrast, the probability of periodontal disease given no bad mood is 0.29. It follows
that the probability of no periodontal disease given a bad mood is 1 - 0.85 = 0.15, and the
probability of no periodontal disease given no bad mood is 1 - 0.29 = 0.71.

We can use the tree diagram to find the probability of a bad mood together with periodontal
disease, which corresponds to the first branch (outcome 1). From the tree diagram, the
probability of outcome 1 is:

P(bad mood and periodontal disease) = P(bad mood) * P(periodontal disease | bad mood)
= 0.1 * 0.85
= 0.085

By the law of total probability, the overall probability of periodontal disease is:

P(periodontal disease) = P(periodontal disease | bad mood) * P(bad mood)
+ P(periodontal disease | no bad mood) * P(no bad mood)
= 0.85 * 0.1 + 0.29 * 0.9
= 0.346

We can now use Bayes' theorem to determine the probability of a bad mood given periodontal
disease:

P(bad mood | periodontal disease) = P(periodontal disease | bad mood) * P(bad mood) /
P(periodontal disease)
= 0.85 * 0.1 / 0.346
= 0.2457

Consequently, the probability of a bad mood given periodontal disease is 0.2457, or about
24.57%.
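The calculation above can be checked with a few lines of Python. This is a minimal sketch, not part of the original answer: the function name `posterior` and the variable names are illustrative assumptions, and the three input probabilities are the ones given in the problem statement.

```python
# Inputs taken from the problem statement.
p_bad = 0.10               # P(bad mood), the prior
p_perio_given_bad = 0.85   # P(periodontal disease | bad mood)
p_perio_given_ok = 0.29    # P(periodontal disease | no bad mood)


def posterior(prior, likelihood, likelihood_complement):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A) via Bayes' theorem."""
    joint = prior * likelihood                              # P(A and B)
    evidence = joint + (1 - prior) * likelihood_complement  # P(B), total probability
    return joint / evidence


p_bad_given_perio = posterior(p_bad, p_perio_given_bad, p_perio_given_ok)
print(round(p_bad_given_perio, 4))  # prints 0.2457
```

The denominator is built from the same two tree branches used in the hand calculation (0.085 + 0.261 = 0.346), so the result matches the figure above.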

In conclusion, using Bayes' theorem and a tree diagram, we found that a person with periodontal
disease has a 24.57% chance of being in a bad mood. Since this is well above the 10% prior, the
finding indicates an association between periodontal disease and bad moods, with periodontal
disease considerably more frequent among those in a bad mood (85%) than among those who are
not (29%). The result highlights the importance of maintaining good dental hygiene and seeking
treatment for periodontal disease for oral health, overall well-being, and mental health.

Ans 2.

To develop a regression model in MS Excel, we must first enter the data into the spreadsheet.
In this case, we have data on the number of Instagram followers and the number of posts per
day. We will use the number of followers as the dependent variable and the number of posts per
day as the independent variable.

Here's how to create a regression model in Excel:

Step 1: Enter the data in two columns, one for the dependent variable (number of followers)
and one for the independent variable (number of posts per day).

Step 2: Click the Data Analysis button on the Data tab.

Step 3: Select "Regression" from the list of analysis tools and click OK.

Step 4: In the Regression dialog box, enter the Input Y Range (the range of cells containing
the dependent variable data) and the Input X Range (the range of cells containing the
independent variable data).

Step 5: Choose the options you need for the regression analysis. For this example, we can
select "Labels" to include labels for our input variables, and "Residuals" to compute the
residuals (the differences between the predicted and actual values).

Step 6: To run the regression analysis, click OK.

Excel will generate a new worksheet with the regression output once the analysis is finished.
The output includes the regression coefficients, the R-squared coefficient of determination, the
standard error, and the t-values and p-values for the coefficients.

The data and the resulting output are shown below.

No of post per day(x)   No of followers(y)
2                       439
1                       340
4                       315
5                       444
2                       377
5                       456
2                       495
2                       304
5                       401
5                       305
4                       338
2                       348
1                       402
5                       395

SUMMARY OUTPUT

Regression Statistics
Multiple R            0.046617964
R Square              0.002173235
Adjusted R Square    -0.080978996
Standard Error        62.9409903
Observations          14

ANOVA
             df   SS            MS           F             Significance F
Regression    1   103.538016    103.538016   0.026135613   0.874259616
Residual     12   47538.81913   3961.56826
Total        13   47642.35714

                        Coefficients   Standard Error   t Stat       P-value       Lower 95%      Upper 95%
Intercept               377.2058212    38.39613846      9.82405618   4.33703E-07   293.5478221    460.8638203
No of post per day(x)   1.735966736    10.73804084      0.16166513   0.874259616   -21.66021441   25.13214789

The table above shows the output of a simple linear regression run in Excel. It reports the
regression coefficients, the statistical significance of the model, and measures of model fit.
The number of followers (y) is the dependent variable and the number of daily posts (x) is the
independent variable in this analysis.

Interpretation of Excel Tables:

Regression statistics:

Multiple R:

This is the correlation coefficient between the two variables. Its value of 0.0466 indicates
only a very weak positive relationship between the number of followers and the number of posts
made each day.

R Square:

This is the coefficient of determination. It shows that only about 0.2% of the variation in the
number of followers is explained by the variation in the number of daily posts. This low value
indicates that the model fits the data poorly and that other explanatory variables may be needed.

Adjusted R Square:

This is the R Square value adjusted for the number of variables in the model. Since the value
is negative in this instance (-0.081), the model does not adequately account for the data.

Standard Error:

This is the estimated standard deviation of the residuals (62.94). It indicates the typical
difference between actual and predicted values.

ANOVA:

The ANOVA table provides information on the significance of the regression model. The
F-statistic tests whether the regression model is statistically significant. Here the
F-statistic is very low (0.026) with a Significance F of 0.874, so the model is not
statistically significant.

Coefficients:

The coefficient table provides information about the regression coefficients. The intercept
of 377.21 is the predicted value of the dependent variable when the independent variable is
zero. The coefficient of posts per day is 1.74, which implies an estimated average increase of
1.74 followers for each extra post made per day. However, this coefficient's p-value is large
(0.874), so it does not differ significantly from zero.

The regression formula in this case is y = 377.21 + 1.74 x.
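As a cross-check on the Excel output, the same least-squares quantities can be computed directly from the 14 data points. The sketch below is not part of the original Excel workflow; the function name `simple_ols` is a hypothetical helper, and it uses the standard textbook formulas for simple linear regression with plain Python only.

```python
# Data from the table above: posts per day (x) and followers (y).
x = [2, 1, 4, 5, 2, 5, 2, 2, 5, 5, 4, 2, 1, 5]
y = [439, 340, 315, 444, 377, 456, 495, 304, 401, 305, 338, 348, 402, 395]


def simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)                        # spread of x
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))  # co-variation
    sst = sum((b - mean_y) ** 2 for b in ys)                        # total sum of squares
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = slope * sxy                        # regression sum of squares
    r_squared = ssr / sst
    f_stat = ssr / ((sst - ssr) / (n - 2))   # MS regression / MS residual
    return slope, intercept, r_squared, f_stat


slope, intercept, r_squared, f_stat = simple_ols(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} x")
print(f"R Square = {r_squared:.6f}, F = {f_stat:.6f}")
```

The slope, intercept, R Square, and F values produced this way should agree with the Coefficients, Regression Statistics, and ANOVA sections of the Excel summary output.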

Conclusion:

According to the regression results, the number of posts per day does not reliably predict the
number of Instagram followers. The low R Square, the negative Adjusted R Square, and the low
F-statistic imply that the model does not adequately represent the data.

This could be due to a variety of factors, such as post quality or posting time, which are not
included in the model. As a result, further investigation and attention to other variables are
required in order to develop a better and more accurate model.

Ans 3a.

In the provided problem, we must determine the time between replacements of 1000 light bulbs
installed in a new factory. We are also informed that the bulbs have an average lifespan of 120
days with a standard deviation of 20 days and that we cannot allow more than 10% of the bulbs
to expire before replacement.

Let X represent a bulb's lifespan in days, and assume X is normally distributed with a mean of
120 days and a standard deviation of 20 days.

We want to find the value x such that


P (X <= x) = 0.10.

We can find this value using the standard normal distribution table or a calculator.

First, we standardize X using the formula:


Z = (X - μ) / σ

Where μ is the mean and σ is the standard deviation.


Z = (x - 120) / 20

Using the standard normal distribution table, we find that the z-score corresponding to
P (Z <= z) = 0.10 is -1.28.

Substituting this value into the formula, we have


-1.28 = (x - 120) / 20

Solving for x, we get


x = -1.28 * 20 + 120 = 94.4

To ensure that no more than 10% of the bulbs expire before replacement, the period between
replacements should be 94.4 days or fewer. This is an essential criterion for the proper
operation of the factory, as bulb failures could result in decreased output or safety concerns.
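The table lookup can also be reproduced with Python's standard library. This is a small sketch under the same normality assumption as above; the variable names are illustrative, and `statistics.NormalDist` (Python 3.8 or later) stands in for the printed z-table.

```python
from statistics import NormalDist

# Bulb lifespan modelled as normal with mean 120 days and SD 20 days.
lifespan = NormalDist(mu=120, sigma=20)

# z-score with 10% of the standard normal distribution below it.
z = NormalDist().inv_cdf(0.10)

# 10th percentile of the lifespan distribution: the replacement interval.
replacement_day = lifespan.inv_cdf(0.10)

print(round(z, 2))                # prints -1.28
print(round(replacement_day, 1))  # prints 94.4
```

Using the unrounded z-score gives roughly 94.37 days, which rounds to the same 94.4 days obtained from the table value of -1.28.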

It is worth noting that the normal distribution is a powerful data analysis tool that is
commonly used in statistical research. It is a continuous probability distribution with several
important properties, including symmetry and a well-defined mean and standard deviation. These
characteristics make it useful for modelling a wide range of data sets, from physical
measurements to financial data.

The normal distribution has a symmetrical bell-shaped curve with a single peak at the mean. It
is unimodal, which means that it has only one mode, and its values are spread symmetrically on
each side of the mean. The total area under the curve of a normal distribution is always equal
to one, and the distribution is completely determined by its mean and standard deviation.

In conclusion, we found that the interval between replacements for 1000 light bulbs with a mean
lifespan of 120 days and a standard deviation of 20 days must be 94.4 days or fewer in order to
ensure that no more than 10% of the bulbs expire before replacement. This method made use of the
normal distribution and the standard normal distribution table, both of which are important
statistical analysis tools.

Ans 3b.
Age group (C.I.)   Midpoint (x)   Male (f)        f*x
0-4                2              98,34,738       19669476
5-9                7              1,09,59,506     76716542
10-14              12             1,24,25,108     149101296
15-19              17             1,26,83,733     215623461
20-24              22             1,31,97,283     290340226
25-29              27             1,30,45,214     352220778
30-34              32             1,21,34,009     388288288
35-39              37             1,20,60,030     446221110
40-44              42             1,09,00,143     457806006
45-49              47             97,04,026       456089222
50-54              52             79,40,152       412887904
55-59              57             61,61,754       351219978
60-64              62             54,01,736       334907632
65-69              67             36,87,082       247034494
70-74              72             26,62,421       191694312
75-79              77             13,41,572       103301044
80-85              82.5           14,61,296       120556920
Sum                               14,55,99,803    4,61,36,78,689

Age group (C.I.)   Midpoint (x)   Female (f)      f*x
0-4                2              91,27,975       18255950
5-9                7              99,58,059       69706413
10-14              12             1,14,51,227     137414724
15-19              17             1,65,18,666     280817322
20-24              22             3,36,58,466     740486252
25-29              27             3,75,22,017     1013094459
30-34              32             3,42,86,096     1097155072
35-39              37             3,30,54,887     1223030819
40-44              42             2,72,61,236     1144971912
45-49              47             2,34,47,716     1102042652
50-54              52             1,78,42,986     927835272
55-59              57             1,51,92,910     865995870
60-64              62             1,43,47,372     889537064
65-69              67             1,01,41,196     679460132
70-74              72             70,33,728       506428416
75-79              77             34,93,001       268961077
80-85              82.5           42,53,695       350929837.5
Sum                               30,85,91,233    11,31,61,23,244
To determine the average age of migrants for each gender, we multiply the midpoint of each age
group by the number of migrants in that group, sum these products, and divide by the total
number of migrants of that gender.

The total number of male migrants is 145,599,803, and the total f*x (the sum of the products
of the midpoint of each age group and the number of migrants in that group) is 4,613,678,689.
Therefore, the average age of male migrants is:

(Sum of f*x) / (Total number of migrants) = 4,613,678,689 / 145,599,803 = 31.687

As a result, the mean age of male migrants is approximately 31.69 years.

The total number of female migrants is 308,591,233, and the total f*x is 11,316,123,244.
Therefore, the average age of female migrants is:

(Sum of f*x) / (Total number of migrants) = 11,316,123,244 / 308,591,233 = 36.670

Consequently, the average age of female migrants is approximately 36.67 years.
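The grouped-mean calculation for both tables can be reproduced in a few lines. This is a minimal sketch: the helper name `grouped_mean` and the list names are illustrative assumptions, while the midpoints and frequencies are copied from the two tables above.

```python
# Class midpoints for the age groups 0-4, 5-9, ..., 75-79, 80-85.
midpoints = [2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57, 62, 67, 72, 77, 82.5]

# Frequencies (number of migrants per age group) from the tables.
male = [9834738, 10959506, 12425108, 12683733, 13197283, 13045214,
        12134009, 12060030, 10900143, 9704026, 7940152, 6161754,
        5401736, 3687082, 2662421, 1341572, 1461296]
female = [9127975, 9958059, 11451227, 16518666, 33658466, 37522017,
          34286096, 33054887, 27261236, 23447716, 17842986, 15192910,
          14347372, 10141196, 7033728, 3493001, 4253695]


def grouped_mean(mids, freqs):
    """Mean of grouped data: sum(f * x) / sum(f)."""
    return sum(m * f for m, f in zip(mids, freqs)) / sum(freqs)


print(f"Male:   {grouped_mean(midpoints, male):.3f}")    # prints Male:   31.687
print(f"Female: {grouped_mean(midpoints, female):.3f}")  # prints Female: 36.670
```

Because every member of a class is treated as sitting at its midpoint, the result is an approximation of the true mean age.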

Interpretation:

According to the above figures, the average age of male migrants is approximately 31.687 years,
while the average age of female migrants is approximately 36.670 years. This suggests that the
female migrant population is about five years older on average than the male migrant
population. The age gap may be caused by a variety of factors.

It could be because women relocate for family reunion or to join their spouse at a later age
than men, who migrate for work-related reasons. Furthermore, women tend to outlive men, which
may be another reason why the average age of female migrants is higher.

Knowing the average age of migrants for each gender is important because it allows policymakers
to design policies that respond to the needs of different age groups. For example, since the
average age of female migrants is higher, policymakers may choose to prioritise resources that
cater to the needs of older people, such as healthcare facilities, social safety net programmes,
and age-friendly housing. Likewise, since the average age of male migrants is lower,
governments may focus on expanding career options that match the needs of younger people, such
as internships, training programmes, and apprenticeships.

In conclusion, male migrants have an average age of 31.69 years, while female migrants have an
average age of 36.67 years. This age disparity may be caused by a variety of factors, including
the motives for migration and the gender gap in life expectancy. Understanding the average
age of migrants by gender helps to facilitate their effective integration into the host society
and to improve their overall well-being. This data helps policymakers to create programmes
and policies that address the needs of different age groups, such as providing healthcare
facilities for older female migrants and developing job opportunities for younger male migrants.
