Decision-Science - Assign


Decision Science

Ans 1.

Introduction

Probability is a fundamental concept in mathematics and statistics that measures how likely an
event is to happen. It is expressed as a number between 0 and 1, where 0 indicates impossibility
and 1 signifies certainty. Probability is a key factor in everyday decisions that involve
evaluating risk and analysing uncertainty, and it is applied in fields such as insurance, finance,
medicine, and sports.

A tree diagram is an organized, systematic way to lay out the probabilities of events and solve
probability problems. It helps calculate the probability of each possible outcome. Tree diagrams
are especially useful for conditional probabilities, that is, the probability that a particular
event will happen given that another event has already occurred.

In this case we are given information about the connection between periodontal disease and bad
moods, and we are asked to calculate the probability that someone who has periodontal disease is
in a bad mood. The tree diagram method allows us to break the problem into smaller parts and to
visualize the relationship between the events.

Concept and application

To understand probability and tree diagrams more clearly, let's define some key terms:

Experiment: An action or procedure that results in outcomes, e.g., tossing a coin or rolling a
die.

Sample space: The set of all possible outcomes of an experiment.

Event: A subset of the sample space that represents an occurrence.

Conditional probability: The probability of an event occurring given that another event has
already taken place.
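
The terms above can be made concrete with a small sketch. The two-dice experiment and the events A and B below are hypothetical illustrations, not part of the assignment data. The sketch enumerates the sample space, treats each event as a subset of it, and computes a conditional probability by restricting the sample space to the outcomes where B occurred.

```python
from fractions import Fraction
from itertools import product

# Experiment (hypothetical example): roll two fair six-sided dice.
# Sample space: all 36 equally likely ordered outcomes.
sample_space = list(product(range(1, 7), repeat=2))


def probability(event, space):
    """Probability of an event (a subset of `space`) when outcomes are equally likely."""
    return Fraction(len(event), len(space))


# Event A: the dice sum to 8. Event B: the first die shows an even number.
event_a = [o for o in sample_space if sum(o) == 8]
event_b = [o for o in sample_space if o[0] % 2 == 0]
a_and_b = [o for o in event_a if o in event_b]

p_a = probability(event_a, sample_space)
p_b = probability(event_b, sample_space)

# Conditional probability P(A|B): B becomes the reduced sample space.
p_a_given_b = probability(a_and_b, event_b)

print(p_a, p_b, p_a_given_b)
```

Using exact fractions keeps the results free of rounding, which makes it easy to compare P(A) with P(A|B) and see how knowing B changes the probability of A.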

Bayes' theorem is a mathematical description of the relationship between conditional
probabilities. Named in honor of the Reverend Thomas Bayes (an 18th-century English mathematician
and statistician), it provides a method to update existing beliefs in light of new evidence or
data. It is frequently used in fields such as finance, medicine, machine learning, and artificial
intelligence to draw inferences and make predictions from incomplete or uncertain data.

Bayes' theorem can be explained mathematically as follows:

P(A|B) = (P(B|A) * P(A)) / P(B)

Where:

P(A|B) is the conditional probability of event A occurring, given that event B has occurred.

P(B|A) is the conditional probability of event B occurring, given that event A has occurred.

P(A) and P(B) are the marginal probabilities of events A and B, each considered on its own
without reference to the other event.
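
As a minimal sketch of the formula above, the function below computes P(A|B), obtaining P(B) from the law of total probability, P(B) = P(B|A) * P(A) + P(B|not A) * P(not A). The function name `bayes_posterior` and the screening numbers in the usage line are hypothetical and chosen only for illustration.

```python
def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A) using Bayes' theorem."""
    p_not_a = 1.0 - prior_a
    # Law of total probability: B can occur together with A or with not-A.
    p_b = p_b_given_a * prior_a + p_b_given_not_a * p_not_a
    if p_b == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return p_b_given_a * prior_a / p_b


# Hypothetical inputs: P(A) = 0.01, P(B|A) = 0.95, P(B|not A) = 0.05.
posterior = bayes_posterior(0.01, 0.95, 0.05)
print(round(posterior, 3))
```

Writing the denominator with the law of total probability is convenient because P(B) is often not given directly, while the two conditional probabilities usually are.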

We will use conditional probability to solve the problem, that is, the probability that an event
occurs given that another event has occurred. Our task is to find the probability that someone is
in a bad mood given that they suffer from periodontal disease.

Define the following events:

H: Bad mood

D: Periodontal disease

The following probabilities are available to us.

The prior probability is

P(H) = 0.10

The complementary probability (not in a bad mood, Hc) is

P(Hc) = 1 - P(H)

= 1 - 0.10

= 0.90

The conditional probabilities (likelihoods) are

P(D|H) = 0.85

P(D|Hc) = 0.29

A tree diagram can also be used to represent these probabilities.

TREE DIAGRAM
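
The branches of such a tree can be sketched in a few lines. The snippet below assumes the values stated in this answer (P(H) = 0.10, P(D|H) = 0.85, P(D|Hc) = 0.29) and uses the hypothetical label "Dc" for "no periodontal disease". Each leaf of the tree is the product of the probabilities along its path.

```python
# Values from the problem statement in this answer.
p_h = 0.10           # P(H): bad mood
p_d_given_h = 0.85   # P(D|H)
p_d_given_hc = 0.29  # P(D|Hc)

# Each leaf is (first branch, second branch) -> joint probability along the path.
branches = {
    ("H", "D"): p_h * p_d_given_h,
    ("H", "Dc"): p_h * (1 - p_d_given_h),
    ("Hc", "D"): (1 - p_h) * p_d_given_hc,
    ("Hc", "Dc"): (1 - p_h) * (1 - p_d_given_hc),
}

for (first, second), p in branches.items():
    print(f"{first} -> {second}: {p:.3f}")

# P(D) is the sum of the two leaves that end in D.
p_d = branches[("H", "D")] + branches[("Hc", "D")]
print(f"P(D) = {p_d:.3f}")
```

The four leaves sum to 1, which is a quick check that the tree has been filled in consistently.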

Given that someone has periodontal disease, we need to find the probability that they are in a
bad mood.

We must find P(H|D).

Substituting the prior probabilities and the likelihoods into Bayes' theorem gives the posterior
probability:

P(H|D) = [P(D|H) * P(H)] / [P(D|H) * P(H) + P(D|Hc) * P(Hc)]

= [(0.85) * (0.10)] / [(0.85) * (0.10) + (0.29) * (0.90)]

= 0.085 / 0.346

= 0.246

The probability that a person is in a bad mood, given that they have periodontal disease, is
about 24.6%.
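
One way to sanity-check this result is to count people in a hypothetical population. The sketch below assumes 1000 people (an illustrative figure, not from the problem) and applies the stated rates of 10%, 85%, and 29% to obtain whole-number counts.

```python
# Hypothetical population used only to turn probabilities into counts.
population = 1000

in_bad_mood = round(population * 0.10)          # people in H
not_in_bad_mood = population - in_bad_mood      # people in Hc

bad_mood_with_disease = round(in_bad_mood * 0.85)         # H and D
other_with_disease = round(not_in_bad_mood * 0.29)        # Hc and D

# Among everyone with periodontal disease, the share who are in a bad mood.
with_disease = bad_mood_with_disease + other_with_disease
posterior = bad_mood_with_disease / with_disease

print(bad_mood_with_disease, with_disease, round(posterior, 4))
```

Counting this way shows why the answer is far below 85%: most people with periodontal disease come from the much larger group that is not in a bad mood.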

Bayes' theorem is a powerful and versatile principle in probability theory and statistics that
allows us to update our assumptions and probabilities on the basis of new evidence.

In our solution, we used Bayes' theorem to calculate the probability of being in a bad mood given
that someone has periodontal disease, which gives the same result as reading the probabilities
off the tree diagram. Bayes' theorem can be used in this way to solve a wide range of probability
problems.

As we continue to face difficult situations and make decisions on the basis of limited knowledge,
understanding and applying principles such as Bayes' theorem is becoming increasingly important.
These statistical tools help us assess and quantify the probabilities of outcomes and events.

Conclusion

We used probability and a tree diagram in this problem to assess how likely it is for someone to
be in a bad mood given that they have periodontal disease. The tree diagram helped visualize the
relationship between bad moods and periodontal disease and break the problem into smaller parts.
Using the tree diagram, we calculated the conditional probability and found that the probability
of being in a bad mood given periodontal disease is about 24.6 percent.

Tree diagrams and probabilities are useful tools for understanding and solving difficult problems
involving uncertainty and dependent relationships. They are useful in many fields, such as
finance, decision-making, and medicine, because they offer a clear, systematic method to analyze
and calculate probabilities. By understanding these concepts, we can better judge how likely
events in our lives are and make informed decisions based on the probabilities of different
outcomes.

Ans 2.

Introduction

In the age of social media, Instagram has emerged as a platform for individuals and businesses to
showcase their work and build a strong online presence. The number of Instagram followers can
have a major effect on a user's popularity and reach. Many factors influence the growth of an
Instagram following, among them the frequency with which users post content. In this study we
analyze the relationship between the number of posts per day (independent variable) and the
number of Instagram followers (dependent variable) using regression analysis in Microsoft Excel.

Regression analysis is a statistical method used to analyze the relationship between two or more
variables. In this example, we want to determine whether there is a linear relationship between
the number of posts per day and the number of followers. If such a relationship exists, it can be
used to predict the effect of posting frequency on follower growth, which could be useful for
Instagram users who want to improve their strategy.

Concept and application

Here's how to run a regression in Excel:

Step 1: Enter the data in two columns, with the independent variable (number of posts per day) in
one column and the dependent variable (number of followers) in another column.

Step 2: Click on the Data tab and choose "Data Analysis".

Step 3: Choose "Regression" from the list of analysis tools and click OK.

Step 4: In the Regression dialog box, enter the Input Y Range (the range of cells containing the
dependent variable data) and the Input X Range (the range of cells containing the independent
variable data).

Step 5: Choose the options you want for the regression analysis. For this example, we can select
"Labels" to include labels for the input variables and "Residuals" to output the residuals (the
differences between the actual and predicted values).

Step 6: Click OK to run the regression analysis.

When the analysis is complete, Excel creates a new worksheet containing the regression output.
The output includes the regression coefficients, the coefficient of determination (R squared),
the standard error, and the t values and p values for the coefficients.

Regression model

Regression Statistics

Multiple R           0.11059024
R Square             0.0122302
Adjusted R Square    -0.07756705
Standard Error       63.0282592
Observations         13

ANOVA

Particulars   df   SS            MS          F           Significance F
Regression    1    541.0547129   541.05471   0.1361979   0.719095592
Residual      11   43698.17606   3972.5615
Total         12   44239.23077

Particulars   Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept     365.021        40.397           9.036    0.000     276.108     453.934     276.108       453.934
2             4.063          11.010           0.369    0.719     -20.170     28.297      -20.170       28.297
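
The t statistic and the 95% confidence interval in the slope row can be reproduced from the coefficient and its standard error. The sketch below takes 4.063 and 11.010 from the table and assumes a two-sided 95% critical value of 2.201 for Student's t with 11 degrees of freedom, read from a t-table rather than computed.

```python
# Slope row of the coefficient table.
coefficient = 4.063
standard_error = 11.010

# Assumption: two-sided 95% critical value of Student's t with 11 degrees of
# freedom (the residual df in the ANOVA table), taken from a t-table.
t_critical = 2.201

t_stat = coefficient / standard_error
margin = t_critical * standard_error
lower_95 = coefficient - margin
upper_95 = coefficient + margin

print(f"t Stat = {t_stat:.3f}")
print(f"95% CI = ({lower_95:.3f}, {upper_95:.3f})")
```

Because the interval runs from a negative to a positive value, it contains zero, which matches the large p-value reported for the slope.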

Regression line

        No. of followers (Y)   No. of posts per day (X)   X - Mean of X   (X - Mean of X)^2   Y - Mean of Y   (X - Mean of X) * (Y - Mean of Y)
        439                    2                          -1.2143         1.4745              56.2143         -68.26020408
        340                    1                          -2.2143         4.9031              -42.7857        94.73979592
        315                    4                          0.7857          0.6173              -67.7857        -53.26020408
        444                    5                          1.7857          3.1888              61.2143         109.3112245
        377                    2                          -1.2143         1.4745              -5.7857         7.025510204
        456                    5                          1.7857          3.1888              73.2143         130.7397959
        495                    2                          -1.2143         1.4745              112.2143        -136.2602041
        304                    2                          -1.2143         1.4745              -78.7857        95.66836735
        401                    5                          1.7857          3.1888              18.2143         32.5255102
        305                    5                          1.7857          3.1888              -77.7857        -138.9030612
        338                    4                          0.7857          0.6173              -44.7857        -35.18877551
        348                    2                          -1.2143         1.4745              -34.7857        42.23979592
        402                    1                          -2.2143         4.9031              19.2143         -42.54591837
        395                    5                          1.7857          3.1888              12.2143         21.81122449
Total   5359                   45                         0               34.3571             0               59.6429

Interpretation of Excel Tables:

Regression statistics:

Multiple R: The correlation coefficient between the two variables shows a weak positive
relationship between the number of posts per day and the number of followers. A value of 0.11059
indicates a weak correlation.

R Square: This is the coefficient of determination. It shows that variation in the number of
posts per day explains only about 1.2 percent of the variation in the number of followers. The
low figure suggests the model needs additional explanatory variables.

Adjusted R Square: The R Square value adjusted for the number of variables included in the model.
In this case it is negative, indicating that the model does not fit the data well.

Standard Error: This is an estimate of the standard deviation of the residuals. It measures the
typical distance between the actual values and the predicted values.

ANOVA: The ANOVA table gives information about the significance of the regression model. The F
statistic tests whether the regression model is significant. In this instance the F statistic is
low (0.136, with Significance F = 0.719), which indicates that the model is not statistically
significant.
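
The regression statistics discussed above all follow from the sums of squares in the ANOVA table. The sketch below assumes only the SS and df values reported by Excel and recomputes R Square, Adjusted R Square, F, and the Standard Error from them.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression = 541.0547129
ss_residual = 43698.17606
df_regression = 1
df_residual = 11

ss_total = ss_regression + ss_residual
n = df_regression + df_residual + 1  # number of observations used by Excel

# Share of the variation in Y explained by the model.
r_squared = ss_regression / ss_total
# Penalise R squared for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# Mean squares and the F statistic.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual

# Standard error of the regression: square root of the residual mean square.
standard_error = math.sqrt(ms_residual)

print(round(r_squared, 7), round(adjusted_r_squared, 7))
print(round(f_statistic, 7), round(standard_error, 4))
```

Seeing the formulas side by side makes it clear why Adjusted R Square can be negative: when R Square is close to zero, the penalty factor (n - 1) / df_residual pushes the adjusted value below zero.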

N = 14

Sum of X = 45

Sum of Y = 5359

Mean of X (MX) = ∑X/N = 45/14 = 3.2143

Mean of Y (MY) = ∑Y/N = 5359/14 = 382.7857

Sum of squares (SSX) = 34.3571

Sum of products (SP) = 59.6429

Regression equation: ŷ = bX + a

b = SP/SSX = 59.6429/34.3571 = 1.73597

a = MY - b*MX = 382.7857 - (1.73597 * 3.2143) = 377.20582

ŷ = 1.73597X + 377.20582
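
The hand calculation can be checked with a short script. This is a minimal sketch that assumes the 14 (X, Y) pairs listed in the regression line table and applies the same formulas, b = SP/SSX and a = MY - b*MX, in plain Python.

```python
# Data from the regression line table: followers (Y) and posts per day (X).
followers = [439, 340, 315, 444, 377, 456, 495, 304, 401, 305, 338, 348, 402, 395]
posts = [2, 1, 4, 5, 2, 5, 2, 2, 5, 5, 4, 2, 1, 5]

n = len(posts)
mean_x = sum(posts) / n
mean_y = sum(followers) / n

# Sum of squared deviations of X, and sum of products of deviations.
ssx = sum((x - mean_x) ** 2 for x in posts)
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(posts, followers))

slope = sp / ssx                      # b
intercept = mean_y - slope * mean_x   # a

print(f"SSX = {ssx:.4f}, SP = {sp:.4f}")
print(f"y_hat = {slope:.5f} * X + {intercept:.5f}")
```

Carrying full precision through the calculation avoids the small differences that appear when intermediate values are rounded to two decimals.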

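The estimated increase in Instagram followers for each additional post per day is 1.736. The
slope is positive, which means that the number of followers tends to increase as the number of
posts per day increases. Note that this hand calculation uses all 14 observations, whereas the
Excel output above reports 13 observations and a slope of 4.063, so the two estimates differ.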

Conclusion

The regression analysis conducted in Microsoft Excel shows only a weak positive relationship
between the number of posts per day and the number of Instagram followers. The positive
coefficient on posts per day suggests that posting more often could be associated with more
followers. However, the high p-value (0.719) means that the number of posts per day is not a
statistically significant predictor of the number of followers in this sample, so the data do not
support our initial hypothesis.

It is important to note the limitations of this study. The sample was relatively small, and other
factors, such as content quality, audience engagement, and the use of hashtags, are also likely
to influence the number of Instagram followers. To better understand what contributes to growth
on Instagram, further research should use larger datasets and more factors.

Ans 3a.

Introduction

A normal distribution is described by its mean and a positive standard deviation. The graph is
symmetric about the mean. The standard deviation indicates how spread out the data are. A small
standard deviation means that the data are close together, so the graph is narrower. A large
standard deviation means that the data are more spread out, so the graph is broader. The area
under the normal curve can be divided into segments measured in standard deviations from the
mean, and each segment contains a known share of the data.
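
How much of the data falls within a given number of standard deviations of the mean can be computed directly. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8) and the helper name `share_within` is a hypothetical choice for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of a normal population lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.2%}")
```

These shares are the same for every normal distribution, whatever its mean and standard deviation, which is why standardizing to a z-score works.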

Concept and application

We need to find the replacement interval for 1000 light bulbs that have been installed in a new
factory. The bulbs have a mean life of 120 days with a standard deviation of 20 days. No more
than 10% of the bulbs should burn out before replacement.

If X represents the life of a bulb in days, then X is normally distributed with a mean of 120
days and a standard deviation of 20 days.

We are looking for the value x that satisfies

P (X <= x) = 0.10.

This value can be found with a standard normal distribution table or a calculator.

The first step is to standardize X using the formula:

Z = (X - μ) / σ

Where μ is the mean and σ is the standard deviation.

Z = (x - 120) / 20

From the standard normal distribution table, the z-score corresponding to P (Z <= z) = 0.10 is
approximately -1.28.

Substituting this value into the formula, we get

-1.28 = (x - 120) / 20

Solving for x, we get

x = -1.28 * 20 + 120 = 94.4
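
The table lookup can be cross-checked numerically. This sketch assumes bulb life is normally distributed with mean 120 and standard deviation 20, as stated above, and uses `statistics.NormalDist.inv_cdf` from the Python standard library to find the 10th percentile without rounding the z-score.

```python
from statistics import NormalDist

# Bulb life in days, as stated in the problem.
bulb_life = NormalDist(mu=120, sigma=20)

# Exact z-score and replacement time for a 10% failure share.
z_exact = NormalDist().inv_cdf(0.10)
x_exact = bulb_life.inv_cdf(0.10)

# Value obtained with the rounded table z-score of -1.28.
x_table = 120 + (-1.28) * 20

# Share of bulbs expected to fail by the table-based replacement time.
share_failed = bulb_life.cdf(x_table)

print(f"z = {z_exact:.4f}, x = {x_exact:.2f} days")
print(f"table value x = {x_table:.1f} days, failed share = {share_failed:.4f}")
```

The exact percentile is slightly below 94.4 days because the table z-score is rounded, so replacing at 94 days keeps the failed share just under 10%.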

So the bulbs should be replaced every 94.4 days or sooner, to ensure that no more than 10 percent
of them burn out before replacement.

Replacing the bulbs at intervals of about 94 days matters for the proper functioning of the
factory, because bulb failures can compromise production or raise safety concerns.

The normal distribution is widely used in statistical analysis because of its many useful
properties. It is symmetric and unimodal, with a single peak at the mean, and it is fully defined
by its mean and standard deviation. Values are distributed evenly on each side of the mean, and
the total area under the normal curve is equal to one. It can be used to analyze a wide range of
data, from financial data to physical measurements, which makes it a powerful tool for analyzing
and modeling data.

Conclusion

The interval between replacements of the 1000 light bulbs, which have a mean life of 120 days and
a standard deviation of 20 days, should be at most about 94 days so that no more than 10% of them
burn out before replacement. This calculation used the standard normal distribution and its
table, both of which are important tools for statistical analysis.

Ans 3b.

Introduction

An average summarizes a large amount of data in a single value. It is the ratio of the sum of the
values in a set to the number of values in that set. To calculate the average age of migrants of
each gender from grouped data, we multiply the midpoint of each age class by the number of people
who fall within that class, add up these products, and divide by the total number of people.

Concept and application

Calculate the average age of migrants

Calculation of average age of Males

Age group   Male (f)    Mid value (X)   d = (X - A)/h   f*d
0-4         9834738     2               -8              -78677904
5-9         10959506    7               -7              -76716542
10-14       12425108    12              -6              -74550648
15-19       12683733    17              -5              -63418665
20-24       13197283    22              -4              -52789132
25-29       13045214    27              -3              -39135642
30-34       12134009    32              -2              -24268018
35-39       12060030    37              -1              -12060030
40-44       10900143    A = 42          0               0
45-49       9704026     47              1               9704026
50-54       7940152     52              2               15880304
55-59       6161754     57              3               18485262
60-64       5401736     62              4               21606944
65-69       3687082     67              5               18435410
70-74       2662421     72              6               15974526
75-79       1341572     77              7               9391004
80-85       1461296     82.5            8.1             11836497.6
Total       145599803                                   -300302607.4

Mean age of male migrants = A + (∑fd / ∑f) * h, where A = 42 is the assumed mean and h = 5 is the
class width

= 42 + (-300302607.4 / 145599803) * 5

= 42 + (-2.0625) * 5

= 42 - 10.3126

= 31.6874
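
The step-deviation calculation for males can be reproduced with a few lines of code. This is a sketch that assumes the frequencies and midpoints from the male table, with assumed mean A = 42 and class width h = 5.

```python
# Male migrants per age group (f) and class midpoints (X), from the table.
male_counts = [
    9834738, 10959506, 12425108, 12683733, 13197283, 13045214,
    12134009, 12060030, 10900143, 9704026, 7940152, 6161754,
    5401736, 3687082, 2662421, 1341572, 1461296,
]
midpoints = [2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57, 62, 67, 72, 77, 82.5]

assumed_mean = 42  # A
class_width = 5    # h

# Step deviations d = (X - A) / h and their frequency-weighted sum.
deviations = [(x - assumed_mean) / class_width for x in midpoints]
sum_fd = sum(f * d for f, d in zip(male_counts, deviations))
total = sum(male_counts)

male_mean_age = assumed_mean + (sum_fd / total) * class_width
print(f"Total = {total}, sum of f*d = {sum_fd:.1f}")
print(f"Mean age of male migrants = {male_mean_age:.4f}")
```

The step-deviation method exists to keep hand arithmetic small; in code it gives the same answer as a direct weighted mean of the midpoints.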

Calculation of average age of Females

Age group   Female (f)   Mid value (X)   d = (X - A)/h   f*d
0-4         9127975      2               -8              -73023800
5-9         9958059      7               -7              -69706413
10-14       11451227     12              -6              -68707362
15-19       16518666     17              -5              -82593330
20-24       33658466     22              -4              -134633864
25-29       37522017     27              -3              -112566051
30-34       34286096     32              -2              -68572192
35-39       33054887     37              -1              -33054887
40-44       27261236     A = 42          0               0
45-49       23447716     47              1               23447716
50-54       17842986     52              2               35685972
55-59       15192910     57              3               45578730
60-64       14347372     62              4               57389488
65-69       10141196     67              5               50705980
70-74       7033728      72              6               42202368
75-79       3493001      77              7               24451007
80-85       4253695      82.5            8.1             34454929.5
Total       308591233                                    -328941708.5

Mean age of female migrants = A + (∑fd / ∑f) * h, where A = 42 is the assumed mean and h = 5 is
the class width

= 42 + (-328941708.5 / 308591233) * 5

= 42 + (-1.0659) * 5

= 42 - 5.3297

= 36.6703
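
As a cross-check on the female result, the mean can also be computed directly as a weighted average of the class midpoints, without the assumed mean. The sketch below assumes the frequencies and midpoints from the female table and swaps the step-deviation method for the direct formula ∑fX / ∑f.

```python
# Female migrants per age group (f) and class midpoints (X), from the table.
female_counts = [
    9127975, 9958059, 11451227, 16518666, 33658466, 37522017,
    34286096, 33054887, 27261236, 23447716, 17842986, 15192910,
    14347372, 10141196, 7033728, 3493001, 4253695,
]
midpoints = [2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57, 62, 67, 72, 77, 82.5]

# Direct method: weighted mean of the midpoints.
total = sum(female_counts)
sum_fx = sum(f * x for f, x in zip(female_counts, midpoints))
female_mean_age = sum_fx / total

print(f"Total = {total}")
print(f"Mean age of female migrants = {female_mean_age:.4f}")
```

Both methods are algebraically identical, so agreement between them confirms the arithmetic in the table rather than the quality of the grouped-data approximation itself.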

Interpretation

In mathematics and statistics, the mean summarizes an entire set of data with a single value that
represents its centre. It is also known as the arithmetic mean. Thus, the mean age of male
migrants is approximately 32 years (31.69, as calculated above) and the mean age of female
migrants is approximately 37 years (36.67, as calculated above).

Conclusion

The average age of male migrants is approximately 31.69 years, while the average age of female
migrants is approximately 36.67 years. This gap in age may be caused by a variety of factors,
such as the reasons behind migration and gender differences in life expectancy. Knowing the mean
age of migrants by gender can help host communities plan services that support integration and
improve wellbeing. For example, policy could provide older female migrants with healthcare and
younger male migrants with job opportunities.
