Likelihood and Probability

9/12/2020 What is the difference between "likelihood" and "probability"? - Cross Validated

What is the difference between “likelihood” and “probability”?


Asked 10 years ago Active 1 month ago Viewed 347k times

The wikipedia page claims that likelihood and probability are distinct concepts.

539
In non-technical parlance, "likelihood" is usually a synonym for "probability," but in
statistical usage there is a clear distinction in perspective: the number that is the
probability of some observed outcomes given a set of parameter values is regarded as
the likelihood of the set of parameter values given the observed outcomes.

Can someone give a more down-to-earth description of what this means? In addition, some
examples of how "probability" and "likelihood" disagree would be nice.

Tags: probability, likelihood, intuition

edited Nov 14 '19 at 11:10 by kjetil b halvorsen · asked Sep 14 '10 at 3:24 by Douglas S. Stones

16 Great question. I would add "odds" and "chance" in there too :) – Neil McGuigan Sep 14 '10 at 5:28

6 I think you should take a look at this question stats.stackexchange.com/questions/665/… because Likelihood is for statistical purposes and probability for probability. – robin girard Sep 14 '10 at 6:04

4 Wow, these are some really good answers. So a big thanks for that! Some point soon, I'll pick one I
particularly like as the "accepted" answer (although there are several that I think are equally deserved). –
Douglas S. Stones Sep 15 '10 at 1:13

1 Also note that the "likelihood ratio" is actually a "probability ratio" since it is a function of the observations. –
JohnRos Nov 2 '11 at 10:29

11 Answers

381
The answer depends on whether you are dealing with discrete or continuous random variables. So, I will split my answer accordingly. I will assume that you want some technical details and not necessarily an explanation in plain English.

Discrete Random Variables

Suppose that you have a stochastic process that takes discrete values (e.g., outcomes of tossing
a coin 10 times, number of customers who arrive at a store in 10 minutes etc). In such cases, we
can calculate the probability of observing a particular set of outcomes by making suitable assumptions about the underlying stochastic process (e.g., probability of coin landing heads is p and that coin tosses are independent).

Denote the observed outcomes by O and the set of parameters that describe the stochastic
process as θ. Thus, when we speak of probability we want to calculate P (O|θ). In other words,
given specific values for θ, P (O|θ) is the probability that we would observe the outcomes
represented by O .

However, when we model a real life stochastic process, we often do not know θ. We simply
observe O and the goal then is to arrive at an estimate for θ that would be a plausible choice
given the observed outcomes O . We know that given a value of θ the probability of observing O
is P (O|θ). Thus, a 'natural' estimation process is to choose that value of θ that would maximize
the probability that we would actually observe O . In other words, we find the parameter values θ
that maximize the following function:

L(θ|O) = P (O|θ)

L(θ|O) is called the likelihood function. Notice that by definition the likelihood function is
conditioned on the observed O and that it is a function of the unknown parameters θ.
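The estimation recipe in this answer can be sketched numerically (a hypothetical illustration, not part of the original answer): for, say, 7 heads observed in 10 independent tosses, evaluate L(θ|O) = P(O|θ) over a grid of candidate θ and pick the maximizer.

```python
from math import comb

# Hypothetical observed outcomes O: 7 heads in 10 independent coin tosses.
heads, n = 7, 10

def likelihood(theta):
    # L(theta | O) = P(O | theta): binomial probability of the observed data.
    return comb(n, heads) * theta**heads * (1 - theta)**(n - heads)

# Choose the theta that maximizes the probability of what we actually observed.
grid = [i / 1000 for i in range(1001)]
theta_hat = max(grid, key=likelihood)
print(theta_hat)  # 0.7, the maximum likelihood estimate
```

As expected, the maximizer coincides with the sample proportion heads/n.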

Continuous Random Variables

In the continuous case the situation is similar with one important difference. We can no longer talk about the probability that we observed O given θ because in the continuous case P (O|θ) = 0. Without getting into technicalities, the basic idea is as follows:

Denote the probability density function (pdf) associated with the outcomes O as: f (O|θ) . Thus,
in the continuous case we estimate θ given observed outcomes O by maximizing the following
function:

L(θ|O) = f (O|θ)

In this situation, we cannot technically assert that we are finding the parameter value that maximizes the probability of observing O; rather, we maximize the PDF associated with the observed outcomes O.
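A continuous-case sketch under the same logic (hypothetical data, normal model with known unit variance): the θ maximizing the joint density f(O|θ) is the sample mean.

```python
import math

# Hypothetical continuous observations O.
data = [2.1, 1.9, 2.4, 2.0, 2.6]

def log_likelihood(theta):
    # log L(theta | O) = sum of log N(x; theta, 1) densities over the data.
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - theta) ** 2
               for x in data)

grid = [i / 1000 for i in range(1000, 3001)]  # candidate means in [1.0, 3.0]
theta_hat = max(grid, key=log_likelihood)
print(theta_hat)  # 2.2, the sample mean
```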

edited Mar 24 '19 at 14:27 by nbro · answered Sep 14 '10 at 6:08 by user28

41 The distinction between discrete and continuous variables disappears from the point of view of measure
theory. – whuber ♦ Sep 14 '10 at 15:48

27 @whuber yes but an answer using measure theory is not that accessible to everyone. – user28 Sep 14 '10
at 20:09

18 @Srikant: Agreed. The comment was for the benefit of the OP, who is a mathematician (but perhaps not a
statistician) to avoid being misled into thinking there is something fundamental about the distinction. –
whuber ♦ Sep 14 '10 at 20:36

7 You can interpret a continuous density the same as the discrete case if O is replaced by dO, in the sense that if we ask for P r(O ∈ (O′ , O′ + dO′ )|θ) (i.e. the probability that the data O is contained in an infinitesimal region about O′ ) then the answer is f (O′ |θ)dO′ (the dO′ makes it clear that we are calculating the area of an infinitesimally thin "bin" of a histogram). – probabilityislogic Jan 28 '11 at 13:40

10 I am over 5 years late to the party, but I think that a very crucial follow-up to this answer would be stats.stackexchange.com/questions/31238/… which stresses the fact that the likelihood function L(θ) is not a pdf with respect to θ. L(θ) is indeed a pdf of data given the parameter value, but since L is a function of θ alone (with data held as a constant), it is irrelevant that L(θ) is a pdf of data given θ. – Shobhit Jan 8 '16 at 16:04

This is the kind of question that just about everybody is going to answer and I would expect all the
answers to be good. But you're a mathematician, Douglas, so let me offer a mathematical reply.
158
A statistical model has to connect two distinct conceptual entities: data, which are elements x of
some set (such as a vector space), and a possible quantitative model of the data behavior.
Models are usually represented by points θ on a finite dimensional manifold, a manifold with
boundary, or a function space (the latter is termed a "non-parametric" problem).

The data x are connected to the possible models θ by means of a function Λ(x, θ). For any given θ, Λ(x, θ) is intended to be the probability (or probability density) of x. For any given x, on the other hand, Λ(x, θ) can be viewed as a function of θ and is usually assumed to have certain nice properties, such as being continuously second differentiable. The intention to view Λ in this way and to invoke these assumptions is announced by calling Λ the "likelihood."

It's quite like the distinction between variables and parameters in a differential equation:
sometimes we want to study the solution (i.e., we focus on the variables as the argument) and
sometimes we want to study how the solution varies with the parameters. The main distinction is
that in statistics we rarely need to study the simultaneous variation of both sets of arguments;
there is no statistical object that naturally corresponds to changing both the data x and the model
parameters θ. That's why you hear more about this dichotomy than you would in analogous
mathematical settings.
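One way to see this point concretely (a toy sketch, not from the answer; binomial model): a single two-argument function Λ(x, θ), sliced one way, is a probability distribution over the data; sliced the other way, it is a likelihood function of the parameter.

```python
from math import comb

def Lam(x, theta, n=10):
    # Lambda(x, theta): probability of x heads in n = 10 tosses
    # when the head-probability is theta.
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# Fix theta = 0.4: a probability distribution over x (sums to 1).
probs = [Lam(x, 0.4) for x in range(11)]
print(sum(probs))  # 1.0 up to rounding

# Fix x = 4: a likelihood function of theta (no such normalization).
liks = {theta: Lam(4, theta) for theta in (0.2, 0.4, 0.6, 0.8)}
print(max(liks, key=liks.get))  # 0.4 is the most likely of these candidates
```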

edited Jul 30 '19 at 12:19 · answered Sep 14 '10 at 15:45 by whuber ♦

7 +1, what a cool answer. Analogy with differential equations seems very appropriate. – mpiktas Mar 5 '12 at 20:15

3 As an economist, although this answer does not relate as closely as the previous to the concepts I've
learnt, it was the most informative one in an intuitive sense. Many thanks. – Robson Jan 20 '16 at 14:25

1 Actually, this statement is not really true: "there is no statistical object that naturally corresponds to changing both the data x and the model parameters θ." There is, it's called "smoothing, filtering, and prediction"; in linear models it's the Kalman filter, in nonlinear models they have the full nonlinear filters, en.wikipedia.org/wiki/Kushner_equation etc. – crow Nov 17 '17 at 18:26

1 Yes, great answer! As lame as this sounds, by choosing Λ (x, θ) instead of the standard notation of
P (x, θ) , it made it easier for me to see that we're starting off with a joint probability that can be defined as

either a likelihood or a conditional probability. Plus, the "certain nice properties" comment helped. Thanks! –
Mike Williamson Aug 18 '18 at 19:20

2 @whuber Yes, I know Λ isn't the usual notation. That's exactly why it helped! I stopped thinking that it must
have a particular meaning and instead just followed the logic. ;-p – Mike Williamson Aug 19 '18 at 15:14

I'll try and minimise the mathematics in my explanation as there are some good mathematical
explanations already.
126
As Robin Girard comments, the difference between probability and likelihood is closely related to
the difference between probability and statistics. In a sense probability and statistics concern
themselves with problems that are opposite or inverse to one another.

Consider a coin toss. (My answer will be similar to Example 1 on Wikipedia.) If we know the coin
is fair (p = 0.5 ) a typical probability question is: What is the probability of getting two heads in a
row. The answer is P (H H ) = P (H ) × P (H ) = 0.5 × 0.5 = 0.25 .

A typical statistical question is: Is the coin fair? To answer this we need to ask: To what extent does our sample support our hypothesis that P (H ) = P (T ) = 0.5?

The first point to note is that the direction of the question has reversed. In probability we start with
an assumed parameter (P (head)) and estimate the probability of a given sample (two heads in
a row). In statistics we start with the observation (two heads in a row) and make INFERENCE
about our parameter (p = P (H ) = 1 − P (T ) = 1 − q ).

Example 1 on Wikipedia shows us that the maximum likelihood estimate of P (H ) after 2 heads in a row is p_MLE = 1. But the data in no way rule out the true parameter value p(H ) = 0.5 (let's not concern ourselves with the details at the moment). Indeed only very small values of p(H ), and particularly p(H ) = 0, can be reasonably eliminated after n = 2 (two throws of the coin). After the third throw comes up tails we can now eliminate the possibility that P (H ) = 1.0 (i.e. it is not a two-headed coin), but most values in between can be reasonably supported by the data. (An exact binomial 95% confidence interval for p(H ) is 0.094 to 0.992.)

After 100 coin tosses and (say) 70 heads, we now have a reasonable basis for the suspicion that
the coin is not in fact fair. An exact 95% CI on p(H ) is now 0.600 to 0.787 and the probability of
observing a result as extreme as 70 or more heads (or tails) from 100 tosses given p(H ) = 0.5
is 0.0000785.

Although I have not explicitly used likelihood calculations this example captures the concept of
likelihood: Likelihood is a measure of the extent to which a sample provides support for
particular values of a parameter in a parametric model.
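The tail probability quoted above can be checked with a few lines of standard-library Python (a sketch; the exact confidence-interval computation is omitted, but the tail probability is direct):

```python
from math import comb

def binom_tail(k, n, p):
    # P(X >= k) for X ~ Binomial(n, p)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Probability of a result as extreme as 70 or more heads (or tails)
# in 100 tosses of a fair coin: two symmetric tails.
p_extreme = 2 * binom_tail(70, 100, 0.5)
print(p_extreme)  # about 7.85e-05, matching the answer's 0.0000785
```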

edited Jul 28 at 9:27 by Joffer · answered Sep 14 '10 at 8:45 by Thylacoleo

3 Great answer! Especially the three last paragraphs are very useful. How would you extend this to describe
the continuous case? – Demetris Sep 2 '14 at 11:53

8 For me, best answer. I don't mind math at all, but for me math is a tool ruled by what I want (I don't enjoy
math for its own sake, but for what it helps me do). Only with this answer do I know the latter. – Mörre Apr
20 '15 at 13:28

I will give you the perspective from the view of Likelihood Theory which originated with Fisher --
and is the basis for the statistical definition in the cited Wikipedia article.
73
Suppose you have random variates X which arise from a parameterized distribution F (X; θ),
where θ is the parameter characterizing F . Then the probability of X = x would be:
P (X = x) = F (x; θ) , with known θ.

More often, you have data X and θ is unknown. Given the assumed model F , the likelihood is
defined as the probability of observed data as a function of θ: L(θ) = P (θ; X = x) . Note that
X is known, but θ is unknown; in fact the motivation for defining the likelihood is to determine the parameter of the distribution.
Although it seems like we have simply re-written the probability function, a key consequence of
this is that the likelihood function does not obey the laws of probability (for example, it's not bound
to the [0, 1] interval). However, the likelihood function is proportional to the probability of the
observed data.
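A quick numerical illustration of this point (hypothetical sketch): after observing 2 heads in 2 tosses, the likelihood L(θ) = θ² integrates to 1/3 over θ ∈ [0, 1], so it is not a probability density for θ.

```python
# Likelihood of theta after observing 2 heads in 2 tosses: L(theta) = theta^2.
def L(theta):
    return theta ** 2

# Midpoint-rule integral of L over theta in [0, 1]: a pdf would give 1.
n = 100_000
integral = sum(L((i + 0.5) / n) for i in range(n)) / n
print(round(integral, 4))  # 0.3333, i.e. 1/3, not 1
```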

This concept of likelihood actually leads to a different school of thought, "likelihoodists" (distinct from frequentist and Bayesian), and you can search for the various historical debates. The cornerstone is the Likelihood Principle, which essentially says that we can perform inference directly from the likelihood function (neither Bayesians nor frequentists accept this since it is not probability-based inference). These days a lot of what is taught as "frequentist" in schools is actually an amalgam of frequentist and likelihood thinking.

For deeper insight, a nice start and historical reference is Edwards' Likelihood. For a modern
take, I'd recommend Richard Royall's wonderful monograph, Statistical Evidence: A Likelihood
Paradigm.

edited Oct 23 '15 at 10:09 by hello_there_andy · answered Sep 14 '10 at 5:16 by ars

3 Interesting answer, I actually thought that the "likelihood school" was basically the "frequentists who don't
design samples school", while the "design school" was the rest of the frequentists. I actually find it hard
myself to say which "school" I am, as I have a bit of knowledge from every school. The "Probability as
extended logic" school is my favourite (duh), but I don't have enough practical experience in applying it to
real problems to be dogmatic about it. – probabilityislogic Jan 28 '11 at 13:53

5 +1 for "the likelihood function does not obey the laws of probability (for example, it's not bound to the [0, 1] interval). However, the likelihood function is proportional to the probability of the observed data." – Walrus the Cat Jun 13 '14 at 22:27

10 "the likelihood function does not obey the laws of probability" could use some further clarification, especially since it was written as L(θ) = P(θ; X = x), i.e. equated with a probability! – redcalx Apr 3 '15 at 19:53

Thanks for your answer. Could you please address the comment that @locster made? –
Vivek Subramanian Jul 1 '15 at 11:48

2 To me as a not mathematician, this reads like religious mathematics, with different beliefs resulting in
different values for chances of events to occur. Can you formulate it, so that it is easier to understand what
the different beliefs are and why they all make sense, instead of one being simply incorrect and the other
school / belief being correct? (assumption that there is one correct way of calculating chances for events to
occur) – Zelphir Kaltstahl Jul 26 '16 at 13:11

Given all the fine technical answers above, let me take it back to language: Probability quantifies
anticipation (of outcome), likelihood quantifies trust (in model).
67
Suppose somebody challenges us to a 'profitable gambling game'. Then, probabilities will serve us to compute things like the expected profile of your gains and losses (mean, mode, median, variance, information ratio, value at risk, gambler's ruin, and so on). In contrast, likelihood will serve us to quantify whether we trust those probabilities in the first place; or whether we 'smell a rat'.

Incidentally -- since somebody above mentioned the religions of statistics -- I believe likelihood
ratio to be an integral part of the Bayesian world as well as of the frequentist one: In the Bayesian
world, Bayes formula just combines prior with likelihood to produce posterior.

answered Apr 13 '13 at 20:49 by Gypsy

1 This answer sums it up for me. I had to think through what it meant when I read that likelihood is not
probability, but the following case occurred to me. What is the likelihood that a coin is fair, given that we see
four heads in a row? We can't really say anything about probability here, but the word "trust" seems apt. Do
we feel we can trust the coin? – dnuttle Jul 23 '18 at 12:21

2 Initially this might have been the historically intended purpose of likelihoods, but nowadays likelihoods are in every Bayesian calculation, and it's known that probabilities can amalgamate beliefs and plausibility, which is why the Dempster-Shafer theory was created, to disambiguate both interpretations. – gaborous Sep 3 '19 at 8:39

58
Suppose you have a coin with probability p to land heads and (1 − p) to land tails. Let x = 1 indicate heads and x = 0 indicate tails. Define f as follows:

f(x, p) = p^x (1 − p)^(1−x)

f(x, 2/3) is the probability of x given p = 2/3; f(1, p) is the likelihood of p given x = 1. Basically, likelihood vs. probability tells you which argument of the density is considered to be the variable.
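This answer's f can be typed in directly (a small sketch): fixing p gives a probability function of x whose values sum to 1 over x ∈ {0, 1}; fixing x gives a likelihood function of p with no such constraint.

```python
def f(x, p):
    # f(x, p) = p^x * (1 - p)^(1 - x), for x in {0, 1}
    return p**x * (1 - p)**(1 - x)

# Probability view: fix p = 2/3 and vary x. The values sum to 1.
print(f(1, 2/3) + f(0, 2/3))  # 1.0 up to rounding

# Likelihood view: fix x = 1 (heads) and vary p. f(1, p) = p.
print(f(1, 0.5), f(1, 0.9))  # larger p makes the observed head more likely
```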

edited Jan 10 '19 at 12:08 by Glorfindel · answered Sep 14 '10 at 6:04 by Yaroslav Bulatov

1 Nice complement to the theoretical definitions used above! – Frank Meulenaar Sep 17 '11 at 10:47

I see that C(k, n) p^n (1 − p)^(k−n) gives the probability of having n heads in k trials. Your p^x (1 − p)^(1−x) looks like the k-th root of that: x = n/k. What does it mean? – Little Alien Sep 1 '16 at 13:29

@LittleAlien what is C(k, n) in your equation? – GENIVI-LEARNER Jan 25 at 1:02

1 @GENIVI-LEARNER C(k, n) is the binomial coefficient (see en.wikipedia.org/wiki/Binomial_coefficient). It allows you to calculate the probability of seeing different combinations of heads and tails (for example: HTT, THT, TTH for n = 3, k = 1), instead of all heads or all tails using the simpler f(x, p) = p^x (1 − p)^(n−k) formula. – RobertF Apr 22 at 17:31

56
If I have a fair coin (parameter value) then the probability that it will come up heads is 0.5. If I flip a coin 100 times and it comes up heads 52 times then it has a high likelihood of being fair (the numeric value of the likelihood potentially taking a number of forms).

answered Sep 14 '10 at 3:44 by John

9 This and Gypsy's answer should be on top! Intuition and clarity above dry mathematical rigor, not to say
something more derogatory. – Nemanja Radojković Jun 5 '18 at 22:33

is there an intuitive explanation for the formula for calculating likelihood, like we have for binomial
distribution formula that calculates probability ? – d-_-b Jul 18 at 5:28

That sounds like it should be posted as its own question – John Jul 18 at 22:56

31
P (x|θ) can be seen from two points of view:

As a function of x, treating θ as known/observed. If θ is not a random variable, then P (x|θ) is called the (parameterized) probability of x given the model parameters θ, which is sometimes also written as P (x; θ) or P_θ(x). If θ is a random variable, as in Bayesian statistics, then P (x|θ) is a conditional probability, defined as P (x ∩ θ)/P (θ).

As a function of θ, treating x as observed. For example, when you try to find a certain assignment θ̂ for θ that maximizes P (x|θ), then P (x|θ̂) is called the maximum likelihood of θ given the data x, sometimes written as L(θ̂|x). So, the term likelihood is just shorthand to refer to the probability P (x|θ) for some data x that results from assigning different values to θ (e.g. as one traverses the search space of θ for a good solution). So, it is often used as an objective function, but also as a performance measure to compare two models as in Bayesian model comparison.

Often, this expression is still a function of both its arguments, so it is rather a matter of emphasis.

edited Oct 16 '17 at 13:06 · answered Nov 27 '15 at 13:41 by Lenar Hoyt

For the second case, I thought people usually write P(theta|x). – yuqian Jan 4 '16 at 20:40

Originally intuitively I already thought they're both words for the same with a difference in perspective or
natural language formulation, so I feel like "What? I was right all along?!" But if this is the case, why is
distinguishing them so important? English not being my mother tongue, I grew up with only one word for
seemingly both of the terms (or have I simply never gotten a problem where I needed to distinguish the
terms?) and never knew there was any difference. It's only now, that I know two English terms, that I begin
to doubt my understanding of these things. – Zelphir Kaltstahl Jul 26 '16 at 13:32

3 Your answer seems to be very comprehensive and is easy to understand. I wonder why it got so few upvotes. – Funkwecker Feb 16 '17 at 7:18

4 Note that P(x|θ) is a conditional probability only if θ is a random variable, if θ is a parameter then it's
simply the probability of x parameterized by θ. – Mircea Mironenco May 9 '17 at 19:49

i think this is the best answer amongst all – Aerin Oct 11 '17 at 7:08

do you know the pilot to the tv series "num3ers" in which the FBI tries to locate the home base of
a serial criminal who seems to choose his victims at random?
7
the FBI's mathematical advisor and brother of the agent in charge solves the problem with a
maximum likelihood approach. first, he assumes some "gugelhupf shaped" probability p(x|θ)
that the crimes take place at locations x if the criminal lives at location θ. (the gugelhupf
assumption is that the criminal will neither commit a crime in his immediate neighbourhood nor
travel extremely far to choose his next random victim.) this model describes the probabilities for
different x given a fixed θ. in other words, pθ (x) = p(x|θ) is a function of x with a fixed
parameter θ.

of course, the FBI doesn't know the criminal's domicile, nor does it want to predict the next crime
scene. (they hope to find the criminal first!) it's the other way round, the FBI already knows the
crime scenes x and wants to locate the criminal's domicile θ.

so the FBI agent's brilliant brother has to try and find the most likely θ among all values possible,
i.e. the θ which maximises p(x|θ) for the actually observed x. therefore, he now considers
lx (θ) = p(x|θ) as a function of θ with a fixed parameter x. figuratively speaking, he shoves his gugelhupf around on the map until it optimally "fits" the known crime scenes x. the FBI then goes knocking on the door in the center θ̂ of the gugelhupf.

to stress this change of perspective, lx (θ) is called the likelihood (function) of θ, whereas pθ (x)
was the probability (function) of x. both are actually the same function p(x|θ) but seen from
different perspectives and with x and θ switching their roles as variable and parameter,
respectively.
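The story above is maximum likelihood in one picture; a toy one-dimensional version (entirely made-up numbers, with a Gaussian ring standing in for the gugelhupf) might look like:

```python
import math

# Hypothetical 1-D crime-scene locations x.
scenes = [3.1, 6.8, 2.9, 7.2, 3.3, 7.0]

def p(x, theta, mu=2.0, sigma=0.5):
    # "Gugelhupf-shaped" density (unnormalized): crimes tend to happen at
    # distance mu from the criminal's home theta, neither next door nor far.
    d = abs(x - theta)
    return math.exp(-((d - mu) ** 2) / (2 * sigma**2))

def log_likelihood(theta):
    return sum(math.log(p(x, theta)) for x in scenes)

# Shove the gugelhupf around the map: grid search for the most likely home.
grid = [i / 100 for i in range(1001)]  # theta in [0, 10]
theta_hat = max(grid, key=log_likelihood)
print(theta_hat)  # 5.05, between the two clusters of scenes
```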

edited Nov 25 '19 at 17:30 · answered Jun 27 '19 at 22:37 by schotti

5
As far as I'm concerned, the most important distinction is that likelihood is not a probability (of θ).

In an estimation problem, X is given and the likelihood P (X|θ) describes a distribution of X rather than of θ. That is, ∫ P (X|θ)dθ is meaningless, since the likelihood is not a pdf of θ, though it does characterize θ to some extent.

answered Nov 6 '17 at 9:45 by Response777

1 As the answer from @Lenar Hoyt points out, if theta is a random variable (which it can be), then likelihood
is a probability. So the real answer seems to be that the likelihood can be a probability, but is sometimes
not. – Mike Wise Dec 5 '17 at 17:47

@MikeWise, I think theta could always be viewed as a "random" variable, while chances are that it is just
not so "random"... – Response777 Dec 6 '17 at 15:18

2
If we put the conditional probability interpretation aside, you can think of it in this way:

In probability you usually want to find the probability of a possible event based on a model/parameter/probability distribution, etc.

In likelihood you have observed some outcome, so you want to find/create/estimate the most likely source/model/parameter/probability distribution from which this event has arisen.

edited Jan 25 at 11:24 · answered Nov 6 '19 at 12:00 by Ahmad

1 This seems to me to miss the point completely. Probability and likelihood are not to be distinguished in this
way. (My edits are only linguistic.) – Nick Cox Nov 24 '19 at 12:18

1 Sorry, but formal or informal style isn't the issue. The distinction isn't in terms of past and future. This only adds confusion to the thread, and I have downvoted it as wrong. – Nick Cox Nov 24 '19 at 12:40
1 @NickCox I'm not a statistician, but isn't probability about events we don't know the result of beforehand? And likelihood about observations? And an observation is an event that has occurred! I really don't want to be very pedantic, just an intuition that works in most situations. – Ahmad Nov 24 '19 at 12:44

1 The thread already has several excellent, much upvoted answers. That is not a situation in which anyone
not confident of their expertise need or should add another. Any interest in the future is not the issue as in
practice both probability and likelihood are calculated from data already to hand. – Nick Cox Nov 24 '19 at
13:45

1 -1 Intuitive answers are good--when they are correct. This one just is misleading and wrong. – whuber ♦
Nov 24 '19 at 22:25
