
OK, this is totally what I want to hear.

Right.

each

Note that to calculate this, you'd first reduce 6/36 to 1/6.

I see. This helps situate the counting techniques surveyed in Bertsekas.

So, this is obviously very easy.

OK, so this analogy of filling slots certainly makes sense of/justifies this as giving the number of permutations. (But review the previous page if the justification loses its clarity. The justification becomes clear when one thinks of the tree: the first selection defines the number of branches in the first round; then the second is the number of sub-branches on each of these; etc. That's why you MULTIPLY the numbers together.)
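
The slot-filling multiplication can be sketched in a few lines of Python (the 52/5 numbers are just an assumed illustration, not from the book's example):

```python
import math

# Filling k slots from n items: n choices for slot 1, n-1 for slot 2, ...
# Multiplying works because each branch of the tree splits into that many
# sub-branches at the next level.
def permutations_count(n, k):
    count = 1
    for slot in range(k):
        count *= (n - slot)  # choices remaining for this slot
    return count

print(permutations_count(52, 5))   # 52*51*50*49*48
print(math.perm(52, 5))            # stdlib check: same value
```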

Mmm. Again, this analogy helps a lot.

One feels it has been treated far more simply than in Bertsekas.

As per what has been just worked out.

This language is less meaningful to me, but I think it will come up quite a bit.

Again, it feels that the discussion has been infinitely more simple and smooth. Although possibly we were tackling some more complicated cases?

So, this is just plugging into the formula.

This explanation doesn't make much sense to me, although if one refers to the table, the first column of which is the List of Committees, one gets some sense of what is meant.

Note that this certainly works out by the formula, where in the denominator you multiply r! by (n-r)!, for then your numbers will just swap places in terms of the variable that represents them. Instead of 'r' being 4, it will be 13. Then 'n-r' will be 4, instead of 13.
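
A one-line check of this symmetry (taking the note's numbers r = 4 and n - r = 13, so n = 17):

```python
import math

# C(17, 4) and C(17, 13) share the same denominator 4! * 13!,
# with the two factorials merely swapped, so the counts are equal.
print(math.comb(17, 4), math.comb(17, 13))  # both 2380
```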

Recall that 0! = 1.

Quite interesting, to meet a case where we actually see the mathematical arbitrary definition in action.

Mmm, yes indeed.

Yes, OK.

First, note that it is actually OBVIOUS that the answer is 5/52. There is one Qs, so the chances of picking it from the pack are 1/52. But you get 5 picks in a hand.

The key is that they're thinking of the different 5-card combinations that exist, which include a Qs. If you then divide this by the total number of hands (5-card combinations), you've got the probability.
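
That count-of-hands reasoning can be checked directly; fixing the Qs and choosing the other 4 cards from the remaining 51 gives the numerator:

```python
import math
from fractions import Fraction

# Hands containing the queen of spades, divided by all 5-card hands.
p = Fraction(math.comb(51, 4), math.comb(52, 5))
print(p)  # 5/52, matching the "one card in 52, but five picks" intuition
```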

That is, combinations of cards, which also include the Qs.

The manipulations here are pretty easy.

Note that "favorable" here means NOT getting a Qs.

Yep

So, this is the number of possible COMBINATIONS of all-heart hands (or indeed, 5 cards all of any single suit).

Recall (page 19) the analogy of filling 5 slots, where initially we choose from a pool of 13, then from 12, etc.

So, note this. Although the answer may turn out the same, the fractions in (6) and (7) are not the same. The numbers in (7) are bigger. We are dividing a bigger numerator by a bigger denominator - so, in the end, the ratio is the same.
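
This "bigger numerator over bigger denominator, same ratio" point can be verified for the all-hearts hand: the ordered counts are each 5! times larger, and the factor cancels.

```python
import math
from fractions import Fraction

by_perms = Fraction(math.perm(13, 5), math.perm(52, 5))  # ordered hands: bigger numbers
by_combs = Fraction(math.comb(13, 5), math.comb(52, 5))  # unordered hands
print(by_perms == by_combs)  # True: the 5! overcount cancels top and bottom
```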

That you should multiply these individual combinations together is not obvious. If thinking of filling slots, the point is that for each way of filling slot 1, you then have a whole number of ways that filling slot 2 can be paired with this. Similarly for the next way of filling the slot. Or, on the tree representation, each of the branches has a number of sub-branches corresponding to, you know, the number of possibilities at this second level.

So, what I said about multiplication in the note, applies identically here.

I guess one way to appreciate this, is to see it as the first splitting of the tree.

So, there is nothing going on here that is different from the other cases.

Not too sure about this.

This is the vowel. I suppose choosing 1 out of 5 reduces to this.

OK, so basically, I think this term is maybe indicating that we can arrange each of our chosen elements in a different order. For EACH set of seven elements we choose, you know, we can arrange them in different slots, and there are 7! options each time.

26 letters and 10 digits. And (page 19) this is the way you count a permutation of, you know, n elements from among a larger number.

So, maybe an illuminating summary description of this is to say that the permutations are occurring WITHIN the combination(s). I must say that I feel I'd have no chance of working this out on my own. (See below).

....OK, so this at least gives us some insight into the sort of method we might use to solve such problems. One follows something of a script.

I think that I can appreciate that I might develop an "eye" for spotting things like this more easily.

Boiling things down into elementary situations like this, helps.

Yes, clear.

So, it IS the case that there is a different overall probability when you are drawing with, versus without, replacement.

OK.

What is a bit difficult to see is why it is called specifically "symmetry", as opposed to, say, "equivalence". I suppose it is because you have a situation of "unchanged/indistinguishable after transformation".

So, note that this was given in Bertsekas (p24) as one of the 'properties' of probability laws that could be derived from the basic axioms. But, so, I remain a little confused as to whether, by contrast, it is simply a definition of exclusive OR. Basically, the ensuing discussion below makes things clear. Simply - it is surprising.

So, what is at first surprising, is that it should apparently ALWAYS work out that, in adding back in the events that are situated where ALL THREE overlap, one will be adding back in ONLY those that have been removed twice. However, note that actually, in the example, p4 is subtracted three times, rather than just twice, so it seems the procedure is not sensitive to this. ....I still don't SEE how it works though.
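
A small concrete check may make the "adding back in" visible. For three sets, an element lying in all three is counted 3 times in the single terms, subtracted 3 times (once per pairwise intersection), then added back once: net count 1, as required. The sets below are invented for illustration.

```python
A = {1, 2, 3, 4}
B = {2, 3, 4, 5}
C = {3, 4, 5, 6}
lhs = len(A | B | C)
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(A & C) - len(B & C)
       + len(A & B & C))
# Elements 3 and 4 are in all three sets: +3 - 3 + 1 = 1 each.
print(lhs, rhs)  # 6 6
```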

This feels like a new way of going about things that they have followed here. Perhaps follows from the fact that we are removing (are required to remove) ALL the instances of our required type, from the pool. (Not the case, since the same method is applied next page).

Since it is a case of OR, we are not, like, drawing from the pack, and then drawing a further 4 from the depleted pack. So, you know, the sample space remains.

So, this is the prob of 8 specified cards

Possibly the same condition as Bertsekas calls "disjoint".

It feels quite instructive to consider the problem "all spades in a hand of 5" without the buildup of the previous pages. You might begin thinking "13 in a suit, out of 52". But then also "But we only need 5 specific cards out of the 52". Etc.

So '52 choose 5' is your total sample space, your total number of possible hands. '48 choose 5' is your total number of non-ace hands. So basically we take the COMPLEMENT of this event, to get the prob of the event: 'at least one ace'.
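
The complement computation, written out:

```python
import math
from fractions import Fraction

p_no_ace = Fraction(math.comb(48, 5), math.comb(52, 5))  # all 5 cards from the 48 non-aces
p_at_least_one_ace = 1 - p_no_ace
print(float(p_at_least_one_ace))  # about 0.34
```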

Yes, note that, since we are pairing terms together. So, it is a committee/combination of 2 terms (since order doesn't matter).

Again, note how here we are giving the probability of getting a specific card (ace), via giving the probability of getting a hand containing the OTHER components of that hand-which-has-an-ace. What this expression says is that, among the total set of hands (52 choose 5), there are the given number of different hands containing an ace. The number (51 choose 4) is the number of hands that contain 1 ace. We divide that by the number of hands simpliciter. The key is this. Don't try to jump the gun. The numerator term is simply giving the NUMBER OF HANDS containing an ace. It is the expression as a whole that gives the PROB. The numerator should be seen as obviously giving that, because we can choose any 4 cards, from amongst 51.

So note how, because there is a common denominator, and it is fractional addition, you just end up with this denominator.

So, I think that here, we are dealing with MUTUALLY EXCLUSIVE events, so our 'or's can be turned into simple additions. Regarding the example as a whole, clearly, it makes sense that, to get at "at most", you take the COMPLEMENT of the probability of the events that EXCEED that number.

Again, the familiar method being employed. Numerators are giving the NUMBER OF HANDS containing required number of (or absence of) spades. Denominator is giving total number of hands.

So, the first term gives the number of hands, amongst the total, that contain 2 spades. The second term subtracts away, from amongst these hands (which will contain also all those hands containing hearts), the number of hands that contain 2 spades AND which, among the other 3 cards, contain no cards from this other 13-card suit (hearts).

Pretty obvious what is happening in this case, I think.

So, it is the inclusive or. So, we are getting the probability of getting either.

...and it is the complement of this that we calculated, above.

One way to think about this which seems to work, is to consider the MINIMAL way you negate the event. Thus, with at least one of each suit, the condition is broken as soon as you don't have a member of ONE of the suits.

Should be a fair bit steeper.

So, this method is also used in Bertsekas, p23.

So, one appreciates that this is an extremely useful visualization.

Makes sense to do this. Such an area is basically the size of the sample space.

So, this is very easy.

I think it is a piece of luck that the area here can be computed so easily, i.e. that the triangle happens to have two equal sides.

This is a bit abstract to know exactly what is meant.

Yeah, fine.

See Bertsekas p22. The reason is that, if there were a positive probability and you added together all such, they would exceed 1, being infinite in number.

Interesting.

This is the 'universe' specifying possible positions of the pin. All possible combinations of angles and values of 'y'.

This integral, and re-arrangement, is actually fine.

This thing of the specification of a new universe is mentioned also by Bertsekas, p29-30.

Quite an understandable way of putting it, I think.

The formula given in Bertsekas p29-30, where it is derived more formally.

So, note that you are getting the double fraction because the numerators, in each case, are giving you numbers of hands, and the denominators are turning the expressions into probabilities. But then, you know, this is the case for each probability there, either side of the fraction in the formula.
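
The cancellation of the common denominator can be shown with stand-in counts (the specific numbers below are hypothetical, not the book's):

```python
from fractions import Fraction

n_total = 2598960   # all 5-card hands
n_A = 100_000       # stand-in count of hands in the conditioning event A
n_A_and_B = 40_000  # stand-in count of hands in both A and B

# P(B|A) = P(A and B) / P(A): the n_total denominators cancel,
# leaving a plain ratio of hand counts.
p = Fraction(n_A_and_B, n_total) / Fraction(n_A, n_total)
print(p == Fraction(n_A_and_B, n_A))  # True
```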

So, note the point that the "at least one picture" in the numerator here, is the conditioning event A, in the numerator from the formula P(A and B)

Fine with this.

Not sure how this simplification works out.

Oh yes, I see this.

Stated in Bertsekas p32.

So, this is getting to what Bertsekas calls the Multiplication Rule, page 34.

Clearly, this is drawing on the combinations formula (page 20). But I don't see why it is the reciprocal, nor how this instantiates the stuff just covered. OK, so, first, it is not supposed to instantiate the new material (shown instead in the next method). Second, there is only one favourable committee. The number of combinations giving the total, is indeed supplied by the combinations rule, and it is the reciprocal, you know, just because that total is on the denominator.

OK, this example makes a lot of intuitive sense of this expansion of the conditioning rule, the Chain Rule or Multiplication Rule from Bertsekas.

Simple stuff, obviously.

So, obviously I'm not totally confident about what is going on with sequences/series, but basically, you have this formula (namely, here, a/(1-r)) that gives the SUM of the series, appropriate for cases where the series is of the form a + ar + ar^2 + etc. So apparently, you can just get the probability from the formula.
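
A quick numeric check of the geometric-series formula (the values of a and r are assumed; any |r| < 1 works):

```python
a, r = 0.5, 0.25
partial = sum(a * r**k for k in range(200))  # a + a*r + a*r**2 + ...
closed_form = a / (1 - r)
print(partial, closed_form)  # both 2/3
```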

Interesting.

I think this may be using the chain rule for AND, from above.

Not sure how this is solved

Mmmm, yes, OK.

OK. Don't think this situation is covered in Bertsekas.

An interesting distinction.

I.e. each of the permutations.

The places you pick are going to end up determining the permutation.
OK. I think I fairly follow, although why we use division, you know, as opposed to subtraction or something - I don't know.

OK yes, I follow this, as a derivation, much better.

Fine.

Again, fine.

Quite an interesting property; worth thinking about.

Yes, OK. Very simple.

OK...... So, this is just drawing on the analysis of the previous chapter.

OK, this example seems simple enough.

One can see it. And its useful!

Ah yes, indeed one sees that it is a special case.

The probability of the outcome 'k', times the number of such trials; same with 'q'.

Permutations.

OK, interesting.

Fine.

Yes OK. Just note how it is set up.

Mmmm. So, there is no way that I would have been able to know that these two situations should be treated in these very different ways. I suppose that, if you think about it, in the first case, you have the extra "10 choose 8" because you are giving the NUMBER OF PERMUTATIONS that have the requisite number of successes. Whereas in the latter case it is just this single permutation.

OK......

'k' successes in 'n' trials.
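
The binomial formula as a small function, using the note's "10 choose 8" case (p = 0.5 is assumed for illustration):

```python
import math

def binomial_pmf(k, n, p):
    # C(n, k) counts the orderings with k successes; p**k * (1-p)**(n-k)
    # is the probability of any single such ordering.
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

print(binomial_pmf(8, 10, 0.5))  # 45/1024
```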

OK, yes I see.

Certainly, this makes a lot of sense.

OK.

This formula seems good.

This all, is the one-head case.

OK, yes. Though notice that there is nothing in this formula that is trying to represent that, you know, the first 9 failures come specifically BEFORE the success.
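
That point is worth making concrete: the product below is for the single ordering F F F F F F F F F S, and since the factors commute, nothing in it encodes that the failures come first (p = 0.5 is an assumed value):

```python
p = 0.5
q = 1 - p
# One specific ordering, so no binomial coefficient is needed.
print(q**9 * p)  # 1/1024 for p = 0.5
```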

Not sure about this.

So, everything here except the final 'p', is - is it not? - just instantiating the Binomial formula - k successes in n trials.

So, this is the formula for combinations/committees.

This is surely a typing error

Certainly doesn't look too hard to do these simulations.

Because these are, so to speak, self-sufficient experiments.

Can't fully recall the AND rule. Is it just adding probabilities? (For independent events it is multiplying, I believe.)

OK.

Bertsekas page 38.

So, this echoes the talk about 'partitioning' in Bertsekas.

Yes, recall this.

One can wrap oneself into knots, but really it is not so difficult.

OK fine. Obviously, a very similar example is examined in Bertsekas.

Important philosophical stuff.

So, I am not presently "up" on series...

OK, interpreting this way is fine.

In the circumstances given by the conditions.

OK, so this is just a re-writing of the 'binomial coefficient', where you have n! / (k! * (n-k)!). See page 61 above.
Note that the denominator in this term and the "main term" have swapped. In other words, where, for the p^k term, substitution would have given you lambda^k / n^k, we have put that n^k denominator under the main term, and put the k! under this term.

Ah, we hit a wall of my lack of knowledge of series. So basically, this is going to mean that I don't understand the DERIVATION, or rather, the equivalence of the Poisson to the Bernoulli.
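
Even without the series derivation, one can at least sanity-check the claimed equivalence numerically: hold lambda = n*p fixed and let n grow, and the binomial pmf approaches the Poisson pmf (lambda = 3, k = 2 are assumed values):

```python
import math

lam, k = 3.0, 2
poisson = math.exp(-lam) * lam**k / math.factorial(k)
for n in (10, 100, 10_000):
    p = lam / n
    binom = math.comb(n, k) * p**k * (1 - p)**(n - k)
    print(n, binom)           # creeps toward the Poisson value as n grows
print("poisson:", poisson)
```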

OK, so we see why there is this need to be able to alternatively use the Poisson.

So, note that here we HAVE just written out the Poisson formula with each of the 4 cases and added them. It is simply that we factored out the common factor of e^-30.

Average number of successes.

OK fine.

Obviously, just plugging in.

Yeah fine - a copy of the one on the previous page.

Clever setup.

Very interesting.
Mmm, I see.

Well, note this point.

So, this is the weighting part.

I think the fraction here comes from having created a common denominator.

So: formerly, 'k'.

OK yes. We recall this from the previous chapter.

See page 62, for where the idea of the 'geometric distribution' is first introduced. Basically, you are finding the probability of the bunch of unsuccessful trials (and it is construed as all the trials up to the trial before you succeed).

Focus on what it is we're talking about. It is the VALUE OF SOME PROPERTY (called the 'random variable'). The 'expected value' is the value this property IS DISCOVERED TO HAVE when we've applied the appropriate probability formula.

Pretty simple.

So, if I recall properly, this is the average value.

I don't follow this proof. (And this is laughably unfriendly).

Again, it just is not really very clear to me what is going on here.

I.e. 'n' trials, and our prob of occurrence is 'p'.

Indeed, obviously the expansion of the sum

The number of trials.

That is to say, we want the expected values of these.

So, this is the value that can be taken on, multiplied by the probability of occurrence of that value.

Must say that the presentation has been too abstract/sketchy for me to follow properly, really.

Yes, certainly this is intuitive.

Fairly helpful summary discussion.

Our "random variable".

OK fine. Though, obviously, don't feel I understand more.

Is this supposed to tell us that the probability would be the same for any?

Bernoulli trials introduced page 61. Basically, when there are two outcomes.

The 'random variable'.

The value we expect 'X' to have.

Seems we choose '1' for F (failure) since we're interested in 'number of tries before success'. So our 'hits' are 'F's.

So, it doesn't count 'F's, but rather, the number of trials it takes to get an S. ...see

So, this is the success...

I.e. their probability, times the number of these, i.e. the number of events having this probability.

So here, we have recognized the series, and given its formula. See page 47.

So this (for example) is the probability of getting two failures, I think (i.e. the first two as failures).

Mmm yes. I see this reasonableness.

Yes, OK, note that point.

Is there, you know, something I should be aware of about this?

So, this is giving the 'expected value' (and drawing upon the result from the discussion just above, to get it).

1/p. Then: 1/(10/24), which would be 1/1 * 24/10, etc. And we apply this because it is a case of "expected number of Bernoulli trials till success", as covered by (4) above.
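
The 1/p result is easy to confirm by simulation with the note's p = 10/24:

```python
import random

random.seed(0)
p = 10 / 24  # success probability from the note

def trials_until_success():
    n = 1
    while random.random() >= p:  # each draw succeeds with probability p
        n += 1
    return n

runs = 200_000
avg = sum(trials_until_success() for _ in range(runs)) / runs
print(avg)  # lands near 1/p = 24/10 = 2.4
```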

OK, this is pretty informative.

So, we are focusing on the black balls, since we want to find the number of balls before a white. (That is our chosen random variable X).

The success.

To see this, just focus on the fact that, here, we are basically treating every B as totally distinct - such that indeed it would be confusing to call them all 'B'. For we are here treating the prob of a particular B.

The value of the card, times its probability of occurring.

This is pretty much fine.

Yup, OK. Just remember what these numbers are - an average value of the number of reds from each urn.

Because P(heads) = 0.4

OK.

Yep fine. Refer back to the previous example if it stops making sense.

Again, don't really follow. Not really too sure what is going on.

Fractions give probabilities. E.g. 3 ways of getting 4.

So, this is 1 plus 36/9, i.e. 45/9, i.e. 5.

The number of rolls, GIVEN that it is one of these results?

I.e. this "player's point" business.

Because there are 6 ways of getting a 7.

OK, makes sense.

Expectation of the number of rolls, GIVEN first roll is 4.

So, to solidify this, the point is that, if a 4 comes up, then you are going to keep rolling till either a 4 or a 7 turns up. So, naturally, the further number of rolls is going to be determined by the probabilities of those two coming up. Of course, we have also to factor in the probability of getting a result on the first throw.

So, note how it indeed matches the form of the 'theorem of total expectation', above.
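
The "keep rolling until a 4 or 7" reasoning simulates nicely. Given the point is 4, each later roll ends things with probability P(4 or 7) = 9/36 = 1/4, so the expected total is 1 + 4 = 5 rolls:

```python
import random

random.seed(1)

def roll():
    return random.randint(1, 6) + random.randint(1, 6)

def rolls_given_point_4():
    n = 1                      # the first roll, which showed the 4
    while True:
        n += 1
        if roll() in (4, 7):   # a 4 or 7 ends the sequence
            return n

runs = 100_000
avg = sum(rolls_given_point_4() for _ in range(runs)) / runs
print(avg)  # near 1 + 36/9 = 5
```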

I don't know how we would solve this.

Other name for expected value.

OK.

I seem to get pretty much nothing in this review.

OK, so note that that is what f(x) signifies - a function yielding the density at a point 'x'.

Pretty understandable so far, surely.

Producing probabilities, as a function of points.

This seems a usefully picturesque analogy.

Again, what a lovely image!

Note this detail.

Yes, I see. Integrating over the relevant limits of integration (corresponding to the interval).
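
Integrating the density over an interval can be sketched numerically. The density below, f(x) = 2x on [0, 1], is a hypothetical example chosen because its exact integral is easy to check by hand:

```python
def f(x):
    return 2 * x  # hypothetical density on [0, 1]; integrates to 1 there

def prob(a, b, n=100_000):
    # midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

print(prob(0.2, 0.5))  # exact value is 0.5**2 - 0.2**2 = 0.21
```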

All seems OK.

Comfortable with this.

One can appreciate how they 'work'.

Happy with this integration.

Not too sure about this business.

The probabilities over these various intervals. Feel I understand the idea.

Makes sense.

...because the EXACT points 'a' and 'b' have zero prob.

Quite interesting.

Seems an important point

So, this is getting you the probability, not the prob density.

Picturing the height as giving the density, seems like a useful analogy.

Probability per unit length.

Pretty comprehensible, it would seem.

Very comprehensible. So, just bear in mind that the PDF here is giving CUMULATIVE probs.

Not clear what is gained by this.

Portion of distance covered, divided by total distance. Remember that what we are giving is a breakdown of f(x), which is probability density, or probability per unit length.

Makes sense.

Useful visualization.

i.e. dollars, winnings.

All seems fine.

Yes, fine.

Very comprehensible

A little bit of a problem properly interpreting.

Definitely need a diagram to properly interpret.

Right, yes, fine.

The nub of it.

Clear.

OK note this.

The height after, and before, 'the chunk of prob is fed in'.

So, you slice off that bit more probability (equal to the prob fed in at the jump), if you get the cumulative prob greater-than OR EQUAL TO the value at which there is a jump.
I.e. here, whether less-than or less-than-or-equal makes no difference, since there is no jump. And the function at this point churns out 1/8.

Yeah, fine.

Note the diff.

Note this distinction.

OK, yes, so this gives the relation

Seems fine.

Not totally sure why the integral works out as that.

The cumulative probability up to the point 0 (rather than the cumulative prob up to the point x). And yes, the function supplying this prob ends its realm of application at 0.

Not totally sure how the integral works out as that. But still, one gets the idea of what happens in this case, where you add the contribution of the two continuous distributions.

OK.

I understand that there are no jumps, but don't feel I 100% get the notation.

Note.

I.e. the derivative.

So basically, you differentiate each sub-component of F(x).

So, with this differentiation, you are using the power rule. Which means you would have 2 * 1/8, which is going to give you 1/4.

Useful for clarity, clearly.

Note this distinction, for understanding.

All usefully clarifying.

Our 'random variable'; our property of interest.

So, does this mean that it is going to give the distribution of probabilities of waiting times. E.g. a certain probability that the waiting time will be.....

I THINK I get the sense of this.

Can't recall the exact differentiation rule being used here. Could look it up...

Do not really follow where this comes from. So the derivation as a whole is not very informative to me.

Note the result, anyway.

So, this example at least tells us what is going on; concretizes. You can find a numerical probability, for a particular value of a random variable.

So, not totally confident how to carry out this integration.


I'm not totally confident how you get rid of the minus.
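The minus disappears when you evaluate the antiderivative at the limits: the antiderivative of lam*e^(-lam*x) is -e^(-lam*x), and "value at b minus value at a" becomes e^(-lam*a) - e^(-lam*b), i.e. the sign swap just reverses the order of the two ends. A numeric check (the rate 2 and interval [0, 1] are assumed values):

```python
import math

lam, a, b = 2.0, 0.0, 1.0

# Antiderivative -e^(-lam*x) evaluated from a to b:
closed = math.exp(-lam * a) - math.exp(-lam * b)

# Midpoint-rule integral of the density itself, for comparison:
n = 200_000
h = (b - a) / n
numeric = sum(lam * math.exp(-lam * (a + (i + 0.5) * h)) for i in range(n)) * h
print(closed, numeric)
```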

So, you know, the exponential is the function that models the probability distribution for this sort of situation.

So, this expresses the point about independence.

So, the parameter '2' has been put into the function.

Right, so, again, this is just the point about independence.

Have not followed this.

Note the distinction.

Yes I think I can see that the gamma collapses to the exponential if the shape parameter is set to 1.

Have not followed this derivation.

The graphs help remind what the density is all about. The probability of values occurring in those regions.

Think I understand pretty well what is going on here.

So, half of the probability has accumulated at 0 - what you would obviously expect.
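
The "half the probability at 0" point is easy to verify with the standard normal CDF, which the stdlib lets you write via the error function:

```python
import math

def phi(x):
    # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

print(phi(0.0))  # 0.5: half the probability sits below the mean
```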

Have not followed this proof.

I get this, although it is not easy to keep the point in mind.

OK, so (3) works with actual values.

OK yes, note that that is equivalent to the previous.

The location of the mean

So, you know, I follow all the steps, but I am not really sure of, like, what it is exactly that we're doing here.

Yes, fine.

Sort of, a bit unsure what is being said.

Very much kind of losing track of exactly what is being said.

number of successes.

So, we'd have to remember how to actually calculate this.

You know, I am sort of following.

OK. I vaguely recall that this is the method.

I think this is from looking up 6/9 in the table...

Again, a sort of iffy understanding.

You might also like