# Probability/Statistics (really lost)

1. Oct 18, 2013

### anonymousk

(I'm not a native English speaker, and this assignment wasn't originally in English, so I had to translate it. That explains the grammar errors, but hopefully it's understandable.)
-----------------------------------
For an investment in a share, the following probabilities are generally assessed for the exact number of days in a row with positive returns:

1 day with a positive return, negative return on day 2: 26.7%

2 consecutive days with positive returns, negative return on day 3: 13.1%

3 consecutive days with positive returns, negative return on day 4: 5.3%

4 consecutive days with positive returns, negative return on day 5: 1.4%

5 consecutive days with positive returns, negative return on day 6: 0.7%

6 or more consecutive days with positive returns: 0.0%

For example, there appears to be a 26.7% probability that the share achieves a positive return on any given workday and then a negative return on the following day. Similarly, there appears to be a 13.1% probability that the share achieves a positive return on any given workday, a positive return again on the following day, and a negative return on the third day.

a) Using the table above, calculate the probability of at most n days in a row with positive daily returns when investing in the share. The calculation must be made for n = 0, 1, 2, 3, 4, 5.

b) Calculate the mean, variance and standard deviation of the exact number of days in a row with a positive return on investment in its shares.

c) Calculate the probability to achieve a positive return n or more days in a row by investing in the share. The calculation must be made for n= 0, 1, 2, 3, 4, 5.

d) Calculate, with previously acquired results from c) the probability for a positive return on any given workday in the investment of shares GIVEN/granted that the share has had positive returns the previous n days. The calculation must be made for n= 1, 2, 3, 4.

E(X)=Σ P(Xi)*Xi
Var(X)=Σ P(Xi)*(Xi-E(X))^2

I've been sitting with this assignment for hours, but just feel completely lost and have given up. To me it seems like they are, more or less, asking for the same thing in a) c) AND d), so yeah.. guess you can say probability isn't my strong side.

This is what I've done so far(though probably wrong):

P(0)=1-(P(1)+P(2)+P(3)+P(4)+P(5))
P(0)=1-(0,267+0,131+0,053+0,014+0,007)
P(0)=1-0,472
P(0)=0,528
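
As a quick check of the arithmetic above, here is a minimal Python sketch. It assumes, as the post does, that the probability mass missing from the table belongs to the case of zero positive days; the name `table_pct` is introduced here for illustration only.

```python
# Probabilities from the table, in percent: exact run length -> probability.
# (Runs of 6 or more days have probability 0.0% and are left out.)
table_pct = {1: 26.7, 2: 13.1, 3: 5.3, 4: 1.4, 5: 0.7}

# Assumption: whatever the table does not account for is P(N = 0),
# i.e. the first day already shows a negative return.
p0_pct = 100.0 - sum(table_pct.values())

print(round(p0_pct, 1))  # 52.8
```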

n(0)=0,528
n(1)=1-P(0) =1-0,528=0,472
n(2)=1-(P(0)+P(1)) =1-0,795=0,205
n(3)=1-(P(0)+P(1)+P(2)) =1-0,926=0,074
n(4)=1-(P(0)+P(1)+P(2)+P(3)) =1-0,979=0,021
n(5)=1-(P(0)+P(1)+P(2)+P(3)+P(4))=1-0,993=0,007

As I said before, I have no idea if what I've done can be used or even what question I've answered (or have not answered)

b)
E(X)=0,528*0+0,267*1+0,131*2+0,053*3+0,014*4+0,007*5+0*6=0,779

Var(X)=0,528*(0-0,779)^2+0,267*(1-0,779)^2+0,131*(2-0,779)^2+0,053*(3-0,779)^2+0,014*(4-0,779)^2+0,007*(5-0,779)^2+0,0*(6-0,779)^2=1,060159

I'm not used to forums really, so I hope I've done this right. If not, just tell me, and I'll fix it.

Last edited by a moderator: Oct 18, 2013
2. Oct 18, 2013

### Ray Vickson

Are there only two "return" values (P and N)? That is, are all the positive returns the same, and all the negative returns the same? Furthermore, how is the multi-day return calculated: is it added (to get
$$\text{3-day return} = P_1 + P_2 - N_3$$
for example) or is it obtained through multiplication, as in
$$\text{3-day return} = (1+P_1/100)(1+P_2/100)(1-N_3/100)-1?$$
For small P_i/100 and N_i/100 there is not much difference between the two, but the second form corresponds more realistically to how it is done in financial institutions.
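
To see how close the two conventions are for small daily moves, here is a minimal Python sketch. The daily figures are hypothetical (they are not from the assignment), and both helper functions are illustrative names. Unlike the formula above, the compounded result is scaled back to percent so the two numbers can be compared directly.

```python
def additive_return(daily_pct):
    """Sum the signed daily returns (in percent)."""
    return sum(daily_pct)


def compounded_return(daily_pct):
    """Multiply the daily growth factors, then convert back to percent."""
    growth = 1.0
    for pct in daily_pct:
        growth *= 1.0 + pct / 100.0
    return (growth - 1.0) * 100.0


# Hypothetical three-day sequence: two gains followed by one loss.
daily = [0.5, 0.8, -0.6]

print(additive_return(daily))    # roughly 0.7
print(compounded_return(daily))  # roughly 0.696
```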

If all the P_i and N_i are different I don't think you have enough information to estimate the one-day return distribution with any kind of confidence.

Finally, the behavior of the returns argues against any type of simple probability model, because positive runs are consistently longer than negative runs in each P-N run pair, and that suggests that successive returns are far from random or independent. I suspect they are not even Markovian.

3. Oct 19, 2013

### anonymousk

Thanks for the reply, but I'm still as lost as before.

4. Oct 19, 2013

### D H

Staff Emeritus
These are very different questions.

(a) asks for the probability of N or fewer work days of consecutive gains, followed by a loss.
Hint: Note that the probabilities do not add to 100%. Why not?

(c) asks for the probability of N or more work days of consecutive gains (day N+1 may or may not be a loss).
Hint: The probability of 0 or more work days of consecutive gains is 100%. Why?

(d) asks for the probability of a gain on some work day given a gain on the N previous work days.
Hint: You'll need the formula for conditional probability to answer this question.

5. Oct 19, 2013

### anonymousk

(a): The probabilities do not add to 100% because we do not know P(0)? So I'm thinking I have to find P(0) by doing 1-(P(1)+P(2)+...+P(5)), but don't know where to go from there to find the probability for max/highest n days in a row with positive daily returns.

(c): Again, same with (a), if P(0) is included, the total probability is 100%?

(d): I have used the conditional probability formula before, but there seem to be more than 2 events here. I don't even know where to start putting in numbers.

EDIT: Okay, so I think I figured out (a)

a)

N(0)= 100-47,2=52,8 %
N(1)= 100-20,5=79,5 %
N(2)=100-7,4=92,6 %
N(3)= 100-2,1=97,9 %
N(4)=100-0,7=99,3 %
N(5)= 100-0=100 %

So now I've found via calculations that the number of max n days with a positive return is 5, correct? Now to c and d, hehe!
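
The running totals for (a) can be reproduced with a short Python sketch. It assumes P(N=0) = 52.8%, the value derived earlier in the thread rather than given in the table; `pmf_pct` and `cdf_pct` are illustrative names.

```python
from itertools import accumulate

# P(N = n) in percent for n = 0..5; the n = 0 entry is the derived 52.8%.
pmf_pct = [52.8, 26.7, 13.1, 5.3, 1.4, 0.7]

# P(N <= n) is the running sum of the exact-run-length probabilities.
# Rounding only removes floating-point noise in the last digits.
cdf_pct = [round(total, 1) for total in accumulate(pmf_pct)]

for n, value in enumerate(cdf_pct):
    print(f"P(N <= {n}) = {value}%")
```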

Last edited: Oct 19, 2013
6. Oct 19, 2013

### D H

Staff Emeritus
Correct. Question (a) asked for P(N≤n). Question (c) is the other way around. It's asking for the probability of a positive return on n or more days. In other words, it's asking for P(N≥n). You already have the answer to question (c) in your answer to question (a).

7. Oct 19, 2013

### anonymousk

Should it be N(0), N(1) like I wrote it before, or is the correct way to write it P(0) instead of N(0), etc? Just to be sure.

so it should be
P(0)= 100-47,2=52,8 %
P(1)= 100-20,5=79,5 %
P(2)=100-7,4=92,6 %
P(3)= 100-2,1=97,9 %
P(4)=100-0,7=99,3 %
P(5)= 100-0=100 %

And can you quickly clarify the difference in N and n? I'm having a hard time wrapping my head around this/probability.

Edit:

c)

p(0)=52,8
p(1)=100-P(0) =100-52,8=47,2
p(2)=100-(P(0)+P(1)) =100-79,5=20,5
p(3)=100-(P(0)+P(1)+P(2)) =100-92,6=7,4
p(4)=100-(P(0)+P(1)+P(2)+P(3)) =100-97,9=2,1
p(5)=100-(P(0)+P(1)+P(2)+P(3)+P(4))=100-99,3=0,7

So this is the answer for c)? (And again, should it be P(0) etc.?)

And thanks by the way! Your help is very much appreciated! :)

8. Oct 19, 2013

### D H

Staff Emeritus
That's a very standard notation. For example, P(X≤x) is a function of some variable x. What this shorthand means, in words, is "for a given value of x, what is the probability that a value drawn from the random variable X will have a value less than or equal to x."

You'll see P(X≤x) a lot. It comes up so often that it has been given a name. P(X≤x) is the cumulative distribution function for the random variable X, or CDF for short. What question (a) was asking for was the CDF for this particular probability distribution.

Close, but no cigar. Think about what the question is asking for.

Hint: You will *always* (i.e., probability = 100%) have 0 or more days without a loss.

9. Oct 19, 2013

### anonymousk

so the correct way to write it, was it:
N(0)= 100-47,2=52,8 %
N(1)= 100-20,5=79,5 %
N(2)=100-7,4=92,6 %
N(3)= 100-2,1=97,9 %
N(4)=100-0,7=99,3 %
N(5)= 100-0=100 %

or with P(0)=100-47,2=52,8 % etc?
Just want to properly write it down.

giving c) another shot:

?(0)=100
?(1)=P(0)-P(1) =100-47,2=52,8
?(2)=P(0)-(P(1)+P(2)) =100-47,2-20,5=32,3
?(3)=P(0)-(P(1)+P(2)+P(3)) =100-47,2-20,5-7,4=24,9
?(4)=P(0)-(P(1)+P(2)+P(3)+P(4)) =100-47,2-20,5-7,4-2,1=22,8
?(5)=P(0)-(P(1)+P(2)+P(3)+P(4)+P(5))=100-47,2-20,5-7,4-2,1-0,7=22,1

hmm, that's definitely wrong.

EDIT: I gave it another shot, but if this is correct, shouldn't N(6), n(6) or P(6) (I really haven't gotten the hang of those yet) give 0, and not the 5th day like below?

?(0)=100-52,8=47,2
?(1)=100-79,5=20,5
?(2)=100-92,6=7,4
?(3)=100-97,9=2,1
?(4)=100-99,3=0,7
?(5)=100-100=0

(And again, I put the ? in there because I'm unsure of what letter to use. Is it the probability that I find now? So should it be P(X)?)

Last edited: Oct 19, 2013
10. Oct 19, 2013

### D H

Staff Emeritus
You were correct when you had a probability of zero or more days of consecutive gains as 100%. Think about it: How could this probability be anything *but* 100%? Zero or more days is the full sample space.

Look at n=1, the probability that you have one or more days of positive gains. One way to look at this is that you have exactly one day of positive gains (followed by a loss), or exactly two days of positive gains (followed by a loss), or ...

Another way to look at it is that you do *not* have n-1 or fewer days of consecutive gains. In other words, it's the opposite of question (a), but offset by one.

Yet another way to look at it is to build from large values of n on down. What's the probability you will have six or more days of consecutive gains? The probability you will have five or more days of consecutive gains is the probability you will have exactly five days of consecutive gains plus the probability you will have more than five days of consecutive gains. Note that "six or more days" is the same as "more than five days"; this is a discrete probability distribution. Now you should be able to answer the question for four or more days, then three or more, etc.
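
The three views above describe the same quantity. Written out, with N denoting the number of consecutive days of gains (notation assumed to match the rest of the thread):

$$P(N \ge n) = \sum_{k \ge n} P(N = k) = 1 - P(N \le n-1) = P(N = n) + P(N \ge n+1)$$

The first equality is the "exactly one day, or exactly two days, or ..." view, the second is the complement of question (a) offset by one, and the third is the step used when building down from large n.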

11. Oct 19, 2013

### anonymousk

P(N>5)=0
P(N>4)=0+0,7=0,7
P(N>3)=0,7+1,4=2,1
P(N>2)=2,1+5,3=7,4
P(N>1)=7,4+13,1=20,5
P(N>0)=100

Very skeptical about this. We are currently 2 guys sitting here trying to figure this out. We feel like we're getting close, but it seems we need an extra hand. (not that we haven't already been given plenty of hands from you already ;P )

EDIT:

Another try:
P(N≥5)=0+0,7=0,7
P(N≥4)=0,7+0,7=1,4
P(N≥3)=2,1+1,4=3,5
P(N≥2)=7,4+3,5=10,9
P(N≥1)=20,5+10,9=31,4
P(N≥0)=100

Last edited: Oct 19, 2013
12. Oct 19, 2013

### D H

Staff Emeritus
Neither one is correct.

Starting from n=6, P(N≥6)=0.

The next step down is n=5. You can have a run of five straight days or longer with positive gains if you have a run of more than five straight days with positive gains (probability=0) or if you have a run of five days with positive gains followed by a down day (probability = 0.7%). Thus P(N≥5)=0.7%.

A side question here: You can't always add probabilities. It's valid in this case. Why is that? What is the condition that lets me use P(N≥5) = P(N=5) + P(N>5)?

Another side question: I implicitly used P(N>5) = P(N≥6) in the above. What let me do that?

The next step down is n=4. You can have a run of four straight days or longer with positive gains if you have a run of more than four straight days with positive gains (probability=0.7%) or if you have a run of four straight days with positive gains followed by a down day (probability = 1.4%). Thus P(N≥4)=2.1%.

You should be able to take it from here.
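
The step-down procedure can be sketched in Python as a backward loop. This is only an illustration of the recursion described above; `exact_pct` and `at_least_pct` are names introduced here, and the n=0 entry is set to 100% on the argument that zero or more days covers the full sample space.

```python
# Exact run-length probabilities from the table, in percent.
exact_pct = {1: 26.7, 2: 13.1, 3: 5.3, 4: 1.4, 5: 0.7}

# Start from the top: runs of six or more days have probability zero.
at_least_pct = {6: 0.0}

# Step down: P(N >= n) = P(N = n) + P(N >= n + 1).
for n in range(5, 0, -1):
    at_least_pct[n] = round(exact_pct[n] + at_least_pct[n + 1], 1)

# Zero or more days of gains is the whole sample space.
at_least_pct[0] = 100.0

for n in range(5, -1, -1):
    print(f"P(N >= {n}) = {at_least_pct[n]}%")
```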

13. Oct 19, 2013

### anonymousk

ooooooh!!

P(N≥5)=0.7%
P(N≥4)=0,7+1,4=2,1%
P(N≥3)=2,1+5,3=7,4%
P(N≥2)=7,4+13,1=20,5%
P(N≥1)=20,5+26,7=47,2%
P(N≥0)=100%

This should be right now. Thanks a lot!

To your first side question: I'm guessing you can add the probabilities in this case because they are dependent on each other. As in, day 1 has to happen before day 2 can?

And second question: Reason you can set P(N>5) = P(N≥6) is we know that all days with a positive return above 5 have a 0 probability of happening, therefore you can set them equal. Or am I grasping at straws here?

Been looking at d) and the conditional probability formula, but have no idea what numbers/probabilities to use, since we have more than 2 events/days.

EDIT:

After taking a closer look at c), I managed to calculate the mean and variance (As in, I think I've used the correct numbers.)

E(X)=0,528*0+0,267*1+0,131*2+0,053*3+0,014*4+0,007*5+0*6=0,779

Var(X)=0,528*(0-0,779)^2+0,267*(1-0,779)^2+0,131*(2-0,779)^2+0,053*(3-0,779)^2+0,014*(4-0,779)^2+0,007*(5-0,779)^2+0,0*(6-0,779)^2=1,060159

And the standard deviation should be the square root of the variance, i.e. the square root of 1,060159.
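
These three numbers can be double-checked with a minimal Python sketch. It uses the same probabilities as the calculation above, including the derived P(0) = 0,528; the variable names are illustrative.

```python
import math

# P(N = n) as fractions of 1; the n = 0 entry is derived, not tabulated.
pmf = {0: 0.528, 1: 0.267, 2: 0.131, 3: 0.053, 4: 0.014, 5: 0.007}

# Mean: probability-weighted sum of the run lengths.
mean = sum(n * p for n, p in pmf.items())

# Variance: probability-weighted squared deviation from the mean.
variance = sum(p * (n - mean) ** 2 for n, p in pmf.items())

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 3), round(variance, 6), round(std_dev, 4))
```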

However, I'm still stuck on d)

Last edited: Oct 19, 2013
14. Oct 19, 2013

### D H

Staff Emeritus
Correct on (c), and my first side question. More generally, you can add probabilities of two events if and only if the two events are mutually exclusive. Another way to look at it is that the events represent disjoint, or non-intersecting, sets. Probability and statistics are difficult topics if you don't understand basic set theory. It helps a lot to know at least a bit about set theory.

So why can you equate P(N>5) with P(N≥6)? They're the same set!
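
Both side questions can be made concrete by writing the events as Python sets of run lengths. This is only a sketch: the cap of 10 on the run length is a hypothetical choice made to keep the sets finite.

```python
# Hypothetical finite range of run lengths, for illustration only.
run_lengths = range(0, 11)

more_than_5 = {n for n in run_lengths if n > 5}
at_least_6 = {n for n in run_lengths if n >= 6}
exactly_5 = {5}
at_least_5 = {n for n in run_lengths if n >= 5}

# N > 5 and N >= 6 pick out the same whole-number outcomes.
print(more_than_5 == at_least_6)  # True

# "Exactly 5" and "more than 5" share no outcome (mutually exclusive),
# and together they make up "5 or more", so their probabilities add.
print(exactly_5.isdisjoint(more_than_5))        # True
print(exactly_5 | more_than_5 == at_least_5)    # True
```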

With regard to (d), whenever you see the word GIVEN you should think conditional probability or Bayes' theorem. You probably haven't been taught the latter yet, so you should think conditional probability. The probability of some event B given that some event A has occurred is the probability of the intersection of the two events divided by the probability of event A: $P(B|A) = \frac{P(A\cap B)}{P(A)}$.
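
To see the formula at work on something unrelated to the assignment, here is a minimal Python sketch using a fair six-sided die. The helpers `prob` and `conditional` are illustrative names, not from any library.

```python
from fractions import Fraction


def prob(event, pmf):
    """Probability of an event given as a set of outcomes."""
    return sum(pmf[outcome] for outcome in event)


def conditional(b, a, pmf):
    """P(B | A) = P(A and B) / P(A); assumes P(A) > 0."""
    return prob(a & b, pmf) / prob(a, pmf)


# Fair die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

even = {2, 4, 6}  # event B
high = {4, 5, 6}  # event A

# Of the three high faces, two (4 and 6) are even.
print(conditional(even, high, die))  # 2/3
```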

Question (d) asks you to compute the probability of a positive return given that you had positive returns on the previous n days. You were also given a hint to use the results from question (c) to answer this question.

What are the events A and B that will help attack this problem, and what do they have to do with the results from question (c)?

15. Oct 19, 2013

### anonymousk

aah.. I'm thinking P(A∩B) is the added values of the numbers I've been told to calculate from c), so:

47,2+20,5+7,4+2,1 = 77,2
P(A∩B∩C∩D) = P(n(1)∩n(2)∩n(3)∩n(4))=77,2.

$P(B|A) = \frac{77,2}{P(A)}$.

And P(A) is in this case the probability for the 5th day? Hmpf, I've probably gone completely off track again. Having a hard time understanding this question as well.

16. Oct 19, 2013

### D H

Staff Emeritus
You are asked to solve that conditional probability for n=1, 2, 3, and 4. Those are four different probabilities. You'll have to compute each one separately.

Question #1: For some specific value of n, what does "GIVEN/granted that the share has had positive returns the previous n days" mean?

This is what I labeled as event "A" in my previous post. In particular, what does it mean with regard to your answer to question (c), and given that, what is the probability of event A? Note: This will depend on the value of n.

Question #2: What is your event B?

You are looking to calculate the probability of at least one more day with a positive gain given that you have already had n successive days of positive gains. In particular, what does this mean with regard to your answer to question (c)?

Question #3: What is the intersection between events A and B?

This is a bit of a trick question. The answer is easy once you think about it. A Venn diagram may be of help.

Question #4: What is "the probability for a positive return on any given workday in the investment of shares GIVEN/granted that the share has had positive returns the previous n days"?

This should be easy if you have answered the above questions. Note that this is the question you are supposed to answer. Notice how it helps to break the question down into little parts.

17. Oct 19, 2013

### anonymousk

I would normally call myself above average in math, but I'm just having a really hard time wrapping my head around this.

#1: I'm thinking event A in this case (when the value of n=1) is P(N=1)=26,7, the number we are given to start with.
#2: and event B is from the answer in c), P(N≥1)=47,2
#3: intersection would be the chances of both happening at the same time, right? hmm.

18. Oct 19, 2013

### D H

Staff Emeritus
That is absolutely the wrong event. The probability that the next day will realize a positive gain is identically zero given the event that you cited. Remember that those events (the ones with probabilities of 26.7%, 13.1%, 5.3%, etc.) were defined as being n successive days of positive gains followed by a day of negative returns.

This is not event B. This is event A for n=1. Event A for n=1 is the event that "the share has had positive returns the previous 1 days". It might continue to show positive returns. You, the investor, want it to continue to show positive returns. Event A for some value n is the random variable N having a value greater than or equal to n, or N≥n.

So given that this is event A, what is event B? Hint: The answer to this question also lies in the answer to question (c).

Last edited: Oct 19, 2013
19. Oct 19, 2013

### anonymousk

Event B is the probability for the following day, P(N≥2)=20,5, then, I believe.
Still lost on the intersection though.

EDIT: It's 4am here, and brain is barely functioning anymore, so if I stop replying, it's because I've gone to sleep, but thanks again for all your help (and patience).

2nd EDIT: Hmm wait, could the intersection be 47,2-20,5?

Last edited: Oct 19, 2013
20. Oct 19, 2013

### D H

Staff Emeritus
Correct on event B.

How to visualize the intersection? On the one hand, event A represents one or more days of consecutive positive gains, and on the other, event B represents two or more days of consecutive positive gains. Think about the way you answered question (c). There's a definite relationship between this event A and event B. Draw a Venn diagram. They help some people a lot. Venn diagrams can be of considerable aid if you are one of those who think visually (right brain dominant).
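
Once the relationship is spotted, the computation for (d) can be sketched in Python. This assumes the reading discussed above: event A is N≥n, event B is N≥n+1, and every outcome in B is also in A, so the intersection is B itself. The name `at_least_pct` is illustrative and holds the results from question (c).

```python
# Results from question (c), in percent: P(N >= n).
at_least_pct = {1: 47.2, 2: 20.5, 3: 7.4, 4: 2.1, 5: 0.7}

# Since N >= n + 1 implies N >= n, P(A and B) = P(B), so
# P(B | A) = P(N >= n + 1) / P(N >= n).
conditional = {
    n: at_least_pct[n + 1] / at_least_pct[n]
    for n in range(1, 5)
}

for n, value in conditional.items():
    print(f"n = {n}: P(gain | {n} previous gains) = {value:.1%}")
```

Under these assumptions the four values come out at roughly 43%, 36%, 28% and 33%.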