Register to reply

Probability of 0 bit in ASCII text files

by Cylab
Tags: ascii, files, probability, text
Share this thread:
Cylab
#19
Oct9-12, 06:38 AM
P: 54
Still confused, say, why getting a 0 eliminates 7 of the total weight of 16 etc?.(saying it in simple: we have Pr(MSB=0)=1/8, plus Pr(0)=Pr(1)=1/2, then draw a bit, (as whole) Pr(0)=?, Pr(MSB=0)=?)
Anyway, it seems to follow the rule, where i equals number of 0 drawn sequenced/continual, say i=0,1,2.
1/8 ((i+1)/2^i +(7-i)/2^(i+1))=(i+9)/2^(i+4) .
Agree?
Cylab
#20
Oct9-12, 06:41 AM
P: 54
Quote Quote by haruspex View Post
A 0 from MSB gets 2 units while each of the other 14 get only 1 unit each.
what do you mean by unit now? is it soooooooo hard to explain in plain words?
haruspex
#21
Oct9-12, 03:50 PM
Homework
Sci Advisor
HW Helper
Thanks
P: 9,877
Let's try it using the standard rules of conditional probability.
P[bit was MSB | bit was 0] * P[ bit was 0] = P[bit was MSB & bit was 0] = P[bit was MSB] = 1/8.
P[ bit was 0] = 9/16 (you already proved)
So P[bit was MSB | bit was 0] = (16/9)*(1/8) = 2/9
Cylab
#22
Oct10-12, 03:19 AM
P: 54
Quote Quote by haruspex View Post
So P[bit was MSB | bit was 0] = (16/9)*(1/8) = 2/9
Thanks for being patient!!! I understood now. Thanks again.

So you agree that the Prob of drawing a 0 bit in ASCII is Pr(0)=9/16 (given each bit is randomly distributed except MSB)? Then, I still wonder about Pr[0|0]=? (I try to say that Prob of drawing a 2nd 0, given 1st was 0 too). Because I was told 9/16*9/16 comes close but not correct.
haruspex
#23
Oct10-12, 12:41 PM
Homework
Sci Advisor
HW Helper
Thanks
P: 9,877
Quote Quote by Cylab View Post
I still wonder about Pr[0|0]=? (I try to say that Prob of drawing a 2nd 0, given 1st was 0 too). Because I was told 9/16*9/16 comes close but not correct.
The prob of drawing a second 0, given the first was a 0, wouldn't be close to 9/16*9/16. It would be close to 9/16. 9/16*9/16 would be close to the probability for drawing two zeroes, with no prior information. So why isn't 9/16 exactly right? Because the first 0 was not replaced, which might mean there is a smaller number of MSBs available to choose. To figure out how that affects things, you need to calculate the probability that the first was an MSB. We have now done that: 2/9. For the next step see post #12.
Cylab
#24
Oct10-12, 11:02 PM
P: 54
I was told it is 10/32 (drawing two zeroes with prior info), and it follows the rule of 1/8 ((i+1)/2^i +(7-i)/2^(i+1) ). Just can`t figure out.
haruspex
#25
Oct11-12, 02:45 AM
Homework
Sci Advisor
HW Helper
Thanks
P: 9,877
Quote Quote by Cylab View Post
I was told it is 10/32 (drawing two zeroes with prior info), and it follows the rule of 1/8 ((i+1)/2^i +(7-i)/2^(i+1) ). Just can`t figure out.
That says you didn't state the problem correctly. The formula works if the bits are drawn consecutively from the file. Suppose you draw two consecutive bits. prob that one is an MSB is 1/4, so prob of two zeroes is 1/4 * 1/2 + 3/4 * 1/2 * 1/2 = 5/16.
You already know prob that first was a 0 was 9/16. Using the joint probability rule, you can easily calculate the prob that second is a zero given the first was. Do you see how?
Cylab
#26
Oct11-12, 04:18 AM
P: 54
Quote Quote by haruspex View Post
That says you didn't state the problem correctly. The formula works if the bits are drawn consecutively from the file. Suppose you draw two consecutive bits. prob that one is an MSB is 1/4, so prob of two zeroes is 1/4 * 1/2 + 3/4 * 1/2 * 1/2 = 5/16.
You already know prob that first was a 0 was 9/16. Using the joint probability rule, you can easily calculate the prob that second is a zero given the first was. Do you see how?
Sorry for the statement, the bits are drawn consecutively (successive bits). So the problem is about Prob of drawing a 0 bit, 2 successive 0 bits(00) and 3 successive 0 bits(000) respectively. I just couldn`t make correct calculation, sorry.
PLEASE!!!!!!!!!!!
haruspex
#27
Oct11-12, 11:50 AM
Homework
Sci Advisor
HW Helper
Thanks
P: 9,877
OK. First we need the probability that N successive bits are all zero. For now, assume N < 9. What is the probability that the N include an MSB? N/8, right?
If they do include an MSB, what is the prob they are all 0? There must be exactly one MSB, so it's 2-(N-1) ok? And if they don't, it's 2-N. So in total 2-N(1+N/8).
Now suppose we know the first N-1 were all zero and we want the prob that the Nth is too. By the conditional/joint probability rule, we can just take the ratio of two of these probs:
2-N(1+N/8)/(2-(N-1)(1+(N-1)/8)) = (8+N)/(2(7+N))
Do you also need to investigate N > 8?
Cylab
#28
Oct18-12, 04:34 AM
P: 54
Thanks, let me understand your explanation first. I think it takes sometime!
Cylab
#29
Nov22-12, 05:12 AM
P: 54
Quote Quote by haruspex View Post
OK. First we need the probability that N successive bits are all zero. For now, assume N < 9. What is the probability that the N include an MSB? N/8, right?
1st: the Prob of having 3 successive 0 bits(000) is ;
3/8*1/2 + 5/8*1/2*1/2 = 11/32.

Since you computed of having 2 successive 0 bits is 10/32.
So my calculation of (000) seems wrong?

2nd : if it follows to the rule of (2^-N)(1+N/8),
Then 5 successive 0 bits(00000) would be:
2^5(1+5/8)= 13/512

So my calculation of (000) seems wrong.
Will you please share your thoughts?
haruspex
#30
Nov22-12, 03:04 PM
Homework
Sci Advisor
HW Helper
Thanks
P: 9,877
Quote Quote by Cylab View Post
1st: the Prob of having 3 successive 0 bits(000) is ;
3/8*1/2 + 5/8*1/2*1/2 = 11/32.
How do you get that? Should be 3/8*1/2*1/2 + 5/8*1/2*1/2*1/2 = 11/64
2nd : if it follows to the rule of (2^-N)(1+N/8),
Then 5 successive 0 bits(00000) would be:
2^5(1+5/8)= 13/512
No, 13/256
Cylab
#31
Nov25-12, 03:34 AM
P: 54
Quote Quote by haruspex View Post
How do you get that? Should be 3/8*1/2*1/2 + 5/8*1/2*1/2*1/2 = 11/64

No, 13/256
You are right. Thanks.
Please let me confirm about the "1/2".
1st.
Analysis: The number of "1/2 " at the right part (of +) is 2^-(N-1), and left is 2^-N.
right? but Where the " number of 1/2 " comes from? Is that because of the N that is number of successive zero bits taken from ASCII?
2nd.
If N>8, then we should assume it is impossible. In other words, the answer should be 0. Am I correct?
haruspex
#32
Nov25-12, 02:05 PM
Homework
Sci Advisor
HW Helper
Thanks
P: 9,877
Quote Quote by Cylab View Post
Where the " number of 1/2 " comes from? Is that because of the N that is number of successive zero bits taken from ASCII?
Yes. If N ≤ 8, one of the N might be a leading bit, but only one.
With probability N/8, there is a leading bit. That bit will be 0. The other N-1 may be 0 or 1, equally likely. So prob that all N are 0 is 2-(N-1).
With prob 1-N/8, there is no leading bit. All N may be 0 or 1, equally likely. So prob that all N are 0 is 2-N.
Adding up: (N/8)*2-(N-1) + (1-N/8)*2-N = 2-N(1+N/8).
If N>8, then we should assume it is impossible. In other words, the answer should be 0.
Not at all. You are now guaranteed one leading 0, and the question becomes whether there's 1 or 2. In general, if we write N = 8A+B, B < 8, what do you think the formula would be?
Cylab
#33
Nov25-12, 06:44 PM
P: 54
Quote Quote by haruspex View Post
Adding up: (N/8)*2-(N-1) + (1-N/8)*2-N = 2-N(1+N/8).
Thanks. the explanations are really helpful.
So in other words,
「We can assume that there are seven bits (or N<8) and that there is at most one MSB bit (which means either one MSB or zero MSB) in it. Thus we can compute the probability of having zero MSB, plus the probability of having one MSB. right ?」


Quote Quote by haruspex View Post
Not at all. You are now guaranteed one leading 0, and the question becomes whether there's 1 or 2. In general, if we write N = 8A+B, B < 8, what do you think the formula would be?
Analysis:
1st. as the same token, If we let N be 15bits , then the it must contain at least one MSB or at most two MSBs. We add the probability of having one MSB and having two MSBs in it.
2nd. further, when N is 37 bits, then it must contain at least four or at most five MSBs. And so on ...
So what it turns out to be? Just replacing N with 8A+B seems not sufficient for the computations, unless I understand what 'A', 'B' stand for.
Will you furnish some explanations further please?
haruspex
#34
Nov25-12, 07:30 PM
Homework
Sci Advisor
HW Helper
Thanks
P: 9,877
Quote Quote by Cylab View Post
1st. as the same token, If we let N be 15bits , then the it must contain at least one MSB or at most two MSBs. We add the probability of having one MSB and having two MSBs in it.
How about you complete that and post the formula you get?
2nd. further, when N is 37 bits, then it must contain at least four or at most five MSBs. And so on ...
So what it turns out to be? Just replacing N with 8A+B seems not sufficient for the computations, unless I understand what 'A', 'B' stand for.
A and B are integers, 0 ≤ B < 8, N=8A+B. That completely defines A and B. A = integer part of N/8; B = N modulo 8.
Cylab
#35
Nov25-12, 08:37 PM
P: 54
Quote Quote by haruspex View Post
How about you complete that and post the formula you get?

A and B are integers, 0 ≤ B < 8, N=8A+B. That completely defines A and B. A = integer part of N/8; B = N modulo 8.
OK! But the analysis concentrates on MSB only (say, N=7); so the formula maybe like;
P(MSB)=P(MSB |H0)P(H0) + P(MSB |H1)P(H1)
(Let Hi be the event that there are i MSB bits in N, for i = 0, 1, 2, 3….. )
where P(MSB |H0) stands for conditional probability of MSB bit in N given it is H0 which equals 0 (no MSB) and P(MSB |H1) stands for conditional probability of MSB bit in the N given it is H1 which equals 1/7 (one MSB);

But it is little different from the point of N=8A+B. right?
I could not figure out the computation.
haruspex
#36
Nov25-12, 11:08 PM
Homework
Sci Advisor
HW Helper
Thanks
P: 9,877
Quote Quote by Cylab View Post
the analysis concentrates on MSB only (say, N=7); so the formula maybe like;
P(MSB)=P(MSB |H0)P(H0) + P(MSB |H1)P(H1)
(Let Hi be the event that there are i MSB bits in N, for i = 0, 1, 2, 3.. )
where P(MSB |H0) stands for conditional probability of MSB bit in N given it is H0 which equals 0 (no MSB) and P(MSB |H1) stands for conditional probability of MSB bit in the N given it is H1 which equals 1/7 (one MSB);
That's all true, but I'm not sure where it gets you. What are you trying to solve here?
Are we done with the N≤8 case?


Register to reply

Related Discussions
How read text files on CASIO fx-9860GII SD? Calculators 2
Several ascii files, read, write, average? Programming & Computer Science 5
Read multiple binary files to ascii Programming & Computer Science 0
Fortran 90 question about reading files with text Programming & Computer Science 1
Indexing a lot of text/pdf files Computing & Technology 1