New Reply

Probability of 0 bit in ASCII text files

 
Share Thread
Nov25-12, 08:37 PM   #35
 

Probability of 0 bit in ASCII text files


Quote by haruspex View Post
How about you complete that and post the formula you get?

A and B are integers, 0 ≤ B < 8, N=8A+B. That completely defines A and B. A = integer part of N/8; B = N modulo 8.
OK! But the analysis concentrates on MSB only (say, N=7); so the formula maybe like;
P(MSB)=P(MSB |H0)P(H0) + P(MSB |H1)P(H1)
(Let Hi be the event that there are i MSB bits in N, for i = 0, 1, 2, 3….. )
where P(MSB |H0) stands for conditional probability of MSB bit in N given it is H0 which equals 0 (no MSB) and P(MSB |H1) stands for conditional probability of MSB bit in the N given it is H1 which equals 1/7 (one MSB);

But it is little different from the point of N=8A+B. right?
I could not figure out the computation.
Nov25-12, 11:08 PM   #36
 
Recognitions:
Homework Helper Homework Help
Science Advisor Science Advisor
Quote by Cylab View Post
the analysis concentrates on MSB only (say, N=7); so the formula maybe like;
P(MSB)=P(MSB |H0)P(H0) + P(MSB |H1)P(H1)
(Let Hi be the event that there are i MSB bits in N, for i = 0, 1, 2, 3….. )
where P(MSB |H0) stands for conditional probability of MSB bit in N given it is H0 which equals 0 (no MSB) and P(MSB |H1) stands for conditional probability of MSB bit in the N given it is H1 which equals 1/7 (one MSB);
That's all true, but I'm not sure where it gets you. What are you trying to solve here?
Are we done with the N≤8 case?
Nov26-12, 12:20 AM   #37
 
Quote by haruspex View Post
That's all true, but I'm not sure where it gets you. What are you trying to solve here?
Are we done with the N≤8 case?
How about investigation of N > 8 case?
Just can`t quiet figure out how N=8A+B works.
Nov26-12, 02:34 PM   #38
 
Recognitions:
Homework Helper Homework Help
Science Advisor Science Advisor
Quote by Cylab View Post
How about investigation of N > 8 case?
Just can`t quiet figure out how N=8A+B works.
N=8A+B, B<8, means there are A whole bytes and B odd bits, so either A or A+1 MSBs included.
See what gaps you can fill in here:
Prob that this includes A+1 MSB's = ....?; if A+1 MSBs, prob that all N bits are 0 is ...?
Prob that this includes A MSB's = ....?; if A MSBs, prob that all N bits are 0 is ...?
Adding this up, prob that all N bits are 0 is ...?
Nov27-12, 05:48 AM   #39
 
Quote by haruspex View Post
N=8A+B, B<8, means there are A whole bytes and B odd bits, so either A or A+1 MSBs included.
See what gaps you can fill in here:
Prob that this includes A+1 MSB's = ....?; if A+1 MSBs, prob that all N bits are 0 is ...?
Prob that this includes A MSB's = ....?; if A MSBs, prob that all N bits are 0 is ...?
Adding this up, prob that all N bits are 0 is ...?
Quote by haruspex View Post
if A+1 MSBs, prob that all N bits are 0 is ...?
That is not easy for me Sir.
Analysis: Since the N bits are drawn consecutively from ASCII, there is only 1 character (out of 2^256), which is are all 0. So only 1 MSB. Thus, the Prob =1/2^256.
Others seem to follow the conception , or I misunderstood your point?

Question:
1) Pr[0] in ASCII (assume each character appears with same ratio) equals = 1/8+1/2=5/8. Is it OK?
2) Successive 7 bits are drawn at random from ASCII bits (e.g. no bias of character distribution), what is Pr[0] in the 7 bits?
Successive 4 bits are drawn (same condition with above), what is Pr[0] in the 4 bits?
So, say, Successive N bits are drawn (same condition) , what is Pr[0] in the N bits?
Analysis:
Do you think it is same case? remember you explained that N/8*2^-N + (1-N/8)*2^-(N+1) . Does the formula apply to the case of 2).

Shed some lights on please.
Nov27-12, 02:26 PM   #40
 
Recognitions:
Homework Helper Homework Help
Science Advisor Science Advisor
Quote by Cylab View Post
Analysis: Since the N bits are drawn consecutively from ASCII, there is only 1 character (out of 2^256), which is are all 0. So only 1 MSB. Thus, the Prob =1/2^256.
The N bits might start in the middle of one byte, span several whole bytes, and finish part way through the last. For each whole byte, prob of all zeroes is 1/128 (since MSB always zero).
If the N bits include A MSBs then how many non-MSBs do they include?
What is the prob that the A MSBs are all 0?
What is the prob that the non-MSBs are all 0?
So what is the prob that all N bits are 0?
Question:
1) Pr[0] in ASCII (assume each character appears with same ratio) equals = 1/8+1/2=5/8. Is it OK?
No. 1/8 that bit is MSB, so 7/8 that it is non-MSB. P[0] = 1/8 + 7/8*1/2 = 9/16. We went through that much earlier in the thread.
2) Successive 7 bits are drawn at random from ASCII bits (e.g. no bias of character distribution), what is Pr[0] in the 7 bits?
Successive 4 bits are drawn (same condition with above), what is Pr[0] in the 4 bits?
So, say, Successive N bits are drawn (same condition) , what is Pr[0] in the N bits?
Analysis:
Do you think it is same case? remember you explained that N/8*2^-N + (1-N/8)*2^-(N+1) . Does the formula apply to the case of 2).
I believe you are misquoting the formula. In post #27 I explained that the prob of N consecutive 0 bits (N≤8) is 2-N(1+N/8).
Dec1-12, 12:55 AM   #41
 
Quote by haruspex View Post
I believe you are misquoting the formula. In post #27 I explained that the prob of N consecutive 0 bits (N≤8) is 2-N(1+N/8).
That is right and it was good explanation.
Now, say, two N consecutive bits are taken (or two groups) (N1=7, and N2=4).
So the prob is different in following cases in comparing of same amount of 0`s?
1st case: (in N1=7) Pr[0], Pr[00],..,Pr[0000] =?
2nd case: (in N2=4) Pr[0], Pr[00],..,Pr[0000] =?
Dec1-12, 02:17 PM   #42
 
Recognitions:
Homework Helper Homework Help
Science Advisor Science Advisor
Quote by Cylab View Post
1st case: (in N1=7) Pr[0], Pr[00],..,Pr[0000] =?
2nd case: (in N2=4) Pr[0], Pr[00],..,Pr[0000] =?
I don't understand your question. What does Pr[0] mean in the context of N1=7? Is it the probability that the next bit is zero given the preceding 7 were?
Dec1-12, 10:15 PM   #43
 
Quote by haruspex View Post
I don't understand your question. What does Pr[0] mean in the context of N1=7? Is it the probability that the next bit is zero given the preceding 7 were?
Sorry!
I meant the prob of 0 within N1 =7 consecutively drawn from ASCII .
In other words, say, now we have a group of bits consists of many N1, each of which is consecutive 7 bits drawn from ASCII. what is Prob[0], Prob[00] in the group respectively?

2nd case. another group of bits with same condition, where N2=4. What is Prob[0], Prob[00] within the second group respectively?

Is (N1)Prob[0] = (N2)Prob[0] right, or should it be unequal?
Dec2-12, 12:18 AM   #44
 
Recognitions:
Homework Helper Homework Help
Science Advisor Science Advisor
Quote by Cylab View Post
Is (N1)Prob[0] = (N2)Prob[0] right,
Of course. The bits don't know how many others were chosen.
Dec2-12, 12:34 AM   #45
 
Quote by haruspex View Post
Of course. The bits don't know how many others were chosen.
So you are saying following are correct?

(N1=7)Prob[0] = (N2=4)Prob[0]
(N1=7)Prob[00] = (N2=4)Prob[00]
(N1=7)Prob[000] = (N2=4)Prob[000]
....
Dec2-12, 02:10 PM   #46
 
Recognitions:
Homework Helper Homework Help
Science Advisor Science Advisor
To be completely clear:
If you choose N consecutive bits, the probability that the first R of those bits are all zero (R <= N), depends only on R. It cannot depend on N.
Further, if you choose N consecutive bits, then choose R consecutive bits from those N, the probability that the first R of those bits are all zero depends only on R.
(This seems so obvious that I worry that I have not understood the question.)
Dec2-12, 08:40 PM   #47
 
Quote by haruspex View Post
To be completely clear:
If you choose N consecutive bits, the probability that the first R of those bits are all zero (R <= N), depends only on R. It cannot depend on N.
Further, if you choose N consecutive bits, then choose R consecutive bits from those N, the probability that the first R of those bits are all zero depends only on R.
(This seems so obvious that I worry that I have not understood the question.)
case: N1=7 & N2=4 . Assume N is taken from X bits, which is ASCII.
take R=2 bits from N1 & N2 respectively, what is prob that they are two 0 bits.

1st. (N1 case) : {(9X/16)C2 * (7X/16)C5 } / xC7.
1st. (N2 case) : {(9X/16)C2 * (7X/16)C2 } / xC4.

Seems it depends on N too.
Dec2-12, 08:52 PM   #48
 
Recognitions:
Homework Helper Homework Help
Science Advisor Science Advisor
Quote by Cylab View Post
1st. (N1 case) : {(9X/16)C2 * (7X/16)C5 } / xC7.
I have no idea what that notation means.
Dec2-12, 09:03 PM   #49
 
Quote by haruspex View Post
I have no idea what that notation means.
Hypergeometric Distribution.

X: number of ASCII bits , from which N is taken.
Pr[0] = 9/16.

xC7 : The number of combinations of x , taken 7 at a time.
Dec2-12, 09:11 PM   #50
 
Recognitions:
Homework Helper Homework Help
Science Advisor Science Advisor
Quote by Cylab View Post
Hypergeometric Distribution.

X: number of ASCII bits , from which N is taken.
Pr[0] = 9/16.

xC7 : The number of combinations of x , taken 7 at a time.
So how do I read (9X/16)C2? If I plug in X=7, that gives (63/16)C2, which is meaningless.
Dec2-12, 09:54 PM   #51
 
Quote by haruspex View Post
So how do I read (9X/16)C2? If I plug in X=7, that gives (63/16)C2, which is meaningless.
C: combinations
Pr[0] = 9/16.
X: number of ASCII bits , from which N is taken.
case: N1=7 & N2=4 . Assume N is taken from X bits, which is ASCII.
Other definitions should be clear
New Reply

Similar discussions for: Probability of 0 bit in ASCII text files
Thread Forum Replies
How read text files on CASIO fx-9860GII SD? Calculators 2
several ascii files, read, write, average? Programming & Comp Sci 5
Read multiple binary files to ascii Programming & Comp Sci 0
Fortran 90 question about reading files with text Programming & Comp Sci 1
indexing a lot of text/pdf files Computing & Technology 1