Conditional Prob of 0 in ASCII

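In summary, the thread asks for the probability that a bit is 0 in an ASCII text file treated as a bit string. The most significant bit (MSB) of each byte is always 0, and a 7-bit character has 2^7 = 128 possible values. The original poster assumes every character is equally likely, so that each of the other bits satisfies Pr(0) = Pr(1) = 1/2, and proposes Pr(0) = 1/8 + 1/2 = 5/8. The replies show that this sum is wrong: the two conditional probabilities must be weighted by how often each kind of bit occurs.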
  • #1
Cylab
What is the probability of a 0 in an ASCII text file, treating the file as a bit string?

Analysis:
The MSB (the first bit of each byte) is always 0, and it is 1 of the 8 bits in a byte, so Pr(MSB) = 1/8.
Each character takes one of 2^7 = 128 values. Assuming the characters are randomly distributed (each appears in the text in the same ratio), the bits other than the MSB follow the rule Pr(0) = Pr(1) = 1/2.

Thus, in total (MSB + other bits), Pr(0) = 1/8 + 1/2 = 5/8.
Is this correct?

Further, suppose n bits are selected at random from all the bits of the ASCII text file.
What is the probability that the bit at the t-th position is 0, i.e. Pr[t = 0],
where 0 < t <= n? And what is the probability that the bits at the t-th and (t+1)-th positions are 00 (two zeros in a row)?

Could anyone please shed some light on this?

Thanks in advance for your attention.
 
  • #2
Your first answer isn't correct. By your argument, the probability of a 1 is 0+1/2=1/2. Then the probability of a 0 or a 1 is p(0)+p(1)=5/8+1/2=9/8.

You need to weight your p(0|MSB) by p(MSB) and p(0|not MSB) by p(not MSB).

I'm not sure quite what you're asking in the second part. Are you just taking n bits at random, or picking the n bits following a randomly chosen bit? If the first, are you allowing the same bit to be picked twice (or more), or no more than once?
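
As a minimal sketch of that weighting, assume each of the 128 seven-bit codes 0-127 is equally likely and is stored in an 8-bit byte (an assumption, since no distribution has been fixed). Then p(0) = p(0|MSB) p(MSB) + p(0|not MSB) p(not MSB) = 1 * 1/8 + 1/2 * 7/8 = 9/16, and likewise p(1) = 0 * 1/8 + 1/2 * 7/8 = 7/16, so the two sum to 1. The Python below checks this by brute-force counting; the function name is illustrative only.

```python
from fractions import Fraction


def zero_bit_probability_uniform_ascii() -> Fraction:
    """Exact P(bit = 0), ASSUMING every code 0..127 is equally likely."""
    zeros = 0
    total = 0
    for code in range(128):
        bits = format(code, "08b")  # 8-bit byte; the leading bit is the MSB
        zeros += bits.count("0")
        total += len(bits)
    return Fraction(zeros, total)


p0 = zero_bit_probability_uniform_ascii()
p1 = 1 - p0
print(p0, p1)  # 9/16 7/16
```

The same value applies to any single bit drawn uniformly at random from the file, which is why the sampling scheme only starts to matter once several bits are considered together.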
 
  • #3
In what language? An English text file will have a somewhat different distribution of letters than, say, a German one, and so forth.

This is not really a problem that is amenable to any simple analysis. It will depend on the distribution of ALL of the characters: how much punctuation is used, etc.

The one thing that I can definitely say is that the probability of a zero is slightly greater than the probability of a 1 because normal text files pretty much NEVER use any of the characters above 127, so right away, you've got 1/8 of the bits always being a zero.
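
Since the answer depends on the actual character distribution, one option is to measure it on a given file rather than derive it. The following is a rough sketch; `zero_bit_fraction` and the sample sentence are made up for illustration, and the result for real text will vary with its content.

```python
def zero_bit_fraction(data: bytes) -> float:
    """Fraction of 0 bits in data, counting all 8 bits of every byte."""
    if not data:
        raise ValueError("data must not be empty")
    total_bits = 8 * len(data)
    one_bits = sum(bin(byte).count("1") for byte in data)
    return (total_bits - one_bits) / total_bits


# Hypothetical sample; substitute the bytes of any ASCII file.
sample = "The quick brown fox jumps over the lazy dog.".encode("ascii")
print(zero_bit_fraction(sample))
```

For a real file, `open(path, "rb").read()` would supply the bytes.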
 
  • #4
NOTE: Assume the characters are randomly distributed.
1st. What is the probability of a 0 in ASCII?
2nd. Now we have n bits taken randomly from ASCII.
What is the probability that the bit at the t-th position of the randomly taken n bits is 0 (0<t<n)?
3rd. Now taking 2 consecutive bits, what is the probability that they are 00?
 
  • #5
I am reading your note as meaning that each symbol in the range 0-127 is equally probable. This is not what it says. You have not specified a distribution, so phinds' comment is reasonable. I will assume a uniform distribution because I would guess that is what you mean if you don't specify - but you should.

I've given you a hint how to do the first one in my previous post.

For the second one, you need to explain whether you are drawing bits with replacement or without.

For the third part, you have identified two separate reasons why a bit might be zero. What are the conditional probabilities on a zero as the second bit?
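
One way to work through the third part is sketched here under explicit assumptions that the thread does not confirm: characters are independent and uniform over 0-127, the first bit of the pair is equally likely to fall at any of the 8 offsets within a byte, and the second bit is the next bit in the file. If the first bit is the MSB, it is 0 and the next bit is 0 with probability 1/2. If the first bit is the last bit of a byte, it is 0 with probability 1/2 and the next bit is an MSB, hence 0. For the other six offsets both bits are ordinary, giving 1/4. Weighting gives 1/8 * 1/2 + 6/8 * 1/4 + 1/8 * 1/2 = 5/16. The enumeration below checks that figure; the function name is illustrative.

```python
from fractions import Fraction


def prob_two_consecutive_zeros() -> Fraction:
    """Exact P(00) for two adjacent bits, ASSUMING uniform, independent codes 0..127."""
    hits = 0
    trials = 0
    for first in range(128):
        for second in range(128):
            # Two neighbouring bytes written out as a 16-bit string.
            bits = format(first, "08b") + format(second, "08b")
            for t in range(8):  # offset of the first bit within the first byte
                trials += 1
                if bits[t] == "0" and bits[t + 1] == "0":
                    hits += 1
    return Fraction(hits, trials)


print(prob_two_consecutive_zeros())  # 5/16
```

Under the same assumptions a single bit is 0 with probability 9/16, and (9/16)^2 = 81/256 differs from 5/16 = 80/256, so adjacent bits are not independent.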
 
  • #6
Random distribution clearly means that the ratio of each character is equal.
Your prob 9/8... what is that? Why do you write the prob of 1?
Your writing is absolutely not related to the problem.

E.g. the 2nd says one bit... why "with replacement or without"??
I would strongly recommend you read before you write...
 
  • #7
Cylab said:
Random distribution means clearly that the ratio of each character is equal.
Not in English. Random means that it is not possible to predict a result from other results. Paint one side of a cubic die red and the other five blue. Throw the die many times and record the colour of the top surface each time. The sequence of colours is random, but there will be around five times as many blue results as red ones.

You do appear to mean that each letter is equally probable. Fair enough - but precision is very important in statistics, and if you do not learn the right words, other people will not understand you and you will get responses like mine and phinds'.

Cylab said:
Your prob 9/8... what is that? Why do you write the prob of 1?
To show you that your answer was incorrect. A bit can only be a 1 or a 0, so p(0) + p(1) must equal 1. If you take the same reasoning you used to arrive at p(0)=5/8 and use it to calculate p(1), you will arrive at p(1)=1/2. That makes p(0)+p(1)=9/8. So your reasoning is wrong. I told you how to correct it in my first post. If you didn't understand me that's fine, and I will try to explain further.

Cylab said:
Your writing is absolutely not related to the problem.

E.g. the 2nd says one bit... why "with replacement or without"??
That is simply a restatement of the last sentence in my first post - a question you have not answered. Again, if you did not understand then I am happy to explain further.
 
  • #8
FWIW - ASCII defines 7-bit combinations for characters. I think you mean something else, like extended ASCII.

Phinds is correct.

http://www.iana.org/assignments/character-sets google for: ANSI_X3.4-1968 (exact name for ASCII)

You can define your problem however it suits you, but you should be aware of what the standard definition of something is, so you don't confuse others.
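
To make the distinction concrete, here is a small sketch of checking whether some bytes are strict 7-bit ASCII, i.e. whether every byte has its most significant bit clear. The helper name `is_seven_bit_ascii` is illustrative, and Latin-1 is used only as one example of an 8-bit "extended" encoding.

```python
def is_seven_bit_ascii(data: bytes) -> bool:
    """True if every byte is below 0x80, i.e. its most significant bit is 0."""
    return all(byte < 0x80 for byte in data)


print(is_seven_bit_ascii("plain text".encode("ascii")))  # True
print(is_seven_bit_ascii("café".encode("latin-1")))      # False
```

Only when this check passes does the "MSB is always 0" argument apply to the whole file.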
 

1. What is "Conditional Prob of 0 in ASCII"?

"Conditional Prob of 0 in ASCII" refers to the probability that a bit in an ASCII text file, read as a bit string, is 0, conditioned on where the bit sits in its byte (the most significant bit or one of the other seven).

2. How is the conditional probability of 0 in ASCII calculated?

The probability of a 0 bit is calculated by weighting each conditional probability by how often its condition occurs: p(0) = p(0|MSB) p(MSB) + p(0|not MSB) p(not MSB). In 7-bit ASCII the MSB is always 0 and accounts for 1 of every 8 bits; the remaining term depends on the distribution of characters in the text.

3. What are some examples of conditions or constraints that can affect the conditional probability of 0 in ASCII?

Some examples include the distribution of characters in the text (language, amount of punctuation), whether the file is strictly 7-bit ASCII or uses an extended character set, and, when several bits are sampled, whether they are drawn with or without replacement and whether they are consecutive.

4. Why is the conditional probability of 0 in ASCII important?

The conditional probability of 0 in ASCII can provide useful insights for data analysis and prediction, particularly in fields such as computer science and linguistics. It can also help identify patterns and anomalies within a set of data.

5. How can the conditional probability of 0 in ASCII be used in practical applications?

The conditional probability of 0 in ASCII can be used in a variety of practical applications, such as text processing and analysis, error detection and correction, and natural language processing. It can also be used in data compression and encryption algorithms.
