Normalization: discrete vs. continuous

In summary, this thread discusses a continuous roulette wheel and the meaning of probability density for a continuous distribution. The confusion arises because the probability of any single point in a continuous set is zero. Non-zero probabilities belong only to intervals of non-zero width, and the probability density is the function that gets integrated over such an interval. The thread also addresses the difference between discrete and continuous sets: a continuous set can be divided into intervals in any way without changing the mathematical properties of the set.
  • #1
Pythagorean
So, I'm taking an EE class and my teacher is terribly handwavy. She couldn't really explain this to me (not homework, lecture). I detect a fundamental problem in the math, coming from a science background, but it could just be my ignorance:

Here's her lecture:

physical setup: a continuous roulette wheel returns a random variable: 0 < x < 2pi

normalization:

[tex]\int_{0}^{2\pi} P\,dx = 1[/tex]; the x range is 2*pi, so for the total area to equal one, the probability is constantly 1/(2*pi) for every value of x.

Here is where my red flag goes up. If x is truly continuous, wouldn't the probability of hitting any particular value of x be 0, since there are infinitely many values of x between 0 and 2pi?

This implies to me, that x isn't continuous and that there is actually some delta-x instead of dx.

What is my issue here?
 
  • #2
Pythagorean said:
This implies to me, that x isn't continuous and that there is actually some delta-x instead of dx.

What is my issue here?

A roulette wheel represents a set of discrete outcomes. If you know calculus, you know about continuous functions and evaluating the integral

[tex]\int_{a}^{b} f(x)\,dx = F(b)-F(a)[/tex].

Clearly if a = b, then F(b) = F(a) and the integral equals zero. So for a continuous probability distribution, you can't have a non-zero probability at a point. What you can have is a probability between two distinct points, given by the integral of the probability density function (PDF) between them. Note that one point can be at infinity, depending on the PDF. If both points are at infinity, then the probability is 1, provided the PDF is defined over that range.

However, for a continuous uniform distribution such as your roulette wheel with an infinite number of points, there can't be points at infinity, since every interval of equal width would have to have the same probability, and that probability would be zero if the range extended to infinity. (Points at infinity is just a term for a limit at infinity.)

Therefore, you can define your intervals as small as you wish, but to have a non-zero probability the intervals must have non-zero width, and the number of intervals must be finite if each interval has the same (non-zero) probability.

So for example your wheel of [0,[tex]2\pi[/tex]] radians would be rescaled to [0,1] and broken down into n equal intervals, each with a probability of 1/n. You can see why n cannot be infinite.
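To make the point-versus-interval distinction concrete, here is a minimal Python sketch. It assumes the wheel is modeled as a uniform distribution on [0, 2*pi]; the names `DENSITY` and `interval_probability` are made up for illustration and are not from the lecture.

```python
import math

TWO_PI = 2 * math.pi
DENSITY = 1 / TWO_PI  # constant PDF value on [0, 2*pi] (assumed uniform wheel)


def interval_probability(a: float, b: float) -> float:
    """P(a <= x <= b) for the uniform wheel: density times interval width."""
    if not (0 <= a <= b <= TWO_PI):
        raise ValueError("need 0 <= a <= b <= 2*pi")
    return DENSITY * (b - a)


# The whole wheel carries total probability 1 (up to rounding).
total = interval_probability(0.0, TWO_PI)

# A single point is an interval of zero width, so its probability is 0.
point = interval_probability(1.0, 1.0)

# Shrinking the interval shrinks the probability; the density never changes.
widths = [1.0, 0.1, 0.001]
shrinking = [interval_probability(1.0, 1.0 + w) for w in widths]

print(total, point, shrinking)
```

The density stays at 1/(2*pi) no matter how narrow the interval gets. Only the probability, which is density times width, goes to zero.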
 
  • #3
Ok, that's what I thought. But since the normalization leads to 1/(2pi) doesn't that determine our intervals? Which is kind of difficult since it's not an inverse integer.

Also, I'm still confused why we call it continuous when it's not.
 
  • #4
Pythagorean said:
Ok, that's what I thought. But since the normalization leads to 1/(2pi) doesn't that determine our intervals? Which is kind of difficult since it's not an inverse integer.

In terms of radian measure, the width of your intervals will be 2pi/n, but your probabilities are based on the entire space having an integral of one, so you need to transform from radian measure to probability measure: an interval of width 2pi/n has probability (1/(2pi))(2pi/n) = 1/n.
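The transformation from radian measure to probability measure can be sketched as follows. This is only an illustration under the assumption of a uniform wheel; `partition_wheel` is a hypothetical helper name, and n = 8 is an arbitrary choice.

```python
import math


def partition_wheel(n: int):
    """Split [0, 2*pi) into n equal arcs.

    Returns a list of (start, end, probability) tuples, where the
    probability of each arc is the uniform density times the arc width.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    width = 2 * math.pi / n      # arc width in radians
    density = 1 / (2 * math.pi)  # uniform PDF value (assumption)
    return [(k * width, (k + 1) * width, density * width) for k in range(n)]


arcs = partition_wheel(8)
probabilities = [p for _, _, p in arcs]
print(probabilities[0], sum(probabilities))
```

So 1/(2pi) does not fix the interval width. Any finite n works, and each arc ends up with probability 1/n regardless of the fact that 2pi is not an integer.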


Pythagorean said:
Also, I'm still confused why we call it continuous when it's not.

The interior of each interval is still continuous, but your probability is assigned to the width of the interval. It's continuous in the sense that you can define the intervals [a,b] any way you want, whereas with a discrete countable set you're sort of "stuck" with what you have.

EDIT: With a set of n discrete elements, you cannot divide the elements. If you change the number of elements, you change the cardinality of the set. All real intervals of non-zero width have the same cardinality, so you are free to define intervals without changing the mathematical properties of the set.
 
  • #5


I understand your concern and it is important to clarify the concept of normalization in relation to discrete and continuous variables. In simple terms, normalization is the process of adjusting values to a common scale in order to make meaningful comparisons. In the context of probability, normalization ensures that the total probability of all possible outcomes is equal to 1.

In the case of a continuous variable, such as the roulette wheel in your example, the probability of hitting any particular value of x is indeed 0. However, this does not mean that the variable is not continuous. It means that probability is assigned to intervals rather than to single points. In a continuous system there are infinitely many possible values between any two points, so no single value can carry a positive probability while the total still comes to 1.

The concept of normalization takes this into account by considering the probability of a range of values rather than a single value. In the case of the roulette wheel, the probability of landing in a range is the density 1/(2*pi) times the width of the range. For example, the probability of landing between 0 and 1 radian is 1/(2*pi), about 0.159, whereas the probability of landing exactly on 0.5 is zero.

In contrast, for a discrete variable, such as a coin flip, the probability of a specific value is a finite non-zero number, because there is a limited number of possible outcomes. In this case, when the outcomes are equally likely, normalization is achieved by dividing the number of outcomes in the range of interest by the total number of possible outcomes.

Therefore, your teacher's normalization is correct, provided 1/(2*pi) is read as a probability density rather than a probability. The density is constantly 1/(2*pi) for every value of x, and integrating it over the full range from 0 to 2pi gives a total probability of 1. I hope this helps to clarify the concept of normalization for you.
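A quick simulation can serve as a sanity check on the above. This is a rough sketch, not part of the original lecture: it assumes Python's `random.Random.uniform` as a stand-in for the wheel, and the seed and number of spins are arbitrary. Floating-point numbers are themselves a finite set, so this only approximates a truly continuous wheel.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
TWO_PI = 2 * math.pi
N_SPINS = 200_000

# Each spin returns an angle drawn uniformly from [0, 2*pi].
spins = [rng.uniform(0.0, TWO_PI) for _ in range(N_SPINS)]

# Count spins landing in the interval [0, 1] and spins landing exactly on 0.5.
in_range = sum(1 for x in spins if 0.0 <= x <= 1.0)
exact_hits = sum(1 for x in spins if x == 0.5)

fraction_in_range = in_range / N_SPINS
expected = 1 / TWO_PI  # density times interval width (width = 1)

print(f"fraction in [0, 1]: {fraction_in_range:.4f}")
print(f"expected 1/(2*pi):  {expected:.4f}")
print(f"exact hits on 0.5:  {exact_hits}")
```

The fraction of spins in the interval should settle near 1/(2*pi), while exact hits on a single value essentially never occur.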
 

1. What is the difference between discrete and continuous normalization?

Discrete normalization refers to the process of transforming data that is measured in distinct categories or intervals, such as whole numbers, into a standard scale. Continuous normalization, on the other hand, involves transforming data that is measured on a continuous scale, such as decimals or fractions, into a standard scale.

2. Why is normalization important in scientific research?

Normalization is important because it allows for comparisons to be made between different sets of data. By transforming data into a standard scale, it eliminates any differences in units or scales that may exist and allows for more accurate and meaningful analysis.

3. How is normalization typically performed?

The most common method of normalization is called z-score normalization. This involves subtracting the mean of the data from each data point and then dividing by the standard deviation. This results in a normalized value with a mean of 0 and a standard deviation of 1.
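As a small illustration of the z-score formula described above, here is a sketch using only Python's standard `statistics` module. The function name `z_scores` and the sample data are made up for the example, and the population standard deviation is used as an assumption (a sample standard deviation would give slightly different values).

```python
import statistics


def z_scores(values):
    """Subtract the mean from each value, then divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    if std == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mean) / std for v in values]


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, population std 2
normalized = z_scores(data)
print(normalized)
```

After the transformation, the values have mean 0 and standard deviation 1, which is what makes data sets with different units comparable.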

4. Are there any limitations or drawbacks to normalization?

One potential limitation of z-score normalization is that it is most meaningful when the data are roughly normally distributed. For strongly skewed data, the mean and standard deviation may describe the data poorly. Additionally, outliers can dominate the mean and standard deviation, and rescaling discards the original units of the data.

5. How does normalization relate to other statistical techniques like standardization and scaling?

Normalization, standardization, and scaling are all related techniques used to transform data into a standard scale. While normalization and standardization are similar and often used interchangeably, scaling typically refers to the process of converting data into a specific range, such as between 0 and 1. These techniques are all used to make data more comparable and easier to analyze.
