Shaping probability distribution function

In summary, the conversation derives the governing equation of a nonlinearity through which a normally distributed signal can be passed to make its probability density function (pdf) uniform. The solution uses the error function (erf) and a substitution variable to warp the signal's axis and achieve the uniform mapping. The conversation also touches on Monte-Carlo simulation and the Box-Muller transform, which generates normally distributed random numbers from uniform ones.
  • #1

Homework Statement



The incoming signal has a normal distribution, with xmin equal to -sigma and xmax equal to +sigma. What is the governing equation of the nonlinearity through which the signal has to be passed in order to make its pdf uniform?

Homework Equations



http://en.wikipedia.org/wiki/Normal_distribution

The Attempt at a Solution



I have already found that the signal needs to be passed through erf(x/sqrt(2)), which is closely related to the CDF of the normal distribution. The problem is that I cannot find a mathematical proof.
 
  • #2

It's very easy. Say you have a continuous random variable X with a strictly increasing cumulative distribution function F(x) on an x-interval [a,b] (possibly a = -∞ and b = +∞). The probability density of X is f(x) = (d/dx) F(x). Now look at Y = F(X); that is, for each observation x of X we let the observation of Y be y = F(x). What is the distribution of Y? The probability of x < X < x + dx is f(x)*dx, so with y <--> x and y + dy <--> x + dx we have
P{y < Y < y + dy} = f(x)*dx. If g(y) is the probability density of Y, we therefore have g(y)*dy = f(x)*dx. But dy/dx = (d/dx) F(x) = f(x), so dy = f(x)*dx, hence we must have g(y) = 1; that is, Y is uniform on (0,1).
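The argument above can be checked empirically. The following is a minimal sketch (not from the thread): draw standard-normal samples, push them through the standard normal CDF written in terms of erf, and verify that the output lands roughly evenly across the ten deciles of (0,1).

```python
import random, math

def normal_cdf(x):
    # Standard normal CDF expressed via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]
y = [normal_cdf(x) for x in samples]

# If Y = F(X) is uniform on (0,1), each decile should hold ~10% of the samples.
counts = [0] * 10
for v in y:
    counts[min(int(v * 10), 9)] += 1
print([c / len(y) for c in counts])
```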

All this is very standard in Monte-Carlo simulation, where it is used to generate samples from non-uniform distributions: we generate Y uniform on (0,1), then obtain our sample of X from x = F⁻¹(y) (at least, in those cases where the inverse function is known and not too hard to compute).
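As a concrete sketch of that inverse-transform recipe (my example, not from the thread), take the exponential distribution with rate lam, whose CDF F(x) = 1 - exp(-lam*x) inverts in closed form:

```python
import random, math

def sample_exponential(lam, n, seed=0):
    # Inverse-transform sampling: x = F^{-1}(u) = -ln(1 - u) / lam
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

xs = sample_exponential(2.0, 100_000)
mean = sum(xs) / len(xs)  # the exponential mean is 1/lam
print(mean)
```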

RGV
 
  • #3
RGV, thanks for your feedback. I believe I have already proved it using a nonlinearly distributed substitute variable "a" in place of the linear variable "x". The two variables can be unambiguously mapped onto each other, warping the "x" axis so that the pdf ultimately becomes uniform.

I dare to say I understand your reasoning here, I will just need a little bit of time to absorb it.

The integral of the pdf

[itex]
f(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{x}{\sigma})^2}
[/itex]

is the cdf

[itex]
F(x) = \frac{1}{2}\left[1 + erf\left(\frac{x}{\sqrt{2}\sigma}\right)\right]
[/itex]

which is normalized to range over (0,1) and equals 1/2 at the zero mean. Since [itex]\sigma[/itex] already appears inside the argument of erf, the same mapping gives a uniform output for any [itex]\sigma[/itex]; the scaling is just a detail.
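That sigma-independence is easy to check numerically. A minimal sketch (my own, not from the thread; sigma = 3.0 is an arbitrary choice): pass N(0, sigma²) samples through the erf-based CDF and verify the output has the mean and variance of Uniform(0,1).

```python
import random, math

sigma = 3.0
rng = random.Random(1)
# Map each Gaussian sample through y = (1/2)[1 + erf(x / (sqrt(2)*sigma))]
y = [0.5 * (1.0 + math.erf(rng.gauss(0.0, sigma) / (math.sqrt(2.0) * sigma)))
     for _ in range(100_000)]

# Uniform(0,1) has mean 1/2 and variance 1/12.
m = sum(y) / len(y)
var = sum((v - m) ** 2 for v in y) / len(y)
print(m, var)
```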

Quite interesting info about Monte-Carlo simulations. It would make sense to transform a uniform distribution to resemble other kinds of distributions. I wonder whether MATLAB does it the same way.
 
  • #4
SunnyBoyNY said:
Quite interesting info about Monte-Carlo simulations. It would make sense to transform a uniform distribution to resemble other kinds of distributions. I wonder whether MATLAB does it the same way.
I'm not sure how Matlab does it, but a common way to generate Gaussian (normal) random numbers from uniform ones is the following trick: if [itex]u_1[/itex] and [itex]u_2[/itex] are independent random variables, uniformly distributed over (0,1], then
[tex]n_1 = \sqrt{-2 \log(u_1)} \cos(2\pi u_2)[/tex]
and
[tex]n_2 = \sqrt{-2 \log(u_1)} \sin(2\pi u_2)[/tex]
are independent Gaussian random variables with zero mean and unit variance. This is the so-called Box-Muller transformation:

http://en.wikipedia.org/wiki/Box–Muller_transform
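The transformation above can be sketched in a few lines (an illustrative implementation, not any particular library's): two independent Uniform(0,1] draws produce two independent standard normal draws.

```python
import math, random

def box_muller(rng):
    # Shift rng.random() from [0,1) to (0,1] so log(u1) is finite
    u1 = 1.0 - rng.random()
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

rng = random.Random(42)
pairs = [box_muller(rng) for _ in range(50_000)]
n1 = [a for a, _ in pairs]
mean = sum(n1) / len(n1)
var = sum((x - mean) ** 2 for x in n1) / len(n1)
print(mean, var)  # expect roughly 0 and 1 for a standard normal
```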
 
  • #5
jbuniniii,

Thanks for the info about the transform. That's why I love this forum. One question brings together many ideas and perspectives.
 

1. What is a probability distribution function (PDF)?

A probability distribution function (PDF) is a mathematical function that describes how probability is distributed over the possible values of a random variable. For a continuous variable it gives the relative likelihood of each value; the probability of a range of values is obtained by integrating the function over that range.

2. How is a PDF different from a probability mass function (PMF)?

A probability mass function (PMF) is used for discrete variables, while a probability distribution function (PDF) is used for continuous variables. A PMF assigns probabilities to individual values, while a PDF assigns probabilities to intervals of values.

3. What is the relationship between a PDF and a cumulative distribution function (CDF)?

The cumulative distribution function (CDF) is the integral of the PDF and represents the probability that a random variable takes on a value less than or equal to a given value. In other words, the CDF shows the accumulated probability up to a certain point, while the PDF shows the probability at a specific point.
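That integral relationship can be illustrated with a quick numerical check (a sketch, assuming the standard normal as the example distribution): integrating the pdf from far left up to x reproduces the closed-form CDF 0.5*(1 + erf(x/sqrt(2))).

```python
import math

def pdf(x):
    # Standard normal probability density
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf_numeric(x, lo=-10.0, steps=20_000):
    # Trapezoidal rule from a far-left cutoff up to x
    h = (x - lo) / steps
    s = 0.5 * (pdf(lo) + pdf(x))
    s += sum(pdf(lo + i * h) for i in range(1, steps))
    return s * h

x = 1.0
closed_form = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
print(cdf_numeric(x), closed_form)
```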

4. How can a PDF be used in data analysis?

A PDF can be used to summarize and analyze data by showing the distribution of a random variable. It can help identify the most common values, the spread of the data, and any outliers. It is also used in statistical modeling to make predictions and estimate probabilities.

5. What are some common probability distributions used in statistics?

Some common probability distributions used in statistics include the normal distribution, binomial distribution, Poisson distribution, and exponential distribution. These distributions are used to model different types of data and are essential in many statistical analyses and machine learning algorithms.
