Gaussian signal, extract uniform distribution of values

In summary: A circuit amplifies noise from a reverse-biased zener diode and digitizes it with a 10-bit ADC, giving approximately Gaussian integer samples centered at 512. The power spectrum shows one prominent peak at zero and a low, fuzzy baseline elsewhere. The original poster gets roughly uniform values between 0 and 99 by keeping only the two least significant decimal digits of each sample. The replies point out that this result only looks uniform, and suggest either mapping the samples through the normal CDF or reworking the circuit as a random bit generator.
  • #1
Mr Peanut
Hello,

From an offset zener diode breakdown circuit, I have collected a set of bytes from an ADC. The values distribute normally as integers between 0 and 1024 with a mean of 512. I would like to use the data to create a set of random integers that distribute uniformly.

So far, I have tried taking the subset of the values that lie between 100 and 999, then:

val = val/100
val = val - floor(val)
val = val * 100

This gives me a uniform distribution of values between 0 and 99 (provided I collect enough data).
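For reference, the three steps above amount to taking the value modulo 100. A minimal Python sketch, assuming the samples are plain integers (the function names are made up for illustration):

```python
import math

def last_two_digits(samples):
    """Keep samples in 100..999 and return their two least significant decimal digits."""
    kept = [v for v in samples if 100 <= v <= 999]
    return [v % 100 for v in kept]

def last_two_digits_float(v):
    """The three steps from the post, applied literally to one value."""
    x = v / 100
    x = x - math.floor(x)
    x = x * 100
    return round(x)  # rounding removes tiny floating-point residue

print(last_two_digits([50, 100, 512, 999, 1000]))  # [0, 12, 99]
```

The integer form avoids the floating-point round trip entirely, so it is probably the safer way to write it.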

Is there a better way? Perhaps one that provides more than 100 possible values.
 
  • #2
You can't have a normal distribution between 0 and 1024 (not to mention a normal distribution which gives integers!). It could be approximately normal, though. What's the standard deviation?
 
  • #3
Thinking more about it: do you by any chance have a binomial distribution? If so, then forget my question about standard deviation; your distribution is already determined: [itex](n, p) = (1024, 0.5)[/itex].

If your data don't look like this, I guess you could fit them to a beta-binomial, or some other suitable discrete distribution with finite support...

I also wanted to say that your procedure won't give you uniform random numbers. They probably look uniform, but you're fooling yourself. Try computing the actual probabilities of any two numbers from 0 to 99: I'd be extremely surprised if they turn out to be the same.

You say it's uniform "provided [you] collect enough data". Ironically, it only looks uniform if you don't collect enough data! :-)
 
  • #4
Noise from a reverse-biased diode has a Gaussian PDF. In my experiments, the AC voltage leaving a reverse-biased zener sub-circuit is ~5 mA. An oscilloscope monitoring the output shows a promising fuzz. I amplify the signal using a 2-stage amplifier that magnifies the signal by ~500X and centers it at 2.5V above ground (max ~5V, min ~0). This signal is fed into a serial ADC with 10-bit resolution over an input range of 0 - 5 volts. I sample the output bytes in increments of 50 ms. Typically I collect 32768 samples. The results, if plotted sequentially, mimic the oscilloscope, displaying a fuzzy band of noise with increasing point density towards a horizontal trend line at 512.

I can determine the number of times each integer occurs in the 32768 data set. Plotting the integer values vs the number of occurrences per value shows a typical Gaussian distribution. The mean of the distribution is 512 (+/- 10 run to run). The standard deviation varies depending on the characteristics of the zener diode, but ~180 is typical.

Simultaneously plotting an appropriately scaled normal PDF computed for values between 0 and 1024 using the data set's mean and standard deviation convinces me that I have sampled a Gaussian distribution. The power spectrum shows one prominent peak at zero and a low, but fuzzy, baseline elsewhere. Computing means, minima, and maxima of successive dyadic splits of the data set indicates that no unwanted clumping of values is occurring.

I would like to exploit the noise and generate random numbers. To do this, I need to determine a way to create a uniform distribution that exploits the stochastic variation in the data set.

The one way I have found is to discard all values less than 100 or greater than 999. Then, removing the most significant digit from each value results in a data set with values between 0 and 99. The probability density function of the resulting data set is a uniform distribution. And, the more points in the data set, the less chatter there is in the distribution. But... the diversity of the values is limited to 0 – 99. Clearly, I can collect 10 data sets and get 0 – 999, etc.

Is there a better way? If not, I'll probably rework the circuit as a random bit generator and assemble the integers bitwise.
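As a rough illustration of the bitwise idea, here is a minimal Python sketch that packs a stream of bits into fixed-width integers. The function name and the most-significant-bit-first ordering are assumptions, not something specified in the thread. If the bits are independent and unbiased, each packed value is uniform on 0 to 2^width - 1.

```python
def bits_to_ints(bits, width):
    """Pack a flat sequence of 0/1 bits into integers of `width` bits each (MSB first)."""
    usable = len(bits) - len(bits) % width  # drop an incomplete trailing group
    values = []
    for start in range(0, usable, width):
        value = 0
        for bit in bits[start:start + width]:
            value = (value << 1) | bit
        values.append(value)
    return values

print(bits_to_ints([1, 0, 1, 1, 0, 0, 0, 1, 1], 4))  # [11, 1]
```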
 
  • #5
Mr Peanut said:
Noise from a reverse-biased diode has a Gaussian PDF. In my experiments, the AC voltage leaving a reverse-biased zener sub-circuit is ~5 mA. An oscilloscope monitoring the output shows a promising fuzz. I amplify the signal using a 2-stage amplifier that magnifies the signal by ~500X and centers it at 2.5V above ground (max ~5V, min ~0). This signal is fed into a serial ADC with 10-bit resolution over an input range of 0 - 5 volts. I sample the output bytes in increments of 50 ms. Typically I collect 32768 samples. The results, if plotted sequentially, mimic the oscilloscope, displaying a fuzzy band of noise with increasing point density towards a horizontal trend line at 512.

That helps me understand. I was confused because Gaussians have continuous output and no boundaries, but your system is discrete and bounded.

I thought the reverse-biased zener diode had Poisson noise, not Gaussian? But presumably in the range you're measuring a Gaussian would be a good approximation.

Mr Peanut said:
I can determine the number of times each integer occurs in the 32768 data set. Plotting the integer values vs the number of occurrences per value shows a typical Gaussian distribution. The mean of the distribution is 512 (+/- 10 run to run). The standard deviation varies depending on the characteristics of the zener diode, but ~180 is typical.

±10 strikes me as too much variation. I would have expected 180 / sqrt(32768), which is more like ±1. (See standard deviation of the mean.)
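The arithmetic behind that estimate, as a short Python check using the figures quoted in the thread (standard deviation of about 180, 32768 samples per run) and assuming the samples are independent:

```python
import math

sd = 180.0   # typical standard deviation reported in the thread
n = 32768    # samples per run
sem = sd / math.sqrt(n)  # standard error of the mean for independent samples
print(f"standard error of the mean: {sem:.3f}")
```

The result is just under one ADC count, so a run-to-run drift of ±10 in the mean would point to something other than sampling noise.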

Mr Peanut said:
Simultaneously plotting an appropriately scaled normal PDF computed for values between 0 and 1024 using the data set's mean and standard deviation convinces me that I have sampled a Gaussian distribution. The power spectrum shows one prominent peak at zero and a low, but fuzzy, baseline elsewhere. Computing means, minima, and maxima of successive dyadic splits of the data set indicates that no unwanted clumping of values is occurring.

I would like to exploit the noise and generate random numbers. To do this, I need to determine a way to create a uniform distribution that exploits the stochastic variation in the data set.

The one way I have found is to discard all values less than 100 or greater than 999. Then, removing the most significant digit from each value results in a data set with values between 0 and 99. The probability density function of the resulting data set is a uniform distribution. And, the more points in the data set, the less chatter there is in the distribution. But... the diversity of the values is limited to 0 – 99. Clearly, I can collect 10 data sets and get 0 – 999, etc.

Is there a better way? If not, I'll probably rework the circuit as a random bit generator and assemble the integers bitwise.

Again: using the least significant digits may look uniform, but I'd be extremely surprised if it actually is. I would expect that if your experiment is highly repeatable, and you take a large number of samples, there would be a residual pattern in the frequencies. With roughly 327 samples per value (32768 / 100), I'd expect a relative standard deviation for each number of 1/sqrt(327), or about 5%. I'm not surprised you can't detect the non-uniformity at the 5% level. And sqrt() goes down rather slowly, so you'd have to take a lot of samples to see it.
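One way to check this claim is to compute the exact probability of each two-digit value under an idealized model. The Python sketch below assumes a perfect Gaussian with mean 512 and standard deviation 180 (the typical figures from the thread), and assumes each ADC code v collects the probability mass on [v - 0.5, v + 0.5). Real hardware will differ, so treat the numbers as indicative only.

```python
import math

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def residue_probabilities(mean=512.0, sd=180.0, lo=100, hi=999):
    """Probability of each value of v % 100, given lo <= v <= hi."""
    mass = [0.0] * 100
    for v in range(lo, hi + 1):
        # Assumed quantization model: code v covers [v - 0.5, v + 0.5).
        mass[v % 100] += normal_cdf(v + 0.5, mean, sd) - normal_cdf(v - 0.5, mean, sd)
    total = sum(mass)
    return [m / total for m in mass]

probs = residue_probabilities()
spread = (max(probs) - min(probs)) / (1 / 100)  # relative to the ideal 1/100
noise = 1 / math.sqrt(32768 / 100)              # per-bin sampling noise, as estimated in the post
print(f"relative spread of exact probabilities: {spread:.4f}")
print(f"relative sampling noise per bin:        {noise:.4f}")
```

Under these assumptions the exact probabilities are not equal, but their spread is smaller than the per-bin sampling noise of a single 32768-sample run, which is consistent with the histogram looking flat.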

Anyway... why even have the ADC? Couldn't you just take analog readings? It seems like digitizing discards some information which you could be using.

The cumulative distribution function (CDF) of each value will have a uniform distribution from 0 to 1. For discretized data such as you have, that will only be approximately true (but it may be good enough).

On the other hand, your bitwise generator idea sounds good. I'd have a lot more confidence in a 50/50 distribution from random noise. Even if you're a bit off (no pun intended!), you still have almost as much entropy. A crappy biased generator with p=0.4 (instead of p=0.5) still has 97% of the entropy per bit.
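The 97% figure follows from the binary entropy function. A minimal Python check:

```python
import math

def binary_entropy(p):
    """Shannon entropy in bits of a single bit that equals 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(f"{binary_entropy(0.4):.3f}")  # 0.971
print(f"{binary_entropy(0.5):.3f}")  # 1.000
```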
 
  • #6
Mr Peanut said:
and centers it at 2.5V above ground (max ~5V, min ~0). This signal is fed into a serial ADC with 10-bit resolution over an input range of 0 - 5 volts.

How stable is the above arrangement as the circuits age, change temperature, etc.? Should the mathematical technique be robust enough to work if things get out of calibration?
 
  • #7
Thanks Chogg,

Back to the circuit board for me. Bitwise seems to be the way to go.
 
  • #9
As @chogg said, if you plug your original numbers into the Cumulative Distribution Function (CDF) of the normal distribution, the resulting values should be uniform on (0,1). Unfortunately, the CDF of the normal distribution is an integral, so you may need to use a table of values. There are very accurate tables available. MATLAB has a function, normcdf, that can be used.
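A table is not strictly required where the error function is available, since the normal CDF can be written in terms of it. Below is a minimal Python sketch of the transform using `math.erf` from the standard library. The samples are simulated with an assumed mean of 512 and standard deviation of 180 as a stand-in for real ADC data, and the helper names are made up for illustration.

```python
import math
import random

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def to_uniform(samples, mean, sd):
    """Map Gaussian samples to (0, 1) through the normal CDF."""
    return [normal_cdf(x, mean, sd) for x in samples]

rng = random.Random(0)
samples = [rng.gauss(512.0, 180.0) for _ in range(20000)]  # simulated stand-in data
uniform = to_uniform(samples, 512.0, 180.0)

# Histogram in 10 equal bins; each bin should hold roughly 2000 values.
counts = [0] * 10
for u in uniform:
    counts[min(int(u * 10), 9)] += 1
print(counts)
```

With real ADC codes the mean and standard deviation would have to be estimated from the data, and the output can take at most as many distinct values as the ADC has codes, so the result is only approximately uniform.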
 

1. What is a Gaussian signal?

A Gaussian signal is one whose sample values follow a Gaussian (normal) distribution, a probability distribution with a bell-shaped density. The normal distribution is commonly used in statistics and data analysis to model natural phenomena in the physical and social sciences.

2. How is a Gaussian signal different from a uniform distribution?

A Gaussian signal is different from a uniform distribution in that it is not evenly spread out across all values. Instead, it is centered around a mean value and tapers off towards the edges. This means that there is a higher probability of obtaining values near the mean and a lower probability of obtaining values further away from the mean.

3. Why would someone want to extract a uniform distribution of values from a Gaussian signal?

Most applications of random numbers expect every value in a range to be equally likely, which is what a uniform distribution provides. In this thread, the goal is to turn Gaussian noise from a diode circuit into uniformly distributed random integers.

4. How can a uniform distribution be extracted from a Gaussian signal?

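One way to extract a uniform distribution from a Gaussian signal is the probability integral transform: pass each sample through the cumulative distribution function (CDF) of the normal distribution fitted to the data, which yields values that are approximately uniform on (0, 1). The Box-Muller transform is sometimes mentioned in this context, but it works in the opposite direction: it takes pairs of random numbers from a uniform distribution and converts them into pairs of random numbers that follow a Gaussian distribution.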

5. What are some applications of Gaussian signals and uniform distributions in science?

Gaussian signals and uniform distributions have many applications in various fields of science, including physics, biology, economics, and psychology. They can be used to model natural phenomena, analyze data, and make predictions. For example, Gaussian signals are commonly used in signal processing and communication systems, while uniform distributions are often used in statistical analysis and simulation studies.
