Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Calculating permutations for a normally distributed variable

  1. Jun 6, 2015 #1
    For three dice, you can have 6 * 6 * 6 = 216 permutations (order matters). The dice has a uniform probability distribution of p(x) = 1/6. Easy.

    I'm trying to get an estimate of how many permutations you can have if a variable has a normal probability distribution. So for example, if a variable has mean 100, standard deviation 5, how many permutations can I get if I measure the variable 3 times repeatedly? I suspect that someone's somehow going to include a %age in their answer...
     
    Last edited: Jun 6, 2015
  2. jcsd
  3. Jun 6, 2015 #2

    andrewkirk

    User Avatar
    Science Advisor
    Homework Helper
    Gold Member

    There will be infinitely many permutations because there are infinitely many values to be permuted, as the normal distribution is a continuous distribution - so the value of a draw from the normal dist is a continuous random variable (RV). Compare this to the dice, where the RV has only six different possible values - so we call it a 'discrete' RV.

    This can be fixed by rounding the normal RV to the nearest integer, but even that isn't enough, as there is no limit on the size of a random draw. Even with the dist you gave, there is an indescribably small, yet nonzero chance of a draw of 1 billion.

    To remove that, you need to limit the values the RV can take both above and below.

    For instance, if you define RV ##X## by ##X=round(\max(50,\min(150,100+5\cdot Z)))## where ##Z\sim N(0,1)## (ie ##Z## is a standard normal RV), then you can ask the question about ##X##. But the answer will be very boring. The number of permutations is simply ##101^n## where ##n## is the number of draws. Note that 101 is the number of different possible values of ##X##, although values outside the range ##85-115 will be very rare.

    What would be more interesting would be the frequency histogram of the different combinations that can be drawn.
     
  4. Jun 7, 2015 #3

    You're absolutely right - I know that the distribution is discrete. I just wrote normal for unknown reasons :confused:. I think it's actually a Poisson, and I'll elaborate...
    I threw in the dice example because I cross posted this question on another site that's pretty useless for answering my statistics questions. What I meant to say is that I've taken the RGB values of a single pixel, and tallied them over 10,000 frames from a webcam under static conditions. The values I got are as follows:-
    upload_2015-6-8_1-1-36.png
    I want to know how many permutations are possible for the unique values of that pixel. This would equate to how many unique colours that pixel might be. So you say that I need to set some hard limits on the values each channel can take - you seem to have implied +/- 3 standard deviations in your calculation?

    Side question - how could I generate your suggested frequency histogram of the different combinations that can be drawn?
     
  5. Jun 7, 2015 #4

    andrewkirk

    User Avatar
    Science Advisor
    Homework Helper
    Gold Member

    I doubt the distribution can be Poisson, as that distribution has no upper limit, and colour values in digitised video have an upper limit, since they are denoted by a fixed number of bits.

    If the potential colour values for each of R, G and B are the integers from 0 to ##2^n-1## (so if it's 24-bit colour, n is 8 and there are 256 different potential values for each of R G and B) then there are ##2^{3n}## permutations (unique colours) which for n=8 is 16,777,216.

    Simulation is the easiest way to generate a histogram, where there isn't an analytic formula for the compound distribution, which is probably the case here. I use R (r-cran) for simulation. Do you have that? It would only need a few lines of code to make a histogram. The difficulty is working out what you want your buckets to be and how you want to arrange them, since the buckets will be indexed by three dimensions R G and B.
     
  6. Jun 7, 2015 #5
    Andrew, there's a histogram in my first reply to you. Can you not see it? It's my first piccy post.
     
  7. Jun 7, 2015 #6

    andrewkirk

    User Avatar
    Science Advisor
    Homework Helper
    Gold Member

    Your picture is a histogram of the individual colour values - ie it's actually three separate histograms overlaid on one another and distinguished by the colours of the columns. It is not a histogram of the 3-colour combinations.
     
  8. Jun 7, 2015 #7
    Isn't this the crux of my question? How many buckets would there be? Are you saying that there is no way to calculate /estimate the number of potential buckets without actually measuring it? Does your 101n algorithm not apply any more in light of the histogram..?
     
  9. Jun 7, 2015 #8

    WWGD

    User Avatar
    Science Advisor
    Gold Member

    Aren't you looking to find the distribution of ## X \times X \times X ##, given you know the distribution of ##X##?
     
  10. Jun 8, 2015 #9

    andrewkirk

    User Avatar
    Science Advisor
    Homework Helper
    Gold Member

    Paul it's necessary to appreciate the difference between:
    1. identifying the different possibilities ('buckets'); and
    2. calculating the expected frequency of occurrence of each possibility.

    Step 1 is trivial for 24-bit colour. The answer is that the number of buckets is ##2^{24}=(2^8)^3=256^3=16,777,216##.
    Step 2 is where distributions and histograms come into play.

    You have three random variables R, G and B, each of which has a discrete distribution over the integers from 0 to 255. If the three distributions are identical, it becomes the problem that WWGD has described. If they are not, as is the case with the three-histogram graph you have shown, then the problem is more complex.

    This is a question about multivariate distributions, not about permutations.
     
  11. Jun 8, 2015 #10
    YES! You've hit it on the head. I just didn't know how to write it down.

    I know X because I have a histogram (shown earlier in this thread). Is the number of permutations really as simple as:-

    (width of red histogram) * (width of green histogram) * (width of blue histogram) ?

    So an approximation would be (33-18) * (53-20) * (29-14) = 7425 permutations /colours...
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook




Similar Discussions: Calculating permutations for a normally distributed variable
Loading...