Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

[Fortran90]Problem with Gaussian distribution

  1. Jul 18, 2015 #1
    Hi,
    I have a problem in my program and I cannot figure it out.
    In the last post I had a problem about some arrays, I perfectly resolved it thanks to you, but now my problem is a little bit subtle. I have a subroutine(here I'll post it has a program )that generates random numbers in order to create errors so that those must have a Gaussian distribution and it is perfect, the program works perfectly, I can have a really nice Gaussian until I centred my points (that is what I must do to associated the errors to my data). After I centred the points I get a weird accumulation of point near x=1 that I can't explain.
    Here is the program I'm talking about
    Code (Fortran):
    Program gauss
    implicit none

    character(len=6)::gaussn
    real,dimension(10000)::m,a,ltot,pron
    REAL::x,l
    INTEGER::i,n,k,p,j,f


    n=10000

    print*,"scrivere il nome del file"

    read*,gaussn


       OPEN(UNIT=1,FILE=gaussn)

    do f=1,n

    read(1,*)m(f)

    pron(f)=2*0.1*(m(f)-0.5)
    end do
    close (unit=1)


    a=0

    l=0.1/700


    do i=1,n


    p=int(pron(i)/l)

    a(p+(n/2+1))=a(p+(n/2+1))+1
    !a(p)=a(p)+1
    end do

    do j=1,n
    ltot(j)=-(n/2.0)+j

    write(unit=15,fmt=*)ltot(j),a(j)
    end do




    end program gauss
     

    The file that it reads is a column of data that I generate in my main program. I printed two files, one with the already centred points and one with the not centred ones. Now I would expect those two sets of points to give a Gaussian distribution but a you can see those are pretty different.
    This pron(f)=2*0.1*(m(f)-0.5) is the line that I use to centre my points, it's something that we did in class with the teacher. I even thought that maybe it was the minus sign that was bothering fortran and I did some experiments like changing it with a plus and defining m(f) as negative. In the main program if I use the centred points I have a "lack" of precision, if I expect my output to be -3 I'll get -2.993... with an error of order 10-7 when with the not centred ones I get -3.001... with an error of order 10 -3.
    I can't understand the problem if there is one or it is just me, maybe I'm missing something.
    Thanks in advanced.
     

    Attached Files:

  2. jcsd
  3. Jul 18, 2015 #2
    I don't quite get what you are trying to do...can you try again to better explain what you mean by "centering" the data? In which direction? x? y?

    f is just the array index, that's fine. But if m(f) is the y value of your data, what does "-0.5" had to do with centering? It is no like the maximum value is 1.0! If you want to center the data in the y direction (and allow negative values) then, you need to subtract half the maximum value from all data points, not just "half" (0.5). You maximum value is over 200...I don't see how taking 0.5 away from it centers it. In the same line, you divide by 10...are you scaling everything down? and then multiply by 2...scaling again?

    In the following loop you take pron and multiply by 7000! what is that about? Then, you take these function values and mix them up with the index variable n, in effect, you are kind of transposing the whole thing using function values as your index...is this on purpose? I am confused.
     
  4. Jul 18, 2015 #3
    I am sorry for not being clear enough. In the program I posted, m(f) are random numbers with gaussian distribution. In reality I tried to obtain random numbers with a gaussian distribution centred at the origin in two ways, but I always run into the same problem. My first attempt was to compute random numbers with uniform distribution between -0.1 and 0.1 and then to use them in order to create random numbers k(f) with a gaussian distribution centred at the origin. To compute them I used the following code
    Code (Fortran):
    do i=1,n

           call random_number(x(i))

    end do

    k(f)=sum(x)/n
    where the values of x(i) are random numbers between 0 and 1. The problem is that I would like to have a gaussian distribution centred at 0 and not at 1/2, which should be the centre of the gaussian distribution generated above because it is the distribution of the mean values of random numbers with expectation value 1/2, if I'm not wrong. Moreover, even though the fortran function random_number(x) generates random numbers between 0 and 1, I would like to have them between -0.1 and 0.1. So I needed to multiply the random numbers generated by random_number(x) by 2*0.1, which is the maximum width, and after that I needed to subtract half of the maximum value, which is 0.1, so that my random numbers should be of the form randomnumber=2*0.1*(x-0.5) where x is generated with random_number(x). But in doing so I ran into the problem I described above, because if I only plotted my gaussian distribution obtained with the code above I had no problem, but if I plotted my k(f) calculated with the following code
    Code (Fortran):
    do i=1,n

           call random_number(x(i))
           randomnumber=2*0.1*(x(i)-0.5)

    end do

    k(f)=sum(randomnumber)/n
    i found that weird crazy point. So I tried a second attempt, which is the one I quoted in my first post. This time I computed my k(f) with the first code, not including the line randomnumber=2*0.1*(x(i)-0.5), and then I tried to shift these numbers with gaussian distribution centered at 1/2 in order to obtain random numbers with a gaussian distribution centred at the origin. I thought that the same line could be used in order to shift the data because I still wanted a maximum width of 2*0.1 and still wanted the centre to be 0, but I am not sure of this. In the program above, the m(f) correspond to k(f). However I run again into exactly the same problem.
    The second part of the program is used to make the gaussian distribution. I defined "l" to be the width of a single interval, that is I wrote l=0.1/h where h is the number of intervals I want between 0 and 0.1. After that I computed p=int(pron(i)/l). In this way I should be able to compute the integer part of pron(i)/l where pron(i) are my shifted random numbers of the second attempt. Doing so I know that the pth interval contains one of my random numbers, in this case pron(i) (rigorously speaking it should be the (p+1)th but I don't think it should change much), and to keep track of this I add 1 to the (p+n/2+1)th component of the array "a" that I defined to be a=0 before the do cycle. I added n/2+1 to the index of "a" so that, when p=0 I obtain that the random number is assigned to the (n/2+1)th component. Finally, to do the plot of the gaussian distributions, I defined an index ltot(j) to be ltot(j)=-(n/2.0)+j so that, when I plot the points (a(i), ltot(i)), I should obtain a gaussian distribution centered at the origin. That was what I thought, but obviously there is a problem somewhere. I hope to have been more clear!
     
  5. Jul 18, 2015 #4
    maybe you should google a bit more...you are not the first one trying to do this.

    B the way, the manner to get random numbers within a range is: min + (max - min)*random_number(x) and not that way you are doing it...it is important that the multiplication (scaling) be done while all numbers are still positive as the scaling reduces the magnitude of the number towards zero.

    Here is the page of some guy who is nice enough to share his experience; once in that page, search for the word "normal" (which is another name for gaussian distribution).
     
  6. Jul 19, 2015 #5

    D H

    User Avatar
    Staff Emeritus
    Science Advisor

    I so strongly disagree. The page to which you referenced did things exactly wrong. (Prima facie evidence: That guy uses the original Numerical Recipes ran1. That's been known to be bad since sometime in the previous millennium. There is no reason to use something that isn't at least as good as Mersenne Twister nowadays.) In general, it's much better to use the following to convert a number randomly drawn from U(0,1) to a number randomly drawn from U(min,max) than your min+(max-min)*u_0_1_random_deviate:
    Code (Text):
    r = u_0_1_random_deviate
    u_min_max_random_deviate = max*r + min*(1.0-r)
     
  7. Jul 19, 2015 #6
    D H:
    Well, I am no random number expert and was simply providing a link since the OP could use some examples. I don't exactly know what you mean by Numerical Recipes ran1...the ran1 function in the referenced page does nothing but call Fortran90 instrinsic function random_number().

    By the way, the min+(max-min)*u_0_1_random formula is not my formula...in any case, it is exactly the same as yours.

    Other than that, forgive me if I don't share your passion, but please keep cool and don't kill the messenger, I am just trying to help.
     
  8. Jul 20, 2015 #7

    D H

    User Avatar
    Staff Emeritus
    Science Advisor

    If you had infinite precision, yes, they would be exactly the same. You don't have infinite precision. Your example uses double precision.

    Here's a key problem with using min+(max-min)*u_0_1_random instead of my recommended approach, max*u_0_1_random+min*(1.0d0-u_0_1_random). Suppose u_0_1_random is exactly one. Your version reduces to min+(max-min). On the other hand, my version reduces to max in this case.

    There is no guarantee that min+(max-min) is equal to max. There are in fact plenty of pairs of IEEE floating point numbers min and max such that min+(max-min) > max.
     
  9. Jul 20, 2015 #8
    I see, thanks for the explanation.
     
  10. Jul 20, 2015 #9

    D H

    User Avatar
    Staff Emeritus
    Science Advisor

    Your k(f) does not have a gaussian distribution. It has a Bates distribution, which is close to normal thanks to the central limit theorem and thanks to special properties of the uniform distribution. (It takes a lot more trials for the sum of independent exponentially distributed random variables converge toward a normal distribution than it does for the sum of independent uniformly distributed random variables.)

    You can use this as an approximation of a guassian distribution, but you need to account for two things:
    - The mean is not zero (it is 1/2)
    - The variance is not one (it is 1/(12*n)).

    You would have something close to [itex]N(0,1)[/itex] (the standard normal distribution, with mean 0 and variances 1) if you had used k(f)=sqrt(12.0d0*n)*(sum(x)/n-0.5). Converting from [itex]N(0,1)[/itex] to [itex]N(\mu,\sigma)[/itex] is fairly straightforward: Multiply by sigma and add mu.
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook




Similar Discussions: [Fortran90]Problem with Gaussian distribution
  1. Fortran90 help (Replies: 5)

Loading...