Mathematica: Binning 2D data

  1. Feb 21, 2012 #1

    Say I have a data set (x_i, y_i) such as of the form

    {4.4077, 8.41282*10^-7},
    {9.39964, 3.3636*10^-6},
    {14.3781, 7.56237*10^-6},
    {19.3462, 0.00001343},
    {24.3073, 0.0000209557},

    The first coordinate is time, the second a weight. The sum of the
    weights equals 1, so I am dealing with a probability density function.
    I want to bin all the data points such that data points with their x-
    coordinates within some range delta_x are binned together, and the
    total y-coordinate of that bin should be the sum of all the y's of the
    data points in the respective bin. So basically something like a

    Is there a function for doing this in Mathematica?

    Best regards.
  3. Feb 21, 2012 #2
    If you had 1D data then BinLists[] would group your data.

    For 2D data perhaps GatherBy[] with a function you construct that takes the First[] of your pair and returns an integer which represents the bin you would like the data to go into. Piecewise[] might be appropriate to construct that function, but you would need to figure out how to use that.
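
    The GatherBy[] idea above could look roughly like the following. This is only a sketch under assumptions: the bin width dx = 10 and the helper name binIndex are made up for illustration, and Floor[] stands in for Piecewise[] because equal-width bins do not need an explicit case list.

    ```mathematica
    (* Sample data from post #1: {time, weight} pairs *)
    data = {{4.4077, 8.41282*10^-7}, {9.39964, 3.3636*10^-6},
       {14.3781, 7.56237*10^-6}, {19.3462, 0.00001343},
       {24.3073, 0.0000209557}};

    dx = 10; (* assumed bin width, not from the original post *)

    (* Hypothetical helper: integer bin index of a pair,
       so bin k covers k*dx <= x < (k+1)*dx *)
    binIndex[{x_, _}] := Floor[x/dx];

    (* Group the pairs that share a bin index *)
    groups = GatherBy[data, binIndex];

    (* For each group: {left edge of the bin, sum of the weights in it} *)
    binned = SortBy[
       {dx*binIndex[First[#]], Total[#[[All, 2]]]} & /@ groups, First];
    ```

    With this bin width the five sample points fall into three bins with left edges 0, 10 and 20, and the summed weights still add up to the total of the original weights.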
  4. Feb 25, 2012 #3
    Thanks, I'll give it a shot.

  5. Feb 25, 2012 #4
    A question very similar to yours came up here


    and there is some useful information there. For example, a person points out that BinLists can handle a list of vectors, not just a list of scalars. If you click on


    and then click on More Information and later on Scope then you can see some examples of how it can handle vectors. I missed this completely on my first reading because I didn't drill down into those subsections. But even after reading it I am uncertain how to use this to do exactly what you are looking for.
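
    One possible way to use the vector form of BinLists[] for this, offered as a sketch rather than a definitive answer: bin on x with the desired width and give y a single bin that is wide enough to hold every weight, so the grouping is effectively on x alone. The ranges {0, 30, dx} and {0, 1, 1} are assumptions chosen to cover the sample data from post #1.

    ```mathematica
    (* Sample data from post #1: {time, weight} pairs *)
    data = {{4.4077, 8.41282*10^-7}, {9.39964, 3.3636*10^-6},
       {14.3781, 7.56237*10^-6}, {19.3462, 0.00001343},
       {24.3073, 0.0000209557}};

    dx = 10; (* assumed bin width *)

    (* x-bins [0,10), [10,20), [20,30); one y-bin [0,1) holding all weights *)
    bins = BinLists[data, {0, 30, dx}, {0, 1, 1}];

    (* bins[[i, 1]] is the list of pairs in x-bin i; sum their weights.
       An empty bin gives Total[{}] = 0. *)
    sums = Total[Last /@ #[[1]]] & /@ bins;
    ```

    Note that BinLists[] drops points that fall outside the given ranges, so the x and y limits have to cover the whole data set.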
  6. Feb 26, 2012 #5
    I have explained the background behind why I am doing this operation here: https://www.physicsforums.com/showthread.php?t=581194. If it turns out that my reasoning has been wrong, then maybe I won't even have to do the binning. But I have to wait and see what smarter people would do.

    Thanks for taking the time.

    Last edited: Feb 26, 2012
  7. Feb 26, 2012 #6
    Having no idea what your density function is, either for position or velocity, I don't know whether this would be possible or not, but I will ask.

    Is there any way you could define distributions for position and velocity and then apply (not in the Mathematica sense) the velocity to the position and come up with new distributions for the resulting position and velocity?

    If you could do that then it sounds like that might accomplish what I think you are trying to do, although you might not have thought of it in these terms.

    I ask because years ago I had a good application for being able to simulate with the fundamental elements being distributions, not a sample of scalars. I didn't spend the time to track down whether anyone had found a good way of doing this or determined that it was not feasible.
  8. Feb 27, 2012 #7
    It is a Maxwell-Boltzmann distribution. I only look at the velocity, so I don't see any way to implement your idea, but it is a good suggestion. I will keep thinking about this.

    Worst case, I guess the easiest way is just to use a couple of loops. I just thought Mathematica might have some smart function to do it for me - it usually does!
  9. Feb 27, 2012 #8
    If you want to post a VERY simple example, "here are 3 position, velocity pairs as input and what I would like to have are these 3 position, velocity as output" or some other very simple tangible example without a lot of irrelevant details to wade through then someone might be able to show you a line or a few lines that would do what you want to do.
  10. Mar 2, 2012 #9
    Sorry for not replying until now, but I believe I have fixed my problem. What I noticed was that all the "degenerate" data points were bunched around some value X, so I just divided the data into two lists - one with points near X and one with points far from X. From there on the binning is trivial.

    However, this has produced a new problem regarding interpolating functions. I think it is best I create a new thread about this. But thanks for helping me this far, that was very kind of you.

    Best regards,