How to fit a plane onto sampling data?

  • #1
10
0
For example, I have the variables x and y and a probability distribution p(x,y). I want to approximate p(x,y) by a linear function (a plane in this case), at least somewhere in the domain. However, I only have samples from the distribution. With a large amount of data it is easy to collect the samples into bins and fit a plane to the estimated density function. But I don't want to compress the data; I would like to use the points themselves. Is there a method to do this?
 

Answers and Replies

  • #2
FactChecker
Science Advisor
Gold Member
6,052
2,337
Multiple linear regression would tell you which numbers a1, a2, a3 give the best fit of a1*X + a2*Y + a3 through the data. The "best fit" is defined as the one that minimizes the sum of the squared errors between the plane and the data.
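
As a rough illustration of the least-squares fit described above, here is a minimal sketch using NumPy. The function name `fit_plane` and the synthetic data are assumptions made for the example. Note that it needs a z value at every (x, y) point, which raw samples from a distribution do not supply.

```python
import numpy as np

def fit_plane(x, y, z):
    """Return (a1, a2, a3) minimizing sum((a1*x + a2*y + a3 - z)**2)."""
    # Design matrix with columns x, y and a constant term for a3.
    A = np.column_stack([x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs

# Synthetic data (hypothetical): a known plane plus small noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
y = rng.uniform(0.0, 1.0, 200)
z = 2.0 * x - 1.0 * y + 0.5 + rng.normal(0.0, 0.01, 200)

a1, a2, a3 = fit_plane(x, y, z)
print(a1, a2, a3)  # should be close to 2.0, -1.0, 0.5
```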
 
  • #3
Stephen Tashi
Science Advisor
7,571
1,465
As FactChecker says, to fit a probability distribution, you can do a least squares fit of a plane, but you must add the constraint that the plane integrates to 1 over the domain. How to add that constraint is something we'd have to think about.
 
  • #4
Stephen Tashi
Science Advisor
7,571
1,465
As FactChecker says, to fit a probability distribution, you can do a least squares fit of a plane, but you must add the constraint that the plane integrates to 1 over the domain. How to add that constraint is something we'd have to think about.
Edit: But doing a linear regression would involve some kind of binning so the data would represent a frequency instead of isolated measurements. Perhaps a maximum likelihood fit is needed if you don't want to bin the data.
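
A maximum likelihood fit of this kind might look like the sketch below. It rests on several assumptions that are not in the thread: the domain is the unit square, and the density is written as p(x,y) = 1 + a*(x-0.5) + b*(y-0.5), which integrates to 1 for any a and b. The function names are hypothetical. The log-likelihood is maximized with a damped Newton iteration that only keeps the density positive at the sample points; it does not enforce |a| + |b| <= 2, which is what positivity on the whole square would require.

```python
import numpy as np

def sample_linear_density(a, b, n, rng):
    """Rejection-sample n points from p(x,y) = 1 + a*(x-0.5) + b*(y-0.5) on [0,1]^2."""
    p_max = 1.0 + 0.5 * abs(a) + 0.5 * abs(b)  # largest density value on the square
    pts = np.empty((0, 2))
    while len(pts) < n:
        cand = rng.uniform(0.0, 1.0, size=(2 * n, 2))
        p = 1.0 + a * (cand[:, 0] - 0.5) + b * (cand[:, 1] - 0.5)
        keep = rng.uniform(0.0, p_max, size=2 * n) < p
        pts = np.vstack([pts, cand[keep]])
    return pts[:n, 0], pts[:n, 1]

def fit_linear_density(x, y, max_iter=50):
    """Maximize sum(log(1 + a*(x-0.5) + b*(y-0.5))) over (a, b), with no binning."""
    F = np.column_stack([x - 0.5, y - 0.5])

    def loglik(theta):
        w = 1.0 + F @ theta
        return -np.inf if np.any(w <= 0.0) else np.sum(np.log(w))

    theta = np.zeros(2)  # start from the uniform density
    for _ in range(max_iter):
        w = 1.0 + F @ theta
        grad = F.T @ (1.0 / w)
        hess = -(F / (w ** 2)[:, None]).T @ F  # negative definite: log-likelihood is concave
        step = np.linalg.solve(hess, -grad)
        t = 1.0
        # Halve the step until the log-likelihood does not decrease.
        while t > 1e-8 and loglik(theta + t * step) < loglik(theta):
            t *= 0.5
        if t <= 1e-8:
            break
        theta = theta + t * step
        if np.linalg.norm(t * step) < 1e-10:
            break
    return theta

rng = np.random.default_rng(1)
x, y = sample_linear_density(0.8, -0.4, 20000, rng)
a_hat, b_hat = fit_linear_density(x, y)
print(a_hat, b_hat)  # should be near 0.8 and -0.4
```

Each sample contributes its own term to the likelihood, so no z value or frequency has to be attached to the points.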
 
  • #5
FactChecker
Science Advisor
Gold Member
6,052
2,337
Edit: But doing a linear regression would involve some kind of binning so the data would represent a frequency instead of isolated measurements. Perhaps a maximum likelihood fit is needed if you don't want to bin the data.
Sorry. I suggested regression without thinking enough. Never mind.
 
  • #6
10
0
"binning so the data would represent a frequency instead of isolated measurements."
This is what I do not want! So the question remains!
 
  • #7
Stephen Tashi
Science Advisor
7,571
1,465
It's difficult to give good advice about fitting a plane to "data" as a generality. What exactly is this data?

In particular, what are the bounds on x and y? Is there any theoretical reason to believe that (x,y) can exceed the ranges of x or y that were actually observed in the data? To fit a probability distribution, we must use a function whose integral over the possible (x,y) values is 1. So it's important to know if we have a good handle on the bounds of x and y.
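
To make the role of the bounds concrete, here is a short worked version of the constraint, assuming (hypothetically) that the domain is a rectangle [x_0, x_1] x [y_0, y_1] and the plane is written as a_1 x + a_2 y + a_3 as in post #2:

$$
\int_{y_0}^{y_1}\!\int_{x_0}^{x_1} (a_1 x + a_2 y + a_3)\,dx\,dy
= A\left(a_1 \bar{x} + a_2 \bar{y} + a_3\right) = 1,
\qquad A = (x_1 - x_0)(y_1 - y_0),\quad \bar{x} = \tfrac{x_0 + x_1}{2},\quad \bar{y} = \tfrac{y_0 + y_1}{2}.
$$

Solving for a_3 leaves only two free parameters:

$$
p(x,y) = \frac{1}{A} + a_1 (x - \bar{x}) + a_2 (y - \bar{y}),
\qquad \frac{1}{A} - \frac{|a_1|(x_1 - x_0)}{2} - \frac{|a_2|(y_1 - y_0)}{2} \ge 0 .
$$

The inequality is the condition that the plane stays non-negative at its lowest corner. Both the normalization and this bound depend on x_0, x_1, y_0, y_1, so a different choice of bounds gives a different admissible family of planes.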
 
  • #8
chiro
Science Advisor
4,790
132
What is the resolution and structure of the measurements?
 
  • #9
Svein
Science Advisor
Insights Author
2,129
686
For example, I have the variables x and y and a probability distribution p(x,y). I want to approximate p(x,y) by a linear function (a plane in this case), at least somewhere in the domain. However, I only have samples from the distribution. With a large amount of data it is easy to collect the samples into bins and fit a plane to the estimated density function. But I don't want to compress the data; I would like to use the points themselves. Is there a method to do this?
A plane is determined by three points. You can always extract three points from a distribution. But what if the distribution does not look like a plane?
[Attached image: upload_2015-1-31_19-41-36.png]

This is a normal distribution in x and y separately. How would you fit a plane onto that?
 
  • #10
chiro
Science Advisor
4,790
132
You need to pick a non-linear basis in this case rather than a linear one.

I would look at projecting your data points to some function that makes sense. If your function is a bi-variate normal then use that to start off with.

The topic of projection is covered in harmonic analysis and you would be using orthogonal polynomials within some interval.
 
  • #11
Stephen Tashi
Science Advisor
7,571
1,465
I would look at projecting your data points to some function that makes sense. If your function is a bi-variate normal then use that to start off with.
What would it mean to "project a data point"?
 
  • #12
chiro
Science Advisor
4,790
132
In linear spaces (including infinite dimensional spaces) you can project points (or even functions) to a basis.

A good example of something non-linear is when you project a function or a set of data points to sine and cosine bases. These are orthogonal (the proof is straightforward) and the inner product is <f,g> = Integral[a,b] f(x)g(x)dx for the one dimensional case. You can extend it into multiple dimensions.

The idea is that after the projection, you reconstruct it by figuring out the coefficients obtained from projecting it to that basis in much the same way we do projections in normal linear spaces (again using the inner product) and use this to re-construct the space.

In a linear space we use <f,e_x> where e_x is a basis vector with ||e_x|| = 1, and the same thing happens for infinite dimensional linear spaces, as with Fourier series, wavelets, and other structures with similar techniques - which are based on Hilbert space theory and harmonic analysis.
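
The projection and reconstruction described above can be sketched numerically. This example is an assumption-laden illustration: it takes f(x) = x on [-pi, pi], uses the unit-norm sine functions sin(kx)/sqrt(pi) as the basis, and approximates the inner product with a hand-written trapezoid rule.

```python
import numpy as np

def inner(f_vals, g_vals, x):
    """Approximate <f,g> = Integral[a,b] f(x)g(x)dx with the trapezoid rule."""
    h = f_vals * g_vals
    return np.sum((h[1:] + h[:-1]) * np.diff(x)) / 2.0

x = np.linspace(-np.pi, np.pi, 20001)
f = x.copy()  # the function to project (chosen for illustration)

# Sine basis functions scaled so that ||e_k|| = 1 on [-pi, pi].
basis = [np.sin(k * x) / np.sqrt(np.pi) for k in range(1, 51)]

# Projection: one coefficient <f, e_k> per basis function.
coeffs = [inner(f, e, x) for e in basis]

# Reconstruction: the linear combination of basis functions.
recon = sum(c * e for c, e in zip(coeffs, basis))

i = np.argmin(np.abs(x - 1.0))
print(f[i], recon[i])  # the truncated series approximates f at interior points
```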

This idea of projection of functions or data points (different techniques for both kinds) is done a lot. Image processing, Video processing, Audio processing, compression, signal processing (general signals) and many other applications make use of this.
 
  • #13
Stephen Tashi
Science Advisor
7,571
1,465
Because the OP doesn't want to bin the data, a data point is not a probability. So I don't see how one could "project" a data point onto a function whose values represent a probability density.
 
  • #14
chiro
Science Advisor
4,790
132
You turn it into something that can be projected - like a function.

Interpolation is one way to do this - but it's not the only way.

Once you have a function and a basis to project to you find <f,g> and do the same thing to re-construct the function.
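
One known way to obtain the coefficients <p,e> directly from samples, without interpolating or binning, is the orthogonal series density estimator. It is a swapped-in technique, not something stated in this thread: because <p,e> is the integral of p(x,y)e(x,y), it equals the expectation of e(X,Y) and can be estimated by a sample mean. The sketch below assumes the unit square as the domain and keeps only the degree-0 and degree-1 basis functions, so the result is a plane. The function name and test data are hypothetical.

```python
import numpy as np

def phi1(t):
    """Degree-1 orthonormal polynomial on [0,1]: sqrt(3)*(2t-1)."""
    return np.sqrt(3.0) * (2.0 * t - 1.0)

def plane_density_from_samples(x, y):
    """Estimate p(x,y) ~ 1 + sx*(x-0.5) + sy*(y-0.5) on the unit square."""
    c10 = np.mean(phi1(x))  # estimates <p, phi1(x)>
    c01 = np.mean(phi1(y))  # estimates <p, phi1(y)>
    # The constant coefficient is exactly 1 because p integrates to 1.
    # c10*phi1(x) = c10*sqrt(3)*(2x-1) = 2*sqrt(3)*c10*(x-0.5)
    return 2.0 * np.sqrt(3.0) * c10, 2.0 * np.sqrt(3.0) * c01

# Test data (hypothetical): x has density 2x on [0,1], y is uniform,
# so the true joint density is the plane p(x,y) = 2x = 1 + 2*(x-0.5).
rng = np.random.default_rng(2)
x = np.sqrt(rng.uniform(0.0, 1.0, 50000))
y = rng.uniform(0.0, 1.0, 50000)

slope_x, slope_y = plane_density_from_samples(x, y)
print(slope_x, slope_y)  # should be near 2.0 and 0.0
```

The estimated plane integrates to 1 by construction, but nothing here forces it to stay non-negative over the whole square.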

If you are projecting to a Normal distribution then you take the approximated function you get back and get the estimates for parameters using expectations for said final function. You will typically use a set of orthogonal polynomials with enough accuracy (based on the degree of the set of orthogonal polynomials) and then after you reconstruct it you get back a function that represents the new basis.

Normal distributions involve exponential terms but in a given interval you can always approximate it well enough by choosing a high enough degree polynomial.

If you have a lattice of points you would get a function that interpolates through all points - find an orthogonal basis for a bivariate polynomial of large enough degree - use the Gram-Schmidt orthogonalization process to get the orthogonal polynomials (using <f,g> as above) and then take your interpolated function and project it to the new basis.
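
Orthogonal polynomials for an inner product of the form <f,g> are commonly built with the Gram-Schmidt process. The sketch below is a one-dimensional illustration on the interval [-1, 1] (an assumed interval; a bivariate basis could be formed from products of such polynomials). It uses `numpy.polynomial.Polynomial`, and the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

def inner(p, q, a=-1.0, b=1.0):
    """<p,q> = Integral[a,b] p(x)q(x)dx, computed exactly for polynomials."""
    anti = (p * q).integ()
    return anti(b) - anti(a)

def gram_schmidt(degree):
    """Orthonormalize the monomials 1, x, ..., x**degree under inner()."""
    basis = []
    for n in range(degree + 1):
        p = Polynomial([0.0] * n + [1.0])  # the monomial x**n
        for e in basis:
            p = p - inner(p, e) * e        # remove the component along e
        basis.append(p / np.sqrt(inner(p, p)))
    return basis

basis = gram_schmidt(3)

# Project a function onto the basis; x**3 is chosen purely for illustration.
f = Polynomial([0.0, 0.0, 0.0, 1.0])
coeffs = [inner(f, e) for e in basis]
recon = sum((c * e for c, e in zip(coeffs, basis)), Polynomial([0.0]))
print(recon.coef)  # recovers x**3 up to rounding
```

On [-1, 1] with this inner product the result is the Legendre polynomials rescaled to unit norm.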

After this you get a bivariate polynomial that approximates the bi-variate normal and then you can use that to see what the approximate distribution is.

You don't have to use a Normal distribution - but you do need a basis that "makes sense".

You can use expectation results of the new function to approximate the parameters of the distribution you are looking for - whatever it may be.
 
  • #15
Stephen Tashi
Science Advisor
7,571
1,465
You turn it into something that can be projected - like a function.

Interpolation is one way to do this - but it's not the only way.
The data apparently has the form (x,y) and the goal is to fit a density function z = f(x,y). What kind of interpolation would you perform on the data? What would the domain and codomain of the function that interpolates be? Unless you bin the data, there is no z value associated with the (x,y) data.
 
  • #16
chiro
Science Advisor
4,790
132
Piecewise linear would do as a minimum. You could get very creative, but you could create a surface based on piece-wise linear planes that you "glue" together.

Once you have this then you project it on another basis just like I mentioned above.

So for some box in R^2 you define a plane in that box and you parameterize the space so that for some x in [a,b] and y in [c,d] you have two parameters t and u such that you convert your x's and y's into t's and u's for that region and use that to get the point of the plane. Basically for [a,b] X [c,d] you have a plane and you get two vectors on the plane - one with respect to the x axis and one with respect to y and you parameterize so that t and u go from 0 to 1 that span all edges of the plane in that region.
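
The parameterization of one box might be sketched as follows. The heights z00, z10 and z01 at three corners of the box are hypothetical inputs; the thread does not say where they come from, and the function name is made up for the example.

```python
def plane_on_box(x, y, a, b, c, d, z00, z10, z01):
    """Height of a plane over [a,b] X [c,d] given corner heights.

    z00 is the height at (a,c), z10 at (b,c) and z01 at (a,d).
    """
    t = (x - a) / (b - a)  # t runs from 0 to 1 as x goes from a to b
    u = (y - c) / (d - c)  # u runs from 0 to 1 as y goes from c to d
    # The two edge vectors along x and y have height changes
    # (z10 - z00) and (z01 - z00); the plane is their span from (a,c).
    return z00 + t * (z10 - z00) + u * (z01 - z00)

# Example box [0,2] X [1,3] with made-up corner heights.
z_mid = plane_on_box(1.0, 2.0, 0.0, 2.0, 1.0, 3.0, z00=0.1, z10=0.3, z01=0.2)
print(z_mid)
```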

You can get more creative than this but it is the simplest to do for a function of two independent variables.

Projecting between different spaces is a common thing in fields like signal processing, compressing images, videos, and audio and doing pattern matching and data analysis activities.
 
  • #17
Stephen Tashi
Science Advisor
7,571
1,465
Piecewise linear would do as a minimum. You could get very creative, but you could create a surface based on piece-wise linear planes that you "glue" together.
Once you have this then you project it on another basis just like I mentioned above.
So for some box in R^2 you define a plane in that box and you parameterize the space so that for some x in [a,b] and y in [c,d] you have two parameters t and u such that you convert your x's and y's into t's and u's for that region and use that to get the point of the plane.
The problem with that is that there is no given plane in 3-D associated with a box of the (x,y) data, and it isn't clear how to define an appropriate plane without binning the data.
 
  • #18
chiro
Science Advisor
4,790
132
That is correct but you could make the bins small enough to approximate the effect of a continuous space.

It just means that you have a lot of boxes to use. If the bin size is small enough then the departure between a truly continuous distribution and a discrete one will be quite small. As long as you have enough points in a given region (i.e. not sparse) then it should be OK and still faithfully capture the nature of the distribution.
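
For comparison, the binned approach could be sketched like this: estimate the density on a grid with `numpy.histogram2d`, then least-squares fit a plane to the bin centers. The unit-square domain, the bin count, the function name and the test data are all assumptions for the example.

```python
import numpy as np

def fit_plane_to_histogram(x, y, bins=20):
    """Bin samples on [0,1]^2 and fit a1*x + a2*y + a3 to the bin densities."""
    H, xedges, yedges = np.histogram2d(
        x, y, bins=bins, range=[[0.0, 1.0], [0.0, 1.0]], density=True
    )
    xc = 0.5 * (xedges[:-1] + xedges[1:])  # bin centers along x
    yc = 0.5 * (yedges[:-1] + yedges[1:])  # bin centers along y
    # indexing="ij" so that X[i, j], Y[i, j] match H[i, j].
    X, Y = np.meshgrid(xc, yc, indexing="ij")
    A = np.column_stack([X.ravel(), Y.ravel(), np.ones(X.size)])
    coeffs, *_ = np.linalg.lstsq(A, H.ravel(), rcond=None)
    return coeffs

# Test data (hypothetical): x has density 2x on [0,1], y is uniform,
# so the true joint density is the plane p(x,y) = 2x.
rng = np.random.default_rng(3)
x = np.sqrt(rng.uniform(0.0, 1.0, 200000))
y = rng.uniform(0.0, 1.0, 200000)

a1, a2, a3 = fit_plane_to_histogram(x, y, bins=20)
print(a1, a2, a3)  # should be near 2.0, 0.0, 0.0
```

Re-running with a much larger `bins` value on the same data leaves fewer points per bin, which is the sparsity trade-off mentioned above.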

Also notice that when you reconstruct the new function with the basis that is purely continuous and smooth the reconstruction will reflect that. You will have a Normal curve, or a Gamma curve, or any other curve after you project your binned data back to your new Hilbert-space basis so even though the binned data are discrete, your new basis will still stay the way it is but it will lose some resolution in the whole projection process.

The key really is choosing the right orthogonal polynomial basis within the region that you are looking at and understanding the nature of projection between binned data and the final linear combination of basis polynomials.

This idea of taking discrete data (like the binned examples) and then turning it into some "smooth" function is what is commonly done in signal processing - particularly in the areas of data compression (think video, image, audio as examples).
 
