# Fitting an ellipse through 3-d data

Tags:
1. Feb 27, 2015

### SubTachyon

Hey, sorry if this is not the right section to post in, the topic is a bit ambiguous.
I've generated a set of coordinate points for the orbit of Mercury (3d cartesian) and now I want to fit an ellipse through it so I can get an accurate estimate of the location of the perihelion. I am using python (as an excuse to learn the language) and I find this task to be a lot more complicated than I've initially hoped:
So far I only managed to plot the 2d projection of the ellipse onto the xy plane, using least squares fitting (alternatively I can do the same with the other two planes) but I am struggling to come up with a way to take it to the third dimension.
I thought maybe I can fit an ellipsoid, or fit a plane through Mercury's ecliptic and somehow reduce it back to a 2d problem, but quite frankly after a lot of googling and browsing of papers I came to the conclusion that I just don't know how to do either of these things. I need some advice.

Could someone please give me a few pointers?
Thank you.

2. Feb 27, 2015

### phyzguy

I suggest you study the meaning of Orbital Elements. There are 5 degrees of freedom that determine the size (1 variable), shape(1 variable), and orientation in space (three variables) of the ellipse. The sixth orbital element determines where along the orbit the planet is. Try writing the equation for the ellipse in terms of these six variables, then use least squares fitting to determine the best values for these variables. One way to do this is as follows: First build a list of (x,y,z) points on the ellipse (in terms of the elements a and e) assuming that it is in the X-Y plane with the perihelion at y=0. Then rotate these points in 3D space using the elements (i, ω, and Ω) as Euler angles. Then calculate the distance of your known coordinate points from this rotated ellipse and minimize the sum of squares.

3. Feb 27, 2015

### SubTachyon

It's very late and I'm about to go to bed so I'll look at it later but thank you! It looks doable! :)

4. Mar 2, 2015

### SubTachyon

Hello again! So I made a function that generates N number of points on an ellipse that is defined by the input variables a, e, i, ω, and Ω. It works fine.
The problem I have now is having to do squared minimization of a function with 5 variables. How do I do that? Pretty much all the examples I found (or from experience) dealt with simple f(x) relationships and I didn't figure out how to deal with more degrees of freedom. :(

5. Mar 2, 2015

### phyzguy

There are lots of ways to minimize a multi-variate function. If you are using Python, the package scipy.optimize has code which allows you to try a number of different methods. Markov-Chain-Monte-Carlo (MCMC) Is another commonly used method which will (eventually) return not only the optimum values of the parameters, but an estimate of the errors associated with your determination. There is a Python package called PyMC that implements this, but there are many MCMC packages available.

6. Mar 2, 2015

### SubTachyon

I did look at scipy.optimize but primarily at the leastsq function which was probably the wrong place to focus on. I'll look around some more now that I know it's not the only option are out there, starting with the MCMC. Thank you. :)

7. Mar 3, 2015

### SubTachyon

So I looked at the MCMC algorithm and I think I am a little bit over my head. :P I am not very well educated in statistics.

I went back and gave scipy.optimize.minimize another shot, but it's not performing very well with 5 variables. In addition to that my dummy ellipse is not centred with the data to begin with so now I also have to optimize x, y and z coordinates as well bumping up the number of variables to 8.

Right now I am thinking of doing the minimization piece-wise: Meaning first try to minimize the xyz coordinates, then the 5 orbital variables, then perhaps try to repeat the cycle few more times? I am hoping that will yield more accurate results but I am pessimistic about getting accuracy greater than if I just used the closest point as perihelion.

My supervisor tells me I should just give up on the fitting and just use the closest point as my perihelion estimate, but I spent too much time on this to give up just yet. I downloaded emcee package and I looked at couple of the examples they provide. Is MCMC likely to do better than scipy's minimize?

Sorry if I am getting annoying with this.
Cheers

8. Mar 7, 2015

### SubTachyon

This is my code at the moment and it produces something like this. Red ellipse is the fit, blue ellipse is the data.
The minimization seems to do a half-decent job at getting the orbital parameters correct but it fails to move the ellipse around properly, even when my initial guess values coincide with those that I used to create the data with in the first place. :S

9. Mar 7, 2015

### Stephen Tashi

From your code, I don't understand what function you are minimizing. If we have an elliptical curve in 3 D, the usual definition of the "square error" between the curve and set of data points would the sum of the squares of the distances between each data point and the curve - which would imply finding the shortest distance between each data point (x,y,z) and the curve. Offhand, finding the shortest distance between a point (x,y,z) and a given 3-D elliptical curve seems to be a non-trivial problem. Do you have a function that implements such an algorithm?

10. Mar 7, 2015

### phyzguy

Stephen Tashi is right, this is what is missing from your code. However, you don't need an analytic method to find the shortest distance between each data point and your trial ellipse. Since your trial ellipse is a list of points, for each data point you can just calculate the distance to each of the points on your trial ellipse and choose the smallest value. Brute force, but it should work. I would start with fewer points in your trial ellipse (say 100?) until you get it to work. You can then increase the number of points to improve the accuracy. How many data points do you have?

11. Mar 8, 2015

### SubTachyon

First of all thank you for taking your time and looking at the code. :)
The function I am minimizing is deviation - line 63. The relevant part probably being:
Code (Text):
...
for j in range(len(data[:,0])):
deviations[j] = np.sqrt((data[j,0]-dummy_ellipse[j,0])**2 + (data[j,1]-dummy_ellipse[j,1])**2 + (data[j,2]-dummy_ellipse[j,2])**2)
total_deviation = sum(deviations)
i.e.: deviationsi = $\sqrt{(x_i - x^,_i)^2+(y_i - y^,_i)^2+(z_i - z^,_i)^2}$
where $x_i$ is the x component of $i$th data point and $x^,_i$ is the x component of $i$th generated point lying on the trial ellipse. I then sum for all deviations to get a total deviation for all the points in a given trial ellipse.
The actual orbital data I generate has density of anywhere between ~80 to ~8000 points per Mercurys orbit, depends on what stepsize I choose. The code I provided actually doesn't use this data at all and instead just generates ~8000 phony data points using the ellipse function so the fitting should be even simpler.

12. Mar 8, 2015

Staff Emeritus
That's not going to work. Your function is assuming that the closest point to point 62 in the data is point 62 on the trial ellipse. That's not the case. If you don't see this immediately, think about it this way - if you entered the same data points in a different order, wouldn't you expect the same fit parameters?

Next, multivariable fitting is tricky. You want to get a test solution that is as close as possible to the real solution. I would recommend working in a coordinate system where x and y are in the orbital plane and z is perpendicular to it, or as close to that system as you can easily get. That means z will be small - zero if you get it exactly right. (phyzguy suggested this) This means when you fit for orbital elements, the inclination in this system will be small. So now instead of 6 parameters to fit for, you only need 5 to start. Your professor's argument that the perihelion is close to the measured point with the smallest mercury-sun difference lest you reduce this to 4. If the inclination is small, the longitude of the ascending node is poorly determined, so you can put any value you like in, and now it's down to 3. If you look at just the shapes and ignore the time, you're down to 2. Fit the two and then start adding terms back in.

13. Mar 8, 2015

### SubTachyon

I made a small modification to the code so now instead of using deviation of point $p_i$ from $p^,_i$ I use deviation of $p_i$ from $p_c$ where $p_c$ is the closest point to $p_i$. Is that what you meant?
I've noticed no improved accuracy of this fit after this change, in fact I didn't observe any difference at all. (except that its more computationally demanding)

I understand that data point 62 won't necessarily be closest to trial ellipse point 62 to start with but I don't see the problem this is supposed to present. Whatever the starting conditions the goal of the minimization is that points 62 should in the end be the two points closest to each other.
Alternatively I could use the method I've just described above, with identical results.

I like the sound of the 'piecewise fitting' approach but I don't know how to execute it without complicating my life even further.
The idea is to construct a plane through my data to serve as the new xy plane of the Cartesian, right? But how do I construct the plane in the first place? Doesn't that require having to minimize the (dx,dy,dz) variables? And additional three angle variables? Just to get the plane to start with?

14. Mar 8, 2015

### Stephen Tashi

One would think this problem would have been solved by astronomers years ago. Yet I haven't found anything about it on the web.

That could be true in special cases - for example, if there is an equal arc length on the elliptical curve between the adjacent points on the curve and roughly an equal arc length between adjacent data points as measured on the underlying "true" elliptical path. That assumption might work well for synthetically generated data where you know there is some "correspondence" between the data points and a set of points you pick on the curve to do the fit. However, that method doesn't accomplish least squares fit in the general case. I agree that if you have a set of closely spaced "test points" on the curve that you can use the distance between the data point and the closest test point to approximate the actual minimum distance between the data point and the curve.

I think you need 8 parameters for the curve fit. For example, 6 parameters to describe the two focii of the ellipse, one parameter to describe the orientation of the plane containing the line between the focii, and one parameter to describe the size of the ellipse. So I don't think the "orbital elements" are a sufficient number of parameters.

15. Mar 8, 2015

### phyzguy

SubTachyon,

Can you post your code after you have made the modification to find the closest point on the trial ellipse? Also, if you can post the data points that you are trying to fit, that would help.

16. Mar 9, 2015

### SubTachyon

Sure, here.
Right now the code generates its own data, to make things easier. But if you want to plot to actual data comment out line 61 and uncomment lines 58 & 59. I uploaded two sets of data: ~1 year of Mercury (88 points) and ~1 year of Mercury (8751 points).

17. Mar 14, 2015