# How useful would an equation finder be?

Say you have (x, y) pairs from somewhere (maybe an experiment). You plug them into a program, and out pops a simple equation that best models the pairs (e.g. y = e^x), with its parameters curve-fitted as well.

The reason I'm asking is that I made such a program, and I want to know whether such a thing would be useful to others.

Nice work, but I wonder how it compares to the trendline options in MS Excel.

OK, I've taken a look at Excel's trendline. With a trendline you choose the form yourself, from options such as log, mx + b, or polynomial. The most noticeable difference in what I have created is that the program picks what it thinks the best functional form is (e^x, x, x^k, etc.) and curve-fits it. Lastly, I don't offer an n-degree polynomial, because you can fit pretty much anything with an n-degree polynomial.

My goal is to come up with a function that uses the fewest free parameters while minimizing the error for a specific data set of (x, y) pairs.
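One common way to make "fewest free parameters, lowest error" concrete is a penalized score such as the Akaike information criterion (AIC). The sketch below is only an illustration of that idea and is not the method the program in this thread uses; the candidate list and the function names (`fit_constant`, `fit_proportional`, `fit_line`, `select_model`) are assumptions made up for the example.

```python
import math

def fit_constant(xs, ys):
    # y = c, one free parameter
    c = sum(ys) / len(ys)
    return (c,), lambda x: c

def fit_proportional(xs, ys):
    # y = a*x, one free parameter (least squares through the origin)
    a = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return (a,), lambda x: a * x

def fit_line(xs, ys):
    # y = a*x + b, two free parameters (ordinary least squares)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return (a, b), lambda x: a * x + b

def aic(xs, ys, predict, k):
    # AIC for least squares with Gaussian errors, up to an additive constant:
    # n*ln(RSS/n) + 2k. Lower is better; the 2k term penalizes extra parameters.
    n = len(xs)
    rss = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    rss = max(rss, 1e-300)  # guard against log(0) on a perfect fit
    return n * math.log(rss / n) + 2 * k

def select_model(xs, ys):
    # Hypothetical candidate list; a real tool would have many more forms.
    candidates = {
        "c": fit_constant,
        "a*x": fit_proportional,
        "a*x + b": fit_line,
    }
    best = None
    for name, fit in candidates.items():
        params, predict = fit(xs, ys)
        score = aic(xs, ys, predict, len(params))
        if best is None or score < best[0]:
            best = (score, name, params)
    return best[1], best[2]
```

With noisy data that is roughly y = 2x, the one-parameter form should win even though the two-parameter line has a slightly smaller residual. AIC is only a rough guide for very small data sets, so treat the ranking as a hint rather than a verdict.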

I got really hyped up about how "it would be so cool if a program could just tell me the equation that models these points", so hyped that I wrote it. I spent time working out equations for predicting the form of the equation, then fitting it, and so on.

Right now I have it working only for monotonic functions with positive slope everywhere, for points with x >= 0 and y >= 0. I have already worked out (on paper, not implemented yet) how to lift those restrictions, and some of the analysis I have worked out runs in log(n) time. I have a very basic thing working, but I don't know how useful it is or could be.

Pythagorean
Gold Member
Matlab has a built-in function fitter too. I don't use it post-graduation, but I used it a lot for labs as an undergrad.

I do my data fitting in Origin, which provides built-in fits (or you can write your own; I used it to fit a nonlinear surface-optical property that had 3-9 complex tensor elements, i.e. 3-9 numbers with "phases", or 6-18 different parameters). In this program (like many others) you can also weight the points by their data error, which is useful. In my data (reflected intensity and polarization for a given incident polarization), some regions were pretty weak and had high error despite longer data collection time, while some high-intensity data were very accurate even with just a few shots of the pulsed laser.
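As a minimal illustration of weighting points by their data error, here is a weighted least-squares fit of a straight line, where each point gets weight 1/sigma^2. This is a generic textbook sketch, not how Origin implements it; the function name `weighted_line_fit` is made up for the example.

```python
def weighted_line_fit(xs, ys, sigmas):
    """Fit y = slope*x + intercept, weighting each point by 1/sigma**2.

    sigmas are the (assumed known) one-standard-deviation errors on each y.
    """
    ws = [1.0 / (s * s) for s in sigmas]
    S = sum(ws)
    Sx = sum(w * x for w, x in zip(ws, xs))
    Sy = sum(w * y for w, y in zip(ws, ys))
    Sxx = sum(w * x * x for w, x in zip(ws, xs))
    Sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    delta = S * Sxx - Sx * Sx  # zero only if all x values coincide
    slope = (S * Sxy - Sx * Sy) / delta
    intercept = (Sxx * Sy - Sx * Sxy) / delta
    return slope, intercept

# Made-up numbers: the last point is far off the trend but has a huge error bar,
# so it barely moves the fit.
slope, intercept = weighted_line_fit(
    [0, 1, 2, 3], [1, 3, 5, 20], [0.1, 0.1, 0.1, 100.0]
)
```

With equal sigmas this reduces to ordinary least squares; the weights only matter when the error bars differ from point to point.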

Generally people use commercial software (of various levels -- Origin is pretty high, as is "Igor" -- in my opinion Excel doesn't match these), write their own (as you've done) or use some combination (i.e. Origin and Igor allow more complex user-defined fits, in my opinion in a better way than Excel).

If you're looking to commercialize this, it would be hard to compete. If you're looking to provide it as freeware or for online use, I still think it'd be hard to compete. Still, it's good that you're getting the experience.

Pythagorean
Gold Member

Jonathan Scott
Gold Member

Sorry - I'm a subscriber and I didn't spot that it had automatically signed me in to view that.

The article implies it works like genetic algorithms. It makes lots of guesses, and keeps the ones that work best and combines or refines them.

you could probably use a little of this as a generic way to converge on solutions:
http://en.wikipedia.org/wiki/Calculus_of_variations

you still need to explore your search space a bit as the number of parameters goes up, lest you get stuck in local minima. or try simulated annealing: http://en.wikipedia.org/wiki/Simulated_annealing. it's been a long time since i looked at it, but from what i remember, the basic concept of how to do it is a lot easier than the hieroglyphs.

optimization is a pretty rich topic, tho. you may not even want to use a generic algorithm. as for genetic, it seems more of a trendy thing to me. but look: http://en.wikipedia.org/wiki/Genetic_algorithms
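The basic concept of simulated annealing mentioned above really is short. Here is a minimal sketch that fits y = a*exp(b*x) by minimizing the sum of squared errors; the linear cooling schedule, the step size, and the names `sse` and `anneal` are all assumptions chosen for illustration, not tuned values.

```python
import math
import random

def sse(params, xs, ys):
    # Sum of squared errors for the model y = a*exp(b*x)
    a, b = params
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

def anneal(xs, ys, start, steps=20000, t0=1.0, step_size=0.1, seed=0):
    rng = random.Random(seed)  # seeded so runs are repeatable
    current = list(start)
    current_cost = sse(current, xs, ys)
    best, best_cost = list(current), current_cost
    for i in range(steps):
        # Temperature falls linearly from t0 to (almost) zero
        temperature = t0 * (1.0 - i / steps) + 1e-9
        # Propose a random nearby parameter vector
        candidate = [p + rng.gauss(0.0, step_size) for p in current]
        cost = sse(candidate, xs, ys)
        delta = cost - current_cost
        # Always accept improvements; accept worse moves with probability
        # exp(-delta/T), which is what lets the search escape local minima
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = list(candidate), cost
    return best, best_cost

# Made-up test data generated from a = 2, b = 0.3
xs = [0, 1, 2, 3]
ys = [2 * math.exp(0.3 * x) for x in xs]
best, best_cost = anneal(xs, ys, start=[1.0, 0.0])
```

The returned cost is never worse than the starting point's, since the best parameters seen so far are tracked separately from the current state. For a smooth two-parameter problem like this a gradient-based fitter would be faster; annealing earns its keep when the error surface has many local minima.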

Pythagorean
Gold Member
I've used a genetic algorithm with haploid crossover to tune the parameters of a neural model so that it matched experimental results. It worked out pretty well for the short time I spent on it (class project). I could match the variation in the experiment with spike time prediction and even the noise levels had the same shape!

I made a function fitter on my TI-84. I just made a program where it computes all the regression equations for the given coordinates and shows the function with the greatest r. Not too hard.
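The "compute every regression and keep the one with the greatest r" approach can be sketched in a few lines. This is a guess at the idea rather than the actual TI-84 program: it assumes all x and y are positive (needed for the logarithms) and that neither column is constant, and it linearizes each form so that a plain linear regression applies. The names `pearson_fit` and `best_regression` are made up.

```python
import math

def pearson_fit(us, vs):
    # Ordinary linear regression of v on u, plus the correlation coefficient r
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suu = sum((u - mu) ** 2 for u in us)
    svv = sum((v - mv) ** 2 for v in vs)
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    slope = suv / suu
    intercept = mv - slope * mu
    r = suv / math.sqrt(suu * svv)
    return slope, intercept, r

def best_regression(xs, ys):
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    # Each form becomes a straight line in transformed coordinates
    forms = {
        "linear: y = a*x + b": (xs, ys),
        "exponential: y = a*exp(b*x)": (xs, ly),   # ln y = ln a + b*x
        "power: y = a*x**b": (lx, ly),             # ln y = ln a + b*ln x
        "logarithmic: y = a*ln(x) + b": (lx, ys),
    }
    best = None
    for name, (us, vs) in forms.items():
        slope, intercept, r = pearson_fit(us, vs)
        if best is None or abs(r) > abs(best[3]):
            best = (name, slope, intercept, r)
    # slope and intercept are in the transformed coordinates of the winning form
    return best
```

One caveat: r is computed in a different transformed space for each form, so comparing the values directly is a heuristic rather than a rigorous model comparison.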

Are you going to limit the scope to functions that are linear in the free variables, or include nonlinear ones such as:

http://www.itl.nist.gov/div898/strd/nls/nls_main.shtml

Thurber and Eckerle4 are interesting ones....

I've been looking for dataset and equation pairs to test my stuff on. Thank you. I'm happy to see my program is not bad at the easy level (the monotonic ones). I don't get the same formula, but I get a close approximation. Here's an example of what my program generates:

For Hahn1: 3.549411*x**0.259436 (where ** is power)
For Chwirut2: 79.030534*exp(-0.505137*x)
For Thurber: 336.677542*x + 1073.771926
For Eckerle4: 0.000177*x + -0.003680

As you can see, I already handle functions that are nonlinear in the free variables.

physics girl phd said:
If you're looking to commercialize this, it would be hard to compete. If you're looking to provide it as freeware or for online use, I still think it'd be hard to compete. Still, it's good that you're getting the experience.

You're right. I checked out Origin; they have a really nice product going, and I see no small step I could take to potentially commercialize this. Even if the program can guess the form of the equation (I thought of ways to do it without brute-forcing through a list, BTW), it doesn't seem useful enough, because for a human, guessing the form is not such a big deal.

thing is, the hard work on curve fitting has probably already been done for you. i think Matlab based a lot of its routines off of public domain algorithms put together by people working on government contracts and such. and you can get free similar packages like Scilab from INRIA. here's a free optim routine (free to use, not copy). http://help.scilab.org/docs/5.3.0/en_US/optim.html

from what i remember, matlab used to document the math packages they used in the m files. scilab may do the same. i think scilab is at least open-source, though copyrighted.

Personally, despite the competition, I think this will be more useful than something like Origin, because the learning curve is gentler. There are lots of people doing research, and also new people coming into the research field, and they all need to graph and analyse their results in some way. With Matlab (and alternatives) it's not as simple as copy and paste the xy pairs, click a button, and voilà, you have a graph. I believe so strongly that what I have is more useful due to its simplicity that I made a site, which not only shows the equation but graphs it too. I think something like this could change how research is done, by having computers come up with equations and analyse data in new ways. Wolfram Alpha is really smart; it can already solve plenty of different types of equations. I think one day we will have something that's just as smart, but that will tell you cool, non-obvious things about your research data.

This way people doing research can look at a graph a lot sooner than by issuing the plot command in Matlab or an alternative. It's simple, and transformations could easily be done online. When I was doing research I used the "plot" command a lot. I'm an experienced programmer, so after I got my data (finding faces) I had a script that drove the entire cycle (showing a progress bar, opening up the graph pictures, etc.). But not everyone around me built such a program for their research.

Most people here posted algorithms and ways to do things. Thank you. I looked into each one; my favourite idea is using heuristic analysis from genetic algorithms. Right now I've made the online site for myself, so I can graph my own data, because it's faster than issuing the plot command. But suppose I show you an online utility where you can literally copy and paste your xy pairs (example xy pairs format at bottom), click "go", and it will plot them and try to predict a curve-fitted equation for you, and show other statistics/analysis. With this simplicity, would you care, find it useful, and use it?