Is there a way to create equations from numbers?

  • Thread starter edmondng
  • Start date
  • Tags
    Numbers
In summary, if you have a graph that you want to fit a line to, you can use the regression analysis tool in Excel. This tool uses the "least squares" method to fit a line through a set of observations. You can apportion shares in the performance measure to each of these three factors, based on a set of performance data, and then use the results to predict the performance of a new, untested athlete. However, the regression analysis tool is sensitive to outliers, and may not provide a "best fit" line equation.
  • #1
edmondng
159
0
Like the topic says, is there some software or method you can use (maybe excel?) to create equations?
For example if i knew my graph is going to be a straight line, and i had a few points, i can find the slope, intersection and get my equation.

Then if i knew my graph is a 2nd order and i had a few points like x=2, y=4, x=3 y=8 etc then i can probably guess it is y=x^2

But what if my graph looks like a 2nd order, and the numbers are all over like 2.111, 7.555 etc and how do you get a best fit equation to describe your x and y equation. because the answer maybe y=ax^2 + bx + c and with 10 valid points it may take a long time to get that best fit line equation.

Thanks
 
Physics news on Phys.org
  • #2
In Excel look for the Analysis ToolPak menu. Then in the dialog box look for Regression.

From Excel Help:
The Analysis ToolPak includes the tools described below. To access these tools, click Data Analysis in the Analysis group on the Data tab. If the Data Analysis command is not available, you need to load the Analysis ToolPak add-in program.

...

Regression

The Regression analysis tool performs linear regression analysis by using the "least squares" method to fit a line through a set of observations. You can analyze how a single dependent variable is affected by the values of one or more independent variables. For example, you can analyze how an athlete's performance is affected by such factors as age, height, and weight. You can apportion shares in the performance measure to each of these three factors, based on a set of performance data, and then use the results to predict the performance of a new, untested athlete.

The Regression tool uses the worksheet function LINEST.
 
  • #3
There are a lot of ways to make functions from numbers- that's the problem! Given any finite number of x,y values there exist an infinite number of functions that give those values.

If you are certain the points lie on a parabola, with vertical axis, you can put 3 xy-values into y= Ax^2+ Bx+ C. If the axis is not vertical you might have to use the general conic: Ax^2+ Bxy+ Cy^2+ Dx+ Ey+ F= 0. Since there are 6 coefficients to determine, you would need to use 6 points which will give you 6 equations in 6 unknowns.

Of course, if the points are not really on the parabola, you might want to use a "least squares" method to get the line or parabola that goes closest to those points.
 
  • #4
The "least squares" method is certainly the most widely used parameter estimation technique. What this technique does is find the coefficients (a,b,c in ax^2 +bx +c, for example) that minimize the sum of the squares of the errors (SSE) between the computed and observed values. The SSE is quadratic in the coefficients, and thus the partial derivatives of the SSE with respect to the coefficients is linear in the coefficients. The end result is a simple linear algebra problem that is easy to compute, easy to understand, and easy to implement.

The technique is also very sensitive to outliers. The least squares approach implicitly assumes the data are normally distributed. Outliers are data values that are, for example, ten, twenty, or more standard deviations from the mean assuming a normal distribution. If the data were normally distributed, such errors would never be seen. In reality, data often are not even close to normally distributed. Moreover, sensors and people sometimes report measurements that are, simply put, bad. Because it uses the square of the error, those twenty sigma outliears will throw the least squares solution off track.

Another problem is that the least squares technique does not yield what some consider to be the "best fit". It is essential minimizing the average square error. In many cases it is better to minimize the worst-case error. Doing this let's one state a guaranteed performance. All that least-squares approach provides is some kind of fuzzy, average error. YMMV.

Because of its simplicity, least squares is the parameter estimation technique of choice for many, including myself. One simple way to deal with outliers is find an initial least squares fit, identify the outliers(typically a three or six sigma error threshold), and find another least squares fit with the outliers discarded.
 
  • #5
D H said:
The least squares approach implicitly assumes the data are normally distributed.
Not true. LS does not make distributional assumptions -- the results are unbiased and efficient regardless of the distribution. Normality enters the picture in hypothesis testing, i.e. when ascertaining the level of statistical significance of the estimated parameters.
 
  • #6
ok I'm using excel and here's what i found.
if i plotted my graph, i can right click and add 'trendline'. I can fit and predict future data and display the equation.

Is there away to display the equation in the excel box (not in the chart) since i have like hundreds of data (maybe just the coefficient constant will do). Also, the equation given for example y = 2.1x^2 + 3.1x + 500, returns a valid or close value of y only if x = 1,2,3,4,5 etc

In other words, my x-axis in the graph is not 1,2,3,4,5; my coordinate can be (20,100), (55,350), (100,700)...etc
But the equation returns the close enough y only when substituting x = 1,2,3 rather than 20,55,100. Why is that?

Thanks for help
 
  • #7
edmondng said:
ok I'm using excel and here's what i found.
if i plotted my graph, i can right click and add 'trendline'. I can fit and predict future data and display the equation.

Is there away to display the equation in the excel box (not in the chart) since i have like hundreds of data (maybe just the coefficient constant will do). Also, the equation given for example y = 2.1x^2 + 3.1x + 500, returns a valid or close value of y only if x = 1,2,3,4,5 etc
Refer to my previous post.
In other words, my x-axis in the graph is not 1,2,3,4,5; my coordinate can be (20,100), (55,350), (100,700)...etc
But the equation returns the close enough y only when substituting x = 1,2,3 rather than 20,55,100. Why is that?

Thanks for help
Are your x values bunched up in the "small" units (say 1 through 10)? Also, what do you mean by "a valid or close value of y"?
 
  • #8
found some answers anyway. apparently the 'add trendline' works best if the chart type is "xy scattered" rather than "line". If your chart type is "line", then x = 1,2,3,4 etc. In other words, even if your answer is (10,100), (20,200) etc, the equation will generate based on (1,100),(2,200) etc. So need to change to "xy scattered" type. That was what i meant when i substitute 1,2,3 if its close but not my actual value.

But anyway, excel is buggy in its polynomial equations. for example; it plotted a 'best fit line' and came up with the coefficients and equation.
Substitute x and the y is not even close to the equation. Plug the equation into graphing calculator and still wayyy far off.

This is even after i followed some instruction in MSoft where it says to increase the decimal placing (from say 5 to maybe 30 so be more accurate from the roundoffs).

On another note, i have all the equations and coefficients, is there a way to have excel display values rather than formula? I mean it shows the values but when you click on the box it shows the formula like " = xyLin(A1:A5) etc etc"
I guess i just want to see the number, rather than formula. Yes i use formula to calculate the coefficients but maybe there is a button to turn off formula?

Thanks everyone
 
  • #9
Substitute x and the y is not even close to the equation.
Can you give an example?
is there a way to have excel display values rather than formula
Yes, if you follow the instructions on post #2 in this thread.
 
  • #10
example?

Lets say the numbers i gave was (10,100),(20,200),(30,300),(40,400)
excel would give an equation of y = 100x because it is using (1,100),(2,200),(3,300),(4,400). Unless you change your chart type to "xy scattered", then it will display the correct equation which is y=10x
edit: i was using chart trendline formula
http://support.microsoft.com/kb/211967

I did not use regression because i have tons of data, like thousands plus I'm looking at polynomial. So i had my equation generated and predict future values using array of data. from these equation, i have my coefficients. Is there away for me to get these values to import into my program.

Its like you putting formula =A1+A2
Function shows =A1+A2 but your box shows the value.
I just want to show the value cause my program only sees the string of function not the value.
Of course can manually copy to the next box and import that but with thousands of data isn't really efficient
 
Last edited:
  • #11
I was responding to your
Substitute x and the y is not even close to the equation.
Can you give an example of y being "not even close to the equation"?

For displaying data, (I repeat) follow the instructions in post #2.
 
  • #12
is there a way to post picture or attached? you can take a look at the equation and plot your own and compare to the one given by excel.

as for regression, can you elaborate more? i have thousands of data, independent of each other. right now i just use the trendline formula, to get the values for each of them.

http://img251.imageshack.us/img251/1782/excelkl2.jpg
 
Last edited by a moderator:
  • #13
I see the picture, what do you mean by "y is not even close"?

When x = 12,500 the formula returns y = 25.4195169452. Is this "too far" from the actual y?

In Excel, go to the menu bar (starting with File, then Edit, View, etc.) Find Tools on the menu bar. Click on it. Then look for Data Analysis. In the dialog box, select regression. If you don't see Data Analysis then click on Add-Ins, in the dialog box, check Analysis ToolPak.

My previous post (#2) applies to Excel 2007. The paragraph immediately above applies to previous versions of Excel.
 
Last edited:
  • #14
my data --> (12512,19.50), so i guess (12500,25.41) is a bit off.

my bad initially when i calculate i did like 4 decimal placing and can tell it was really far off. i know i did up to 15 decimal later on few times and still got it to be off but maybe my calculations was off or typo on the calculator.

I did install the add on regression. Mine is excel 2003. It generates a box filled with data, I see the intercept at lower 95%, upper 95%, and coefficient intercept and xCoeff. I suppose y = xCoeff*X + Intercept.

Does this do polynomial (like 2nd order at least). At least with trendline it gives me the coefficient by using the formula and also let's me see the future prediction graph.
 
  • #15
There is no restriction on variables or the functional forms long as you can represent each variable in a separate column (and the columns must be together on a page). For a quadratic, you need to put X and X^2 on two columns next to each other.
 
  • #16
is coefficient for :-

y = (xVariable1)*x + (xVariable2)*x^2 + intercept
X variable 1 = x
X variable 2 = x^2
correlation coefficient = sqrt(R^2)

Thanks
 
  • #17
Correct; except the correlation coefficient is not a simple "y vs. x" correlation.
 

1. How do you create equations from numbers?

To create equations from numbers, you can use basic mathematical operations such as addition, subtraction, multiplication, and division. For more complex equations, you can use algebraic expressions and follow the rules of order of operations.

2. Can equations be created for any set of numbers?

Yes, equations can be created for any set of numbers. However, the complexity of the equation may vary depending on the numbers and operations involved.

3. Are there any specific rules to follow when creating equations from numbers?

Yes, there are specific rules to follow when creating equations from numbers. The most important rule is the order of operations, which states that you must perform operations in a specific order: parentheses, exponents, multiplication and division (from left to right), and addition and subtraction (from left to right).

4. Can equations be used to solve real-world problems?

Yes, equations can be used to solve real-world problems. In fact, many real-world situations can be represented and solved using mathematical equations. This is why equations are such an important tool in science and other fields.

5. Is there a limit to the complexity of equations that can be created from numbers?

There is no specific limit to the complexity of equations that can be created from numbers. However, as the complexity increases, the equations may become more difficult to solve and require advanced mathematical knowledge and techniques.

Similar threads

Replies
3
Views
1K
Replies
20
Views
2K
Replies
2
Views
2K
Replies
36
Views
4K
  • Calculus
Replies
16
Views
2K
Replies
5
Views
1K
  • Calculus
Replies
0
Views
1K
Replies
3
Views
324
  • General Math
Replies
3
Views
1K
Back
Top