1. Feb 27, 2010

Hi, I have three questions about the application of quadratic approximation: what it is and when to use it. It ties in with a question about linear approximation too. I'll give an example first of what I'm talking about, just so you can evaluate whether I'm wrong in the way I see the whole process; I just worry and want to get the logic right, you know?

So, the formula for linear approximation is $$f(x) \approx f'(x_0)(x - x_0) + f(x_0)$$. I see a relationship to the slope formula, $$m = \frac{y_2 - y_1}{x_2 - x_1}$$ (via algebra you obtain the linear approximation, and in the limit the slope becomes the derivative, which is logical), and this explains to me why the linear approximation approximates a function near a certain point but gets less accurate as you go further and further from $$x_0$$.

Also, from my understanding of this concept, it is a method to approximate a function near a point. Say you want to approximate $$\sqrt{9.1}$$: you follow the above framework, setting $$x_0 = 9$$ (a nearby point where the value is known).
$$f(x) = \sqrt{x}$$
$$f'(x) = \frac{1}{2\sqrt{x}}$$
$$f(x) = f'(x_0)(x - x_0) + f(x_0)$$
$$f(x) = \frac{1}{2\sqrt{9}}(x - 9) + \sqrt{9}$$
$$f(x) = \frac{1}{6}(x - 9) + 3$$
$$f(x) = \frac{1}{6}x - \frac{9}{6} + \frac{18}{6}$$
$$f(x) = \frac{1}{6}x + \frac{9}{6}$$
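As a quick numerical sanity check (my own little Python sketch, not part of the derivation), the linear approximation above can be compared against the calculator value:

```python
import math

def linear_approx_sqrt(x, x0=9.0):
    """Linear approximation of sqrt(x) about x0: f(x0) + f'(x0)(x - x0)."""
    f0 = math.sqrt(x0)                  # f(x0) = 3
    fp0 = 1.0 / (2.0 * math.sqrt(x0))   # f'(x0) = 1/6
    return f0 + fp0 * (x - x0)

approx = linear_approx_sqrt(9.1)  # 3 + 0.1/6 = 3.01666...
exact = math.sqrt(9.1)            # about 3.01662...
print(approx, exact, abs(approx - exact))
```

The error is only a few parts in a hundred thousand, and since sqrt is concave, the tangent line slightly overestimates.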

1. The calculator matches the answer here very closely and all is good. How would quadratic approximation fit in here? Does quadratic approximation have any relevance to this equation at all? Am I applying it where it doesn't have any meaning? I only heard about it yesterday, and my intuition tells me that it's just a means of getting a more exact answer; is this correct?

2. Where does the derivation of the quadratic approximation come from? The way I understand it, the linear approximation comes from the slope formula and the point-slope formula, but I see no way a quadratic fits in.

3. What is the deal with the formula $$(1 + x)^{r} \approx 1 + rx$$? Using this type of formulation, a teacher in an MIT lecture finds a linear approximation, then gets quadratic factors in the answer, and then tells the class to throw them away. http://www.youtube.com/watch?v=BSAA0akmPEU&feature=SeriesPlayList&p=590CCC2BC5AF3BC1 (27th minute). What I wonder is why use this when the normal linear approximation works fine; why memorize unnecessary baggage?
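For what it's worth, here is a small sketch (my own, using $$r = 1/2$$ as an example) of how accurate $$(1 + x)^{r} \approx 1 + rx$$ is for small x:

```python
# Binomial approximation (1 + x)^r ≈ 1 + rx for small x, with r = 1/2.
for x in (0.1, 0.01, 0.001):
    exact = (1 + x) ** 0.5
    approx = 1 + 0.5 * x
    print(f"x={x}: exact={exact:.8f} approx={approx:.8f} error={exact - approx:.2e}")
```

Shrinking x by a factor of 10 shrinks the error by roughly a factor of 100, because the first discarded term is proportional to x squared.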

2. Feb 27, 2010

### CompuChip

So the idea of a linear approximation is to describe a function f in the neighborhood of some point a by a linear function. In terms of the graph of f, we want to find a straight line such that, in the vicinity of a, we can approximate the function by the line. If you think about this for a moment, you will realize that we are talking about the tangent line here. Thus you quickly arrive at the equation
y = f(a) + f'(a) (x - a).
Staring at this, you may realize that f'(a)(x - a) is simply a straight line with the same slope as f has at a, shifted horizontally (x -> x - a) and vertically (y -> y + f(a)) so that it goes through the point (a, f(a)). You can also look at it this way: the general equation for a line is y = c x + d, and we matched the value and slope of the line at x = a to the value and slope of the function at x = a, thus determining c and d.

This is the first-order (or linear) approximation to the function f, and depending on how wild f is near a, there is some region around a in which we are satisfied with this approximation. So we are going to make a better one, by adding a "correction" to the linear approximation. This will be a quadratic polynomial, of the form
y = c x^2 + d x + e,
where we now have three constants to fix. This means that besides the value and slope of y, we can also match the second derivative of y to be the second derivative of f at a, thereby increasing the region around a for which it is a good approximation.
If instead, you write it as
y = f(a) + f'(a)(x - a) + C(x - a)^2
it already looks more like a correction to the y = f(a) + f'(a)(x - a) we had before. Note that when x is very close to a, (x - a)^2 is very much smaller than (x - a) (suppose x - a = 0.001, then (x - a)^2 is 0.000001). So very close to a, the correction will be small and we are still mainly using the straight line we had. However, we can fix the value of C so that, also farther away, when the quadratic term starts to contribute, y stays a good approximation to f. How we get C is a little technical, but when you do the math you will find that the best possible approximation is obtained when we set C = f''(a) / 2. Basically what you are saying is that we want a function whose slope not only matches the slope of f, but which also has the same convexity property (for example, if f is concave or convex at a, we also want the approximation y to be concave or convex there, respectively).
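To make this concrete with the square-root example from the first post (a small sketch of my own), compare the linear and quadratic approximations of sqrt about a = 9:

```python
import math

# Quadratic approximation of f(x) = sqrt(x) about a = 9:
#   f(a) + f'(a)(x - a) + (f''(a)/2)(x - a)^2
a = 9.0
f = math.sqrt(a)               # f(9)  = 3
fp = 1 / (2 * math.sqrt(a))    # f'(9) = 1/6
fpp = -1 / (4 * a ** 1.5)      # f''(9) = -1/108

x = 9.1
linear = f + fp * (x - a)
quadratic = linear + (fpp / 2) * (x - a) ** 2
exact = math.sqrt(x)
print(abs(linear - exact), abs(quadratic - exact))
```

The quadratic term (here C = f''(9)/2 = -1/216) nudges the tangent line down, because sqrt is concave, and cuts the error by a couple of orders of magnitude.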

The quadratic approximation is not just useful when you want to do better than linear; sometimes you even need it. Suppose, for example, that you have a function with a local minimum or maximum at x = a (like cos(x) at x = 0). In that case f'(a) = 0, and the linear approximation is just the horizontal line y = f(a). The first useful approximation is then a quadratic one. (This is very common in physics, where minima and maxima are important.)
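For instance (my own quick check), near x = 0 the quadratic approximation of cosine is cos(x) ≈ 1 - x^2/2, while the linear approximation is just the constant 1:

```python
import math

# At x = 0, cos has f'(0) = 0, so the linear approximation is the flat line y = 1.
# The first useful approximation is quadratic: cos(x) ≈ 1 + (f''(0)/2) x^2 = 1 - x^2/2.
for x in (0.1, 0.2, 0.5):
    print(x, math.cos(x), 1 - x ** 2 / 2)
```

The quadratic captures the downward curvature at the maximum, which the horizontal line misses entirely.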

Of course, we can repeat this trick and also include terms cubic in the deviation from a, quartic, and even higher. This means you can write
$$f(x) = T_n(x, a) + E_n(x, a)$$
where $$T_n$$ is the nth-order approximation
$$T_n(x, a) = f(a) + f'(a)(x - a) + \frac{1}{2} f''(a) (x - a)^2 + \frac{1}{6} f'''(a) (x - a)^3 + \frac{1}{24} f^{(4)}(a) (x - a)^4 + \cdots + \frac{1}{n!} f^{(n)}(a) (x - a)^n$$
with the $$f^{(k)}(a)$$ notation referring to the kth derivative of f at x = a.

The function $$E_n(x, a)$$ is the error, defined as the difference between the exact value of f(x) and the polynomial approximation $$T_n(x, a)$$. The idea is that if f is a "nice" function (and most functions we encounter in our daily mathematics are "nice"), then $$E_n$$ becomes smaller and smaller as we increase the order n of the polynomial, and in fact
$$f(x) = \lim_{n \to \infty} T_n(x, a).$$

So we have a set of polynomials of ever-increasing order, which approximate the function f(x) around x = a with ever-increasing accuracy. When we stay close to x = a, usually the second-order term is already negligible with respect to the linear term. In that case, we may say: "okay, y = f(a) + f'(a)(x - a) is already a sufficiently accurate representation of f(x) around x = a," and we forget about all the other terms. This is what the professor meant by "throwing those terms away". This may seem a bit drastic at first, but in most applications it turns out that the quadratic and higher-order terms really are small enough to ignore without making huge errors in your calculations. Only when you go very far from x = a, when f varies wildly around x = a, or when you need very accurate results, may it be necessary to take more terms into account.
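You can watch the error shrink with the order directly. Here is a small sketch using exp, which is convenient because every derivative of exp is exp itself:

```python
import math

def taylor_exp(x, n, a=0.0):
    """n-th order Taylor polynomial of exp about a (every derivative of exp is exp)."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))

x = 1.0
for n in (1, 2, 4, 8):
    err = abs(math.exp(x) - taylor_exp(x, n))
    print(n, err)
```

Each extra order knocks the error down further; by n = 8 the polynomial already matches exp(1) to about five or six decimal places.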

If this all fascinates you, you should go find yourself a good resource on "Taylor series". I checked out the Wikipedia page, but I actually find it a bit unclear and sloppy. I'll leave it to others to recommend a good analysis book or something like that which explains this in detail.

For now, I'll leave it at this... I'll let you chew on it and come up with more questions.