Math terminology in my Taylor Series expansion?

AI Thread Summary
The discussion revolves around the correct use of mathematical terminology in the context of a truncated Taylor Series expansion. The primary concern is the notation for partial derivatives, specifically how to represent the relationship between the function and its variables when multiple values are involved. The author grapples with whether to use partial derivatives with respect to a single variable or to maintain consistency with multiple variables, leading to confusion in the notation. Ultimately, the author proposes a clearer definition of functions and their derivatives, which resolves the inconsistencies and aligns with standard mathematical practices. The conversation highlights the importance of precise terminology in mathematical expressions to avoid misinterpretation.
hotvette
I have another dilemma with terminology that is puzzling and would appreciate some advice.

Consider the following truncated Taylor Series:
$$\begin{equation*}
f(\vec{z}_{k+1}) \approx f(\vec{z}_k)
+ \frac{\partial f(\vec{z}_k)}{\partial x} \Delta x
+ \frac{\partial f(\vec{z}_k)}{\partial \beta_1} \Delta \beta_1
+ \frac{\partial f(\vec{z}_k)}{\partial \beta_2} \Delta \beta_2 + \dots
+ \frac{\partial f(\vec{z}_k)}{\partial \beta_n} \Delta \beta_n
\end{equation*}$$
where:
$$\begin{align*} f &= f(\vec{z}\,) &
\vec{z} &= (x;\vec{\beta}\,) \\
\vec{z}_k &= (x;\vec{\beta}\,)_k &
\vec{z}_{k+1} &= (x;\vec{\beta}\,)_{k+1} \\
\Delta x &= x_{k+1} - x_k &
\Delta \beta_j &= (\beta_j)_{k+1} - (\beta_j)_k \\
\vec{\beta} &= (\beta_1, \beta_2, \dots, \beta_n)
\end{align*}$$
Then form the following function with the truncated Taylor Series:
$$\begin{equation*}
L = \sum_{i=0}^m \left[ f_i
+ \frac{\partial f_i}{\partial x_i} \Delta x_i
+ \frac{\partial f_i}{\partial \beta_1} \Delta \beta_1
+ \frac{\partial f_i}{\partial \beta_2} \Delta \beta_2 + \dots
+ \frac{\partial f_i}{\partial \beta_n} \Delta \beta_n - y_i
\right]^2 \end{equation*}$$
where:
$$\begin{align*}
f_i &= f(\vec{z}_k) &
\vec{z}_k &= (x_i; \vec{\beta}\,)_k \\
\Delta x_i &= (x_i)_{k+1} - (x_i)_k &
\Delta \beta_j &= (\beta_j)_{k+1} - (\beta_j)_k \\
\vec{x} &= (x_1, x_2, \dots, x_m) &
\vec{y} &= (y_1, y_2, \dots, y_m)
\end{align*}$$
The ##\partial x_i## in the second term of the sum matches the ##\Delta x_i## but doesn't seem right because it implies ##f=f(\vec{x};\vec{\beta})## which isn't true, but using ##\partial x## instead doesn't seem right either because it doesn't match ##\Delta x_i##. How to handle?
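To make the truncated expansion concrete, here is a small numerical sketch. The model ##f(x;\beta_1,\beta_2)=\beta_1 e^{\beta_2 x}## and all numbers are assumptions chosen for illustration only; the code forms the first-order Taylor estimate of ##f(\vec{z}_{k+1})## about ##\vec{z}_k##, approximating each partial derivative with a central finite difference:

```python
import math

# Toy model (an assumption for illustration, not from the discussion):
# f(x; b1, b2) = b1 * exp(b2 * x), packed as z = (x, b1, b2).
def f(z):
    x, b1, b2 = z
    return b1 * math.exp(b2 * x)

def taylor_first_order(f, z_k, z_k1, h=1e-6):
    """First-order Taylor estimate of f(z_{k+1}) expanded about z_k.
    Each partial df/dz_j is approximated by a central difference."""
    dz = [a - b for a, b in zip(z_k1, z_k)]  # the Delta terms
    est = f(z_k)
    for j in range(len(z_k)):
        zp = list(z_k); zp[j] += h
        zm = list(z_k); zm[j] -= h
        est += (f(zp) - f(zm)) / (2 * h) * dz[j]
    return est

z_k  = [1.0, 2.0, 0.5]    # current iterate (x; b1, b2)_k
z_k1 = [1.1, 2.05, 0.48]  # nearby iterate (x; b1, b2)_{k+1}
approx = taylor_first_order(f, z_k, z_k1)
exact  = f(z_k1)
```

Because the steps are small, the first-order estimate agrees with the exact value to a few parts in a thousand; the error is second order in the step sizes.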
 
I guess that L would be:
[attached image showing the proposed expression for L]

Does it make sense?
 
Thanks for your comment (and for catching the error on the sum index). If I wanted to make the expression for L more compact, it would be:
$$\begin{equation*}
L = \sum_{i=1}^m \left[ f_i - y_i + \frac{\partial f_i}{\partial x_i} \Delta x_i
+ \sum_{k=1}^n \frac{\partial f_i}{\partial \beta_k} \Delta \beta_k \right]^2
\end{equation*}$$
but the dilemma remains with ##\partial x_i##. It's a strange situation; ##f## has only one ##x## as an independent variable but the sum involves ##m## values of ##x##. I'm tempted to use:
$$\frac{\partial f_i}{\partial x} \Delta x_i$$ but it just doesn't look right.
 
In my handwriting I introduced ##j## as another dummy index for the sum, as you did with ##k##. Does that help?
 
I don't think so. The dilemma really shows up when L is expanded, where we let ##r_i = (f_i - y_i)##:
$$\begin{align*}
L &= \left[ r_1 + \frac{\partial f_1}{\partial x} \Delta x_1
+ \sum_{k=1}^n \frac{\partial f_1}{\partial \beta_k} \Delta \beta_k \right]^2
+ \left[ r_2 + \frac{\partial f_2}{\partial x} \Delta x_2
+ \sum_{k=1}^n \frac{\partial f_2}{\partial \beta_k} \Delta \beta_k \right]^2
\\
&+ \dots
+ \left[ r_m + \frac{\partial f_m}{\partial x} \Delta x_m
+ \sum_{k=1}^n \frac{\partial f_m}{\partial \beta_k} \Delta \beta_k \right]^2
\end{align*}$$
or (?)
$$\begin{align*}
L &= \left[ r_1 + \frac{\partial f_1}{\partial x_1} \Delta x_1
+ \sum_{k=1}^n \frac{\partial f_1}{\partial \beta_k} \Delta \beta_k \right]^2
+ \left[ r_2 + \frac{\partial f_2}{\partial x_2} \Delta x_2
+ \sum_{k=1}^n \frac{\partial f_2}{\partial \beta_k} \Delta \beta_k \right]^2
\\
&+ \dots
+ \left[ r_m + \frac{\partial f_m}{\partial x_m} \Delta x_m
+ \sum_{k=1}^n \frac{\partial f_m}{\partial \beta_k} \Delta \beta_k \right]^2
\end{align*}$$
I think the first one is mathematically correct because ##f_i=f(x_i;\vec{\beta})=f(x;\vec{\beta})|_{x_i}##, but it seems to me in violation of nomenclature for Taylor Series. Maybe that's OK?
 
Let me say what I think I understand.
##f_i## is a function of ##n+1## variables, ##f_i(x_i;\beta_1,\beta_2,\dots,\beta_n)##, and each partial derivative is taken holding the other ##n## variables constant, i.e.
$$\frac{\partial }{\partial x_i}\bigg|_{\beta_1,\beta_2,\dots,\beta_n}$$
where the usual
$$\frac{\partial }{\partial x_1}\bigg|_{x_2,x_3,\dots,x_m}$$
does not apply, and
$$\frac{\partial }{\partial \beta_1}\bigg|_{x_i,\beta_2,\dots,\beta_n} := \left(\frac{\partial }{\partial \beta_1}\right)_i$$
and so on. I introduced the RHS notation to make it clear that these partial derivative operators are different for different ##i##.

In full detail,
$$L=\sum_{i=1}^m \left[\, r_i+\left[\, \Delta x_i \frac{\partial }{\partial x_i}\bigg|_{\beta_1,\beta_2,\dots,\beta_n}+\sum_{j=1}^n \Delta \beta_j \frac{\partial }{\partial \beta_j}\bigg|_{x_i,\beta_1,\dots,\beta_{j-1},\beta_{j+1},\dots,\beta_n} \,\right] f_i \,\right]^2$$
Using the above convention,
$$L=\sum_{i=1}^m \left[\, r_i+\left[\, \Delta x_i \frac{\partial}{\partial x_i}+\sum_{j=1}^n \Delta \beta_j \left(\frac{\partial}{\partial \beta_j}\right)_i \,\right] f_i \,\right]^2$$

[EDIT]
Let us introduce the ##m+n## dimensional vector
$$\mathbf{q}=(q_1,q_2,\dots,q_{m+n})=(x_1,x_2,\dots,x_m,\ \beta_1,\beta_2,\dots,\beta_n)$$
$$L=\sum_{i=1}^m \left[\, r_i+ \sum_{k=1}^{m+n} \Delta q_k \frac{\partial f_i}{\partial q_k}\,\right]^2
=\sum_{i=1}^m \left[\, r_i+ \Delta \mathbf{q} \cdot \frac{\partial }{\partial \mathbf{q}}f_i\,\right]^2$$
We get a simple formula at the expense of many zeros in the sum, because ##f_i## is a function of ##x_i## and not of ##x_k## for ##k \neq i##, so
$$\frac {\partial f_i}{\partial x_k}=0 \quad (k \neq i)$$
regardless of which variables are held constant.

This formula would also be useful if each ##f_i## depended on multiple ##x##'s, i.e.
$$f_i(x_1,x_2,\dots,x_m,\ \beta_1,\beta_2,\dots,\beta_n)$$
which extends the original
$$f_i(x_i,\ \beta_1,\beta_2,\dots,\beta_n)$$
 
I decided to simplify and go back to fundamentals as a way to try to sort this out. Below is the result. Start with the following definition of truncated Taylor Series:
$$\begin{equation*} f(x) \approx f(x_0) + \frac{d f(x_0)}{dx} (x-x_0) \end{equation*}$$
then the following should be valid (where each ##x_i## is a distinct point and ##x'_i## is nearby ##x_i##):
$$\begin{align*}
&f(x'_0) \approx f(x_0) + \frac{d f(x_0)}{dx} (x'_0-x_0) \\
&f(x'_1) \approx f(x_1) + \frac{d f(x_1)}{dx} (x'_1-x_1) \\
&f(x'_2) \approx f(x_2) + \frac{d f(x_2)}{dx} (x'_2-x_2) \\
&\;\;\vdots \\
&f(x'_m) \approx f(x_m) + \frac{d f(x_m)}{dx} (x'_m-x_m)
\end{align*}$$
Form the sum:
$$\begin{align*}
&L = \left[ f(x_1)-y_1 + \frac{df(x_1)}{dx} (x'_1-x_1) \right]^2
+ \left[ f(x_2)-y_2 + \frac{df(x_2)}{dx} (x'_2-x_2) \right]^2 \\
&+ \dots + \left[ f(x_m) - y_m + \frac{d f(x_m)}{dx} (x'_m-x_m) \right]^2
\\\\
&L = \sum_{i=1}^m \left[ f(x_i) - y_i + \frac{d f(x_i)}{dx} (x'_i-x_i) \right]^2 \\
&L = \sum_{i=1}^m \left[ f_i - y_i + \frac{df_i}{dx} \Delta x_i \right]^2
\end{align*}$$
where we let ##f(x_i) = f_i## and ##(x'_i-x_i) = \Delta x_i##. Is there any flaw in the above? I can't see any.
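As a sanity check that the expanded and compact forms above agree, here is a small numerical sketch; the choice ##f(x)=\sin x## (so ##f'(x)=\cos x##) and all the sample values are assumptions for illustration only:

```python
import math

# Toy check that the expanded form of L equals the compact sum form.
# f(x) = sin(x) is an assumed example; f'(x) = cos(x) is its exact derivative.
f, fprime = math.sin, math.cos

xs  = [0.1, 0.5, 1.0, 1.5]      # base points x_i
dxs = [0.02, -0.01, 0.03, 0.0]  # steps Delta x_i = x'_i - x_i
ys  = [0.12, 0.46, 0.88, 1.00]  # target values y_i

# Compact sum form: L = sum_i [ f_i - y_i + f'_i * Delta x_i ]^2
L_sum = sum((f(x) - y + fprime(x) * dx) ** 2
            for x, y, dx in zip(xs, ys, dxs))

# Expanded form: one squared bracket per data point, added term by term
terms = [(f(x) - y + fprime(x) * dx) ** 2 for x, y, dx in zip(xs, ys, dxs)]
L_expanded = terms[0] + terms[1] + terms[2] + terms[3]
```

The two values coincide exactly, since the expanded display is just the sum written out term by term.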
 
WRT the last formula I would like to propose
$$L=\sum_{i=1}^m\left[f_i-y_i+f'_i\,\Delta x_i\right]^2$$
where
$$f'_i=f'(x_i)=\frac{df}{dx}(x_i)$$
to avoid possible confusion over ##x## versus ##x_i## in the derivative denominator.

With ##m## sets of values ##\{x_i,\Delta x_i\}## given, and the functions ##f(x)##, ##f'(x)##, and ##y(x)## in
$$y_i=y(x_i)$$
known, I can calculate ##L##.

Now I notice my misunderstanding of you in the previous posts. I took ##x_i## to be one of the variables ##x_1,x_2,\dots##; now I know ##x_i## is a value of the single variable ##x##. I would like to amend the formula for ##L## as
$$L=\sum_{i=1}^m \left[\, r_i+ \Delta x_i \frac{\partial f}{\partial x}(x_i;\beta_1,\beta_2,\dots,\beta_n)\bigg|_{\beta_1,\beta_2,\dots,\beta_n}+\sum_{j=1}^n \Delta \beta_j \frac{\partial f}{\partial \beta_j}(x_i;\beta_1,\beta_2,\dots,\beta_n)\bigg|_{x,\beta_1,\dots,\beta_{j-1},\beta_{j+1},\dots,\beta_n} \,\right]^2$$
We may write it more briefly as
$$L=\sum_{i=1}^m \left[\, r_i+ \Delta x_i\, f_{x\,i}+\sum_{j=1}^n \Delta \beta_j\, f_{\beta_j\,i}\,\right]^2$$
with the convention for the function
$$\frac{\partial f}{\partial \alpha}:=f_\alpha$$
and for its output value at the input value ##x_i##
$$\frac{\partial f}{\partial \alpha}(x_i):=f_{\alpha\,i}$$
omitting mention of the ##\beta##'s.

The notations
$$\frac{df}{dx_i},\quad\frac{df_i}{dx},\quad\frac{df_i}{dx_i}$$
are misleading, because the suffixed symbols are not variables but values, which is as inappropriate as
$$\frac{df}{d\,2},\quad\frac{d\,10}{dx},\quad\frac{d\,10}{d\,2}$$
would be.
 
Does the following work?
$$\begin{equation*}
L = \sum_{i=1}^m \left[ f_i - y_i + \Delta x_i \cdot \frac{\partial f}{\partial x} (x_i; \vec{\beta})
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f}{\partial \beta_k} (x_i; \vec{\beta}) \right]^2
\end{equation*}$$
with the clarification that:
$$\begin{equation*}
\frac{\partial f}{\partial x} (x_i; \vec{\beta}) = \frac{\partial f}{\partial x} \bigg|_{x_i; \vec{\beta}}
\end{equation*}$$

I had a completely different line of thought. The following is directly from my Advanced Calculus book (Kaplan, fourth edition 1957), p370:
$$\begin{equation*}
f(x,y) = f(x_1, y_1) + \left[ \frac{\partial f}{\partial x}(x-x_1) + \frac{\partial f}{\partial y}(y-y_1) \right] + \dots
\end{equation*}$$
all derivatives evaluated at ##(x_1, \,y_1)##.

But, what if I wanted to know ##f(x, y)## for specific values of ##x## and ##y##, say ##x=3## and ##y=4##. Isn't the following valid (as long as ##3## is close to ##x_1## and ##4## is close to ##y_1##)?

$$\begin{equation*}
f(3,4) = f(x_1, y_1) + \left[ \frac{\partial f}{\partial x}(3-x_1) + \frac{\partial f}{\partial y}(4-y_1) \right] + \dots
\end{equation*}$$
all derivatives evaluated at ##(x_1, \,y_1)##.
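As a quick numeric check of this idea, here is a sketch with an assumed toy function ##f(x,y)=x^2+y^2## and an assumed expansion point ##(x_1,y_1)=(2.9,4.1)##, neither of which comes from the thread:

```python
# Toy check of the first-order expansion at the specific values x=3, y=4.
# f(x, y) = x^2 + y^2 and the expansion point (2.9, 4.1) are assumptions
# chosen so the partial derivatives are easy to write down exactly.
def f(x, y):  return x**2 + y**2
def fx(x, y): return 2 * x   # df/dx
def fy(x, y): return 2 * y   # df/dy

x1, y1 = 2.9, 4.1
# All derivatives evaluated at (x1, y1), as in the expansion above
approx = f(x1, y1) + fx(x1, y1) * (3 - x1) + fy(x1, y1) * (4 - y1)
exact  = f(3, 4)
```

For this quadratic ##f## the omitted remainder is exactly ##(3-x_1)^2+(4-y_1)^2 = 0.02##, so the first-order estimate gives 24.98 against the exact value 25, confirming the expansion is valid when the point is close to ##(x_1, y_1)##.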
 
Yes, it seems to work. Possible concerns are:
##\beta_k## is used both as a variable in the partial derivative formula and as a specific value in ##\vec{\beta}##. (Below I used a \boldsymbol font for the variable, but it does not make a big difference in appearance.)
##(\Delta x)_i## might be better than ##\Delta x_i##, to avoid possible misinterpretation as ##\Delta## multiplied by ##x_i##.
$$\frac{\partial f}{\partial x}(x_i;\vec{\beta}):=\frac{\partial f}{\partial x}\bigg|_{\vec{\beta}}(x_i)$$
$$\frac{\partial f}{\partial \boldsymbol{\beta}_k}(x_i;\vec{\beta}):=\frac{\partial f}{\partial\boldsymbol{\beta}_k}\bigg|_{x_i,\beta_1,\dots,\beta_{k-1},\beta_{k+1},\dots,\beta_n}(\beta_k)$$
 
hotvette said:
But, what if I wanted to know ##f(x,y)## for specific values of ##x## and ##y##, say ##x=3## and ##y=4##. Isn't the following valid (as long as ##3## is close to ##x_1## and ##4## is close to ##y_1##)?
$$f(3,4)=f(x_1,y_1)+\left[\frac{\partial f}{\partial x}(3-x_1)+\frac{\partial f}{\partial y}(4-y_1)\right]+\dots$$
all derivatives evaluated at ##(x_1,y_1)##.
In order to avoid possible misinterpretation I would like to write it as
$$f(3,4)=f(x_1,y_1)+(3-x_1)\frac{\partial f}{\partial x}(x_1,y_1)+(4-y_1)\frac{\partial f}{\partial y}(x_1,y_1)+\dots$$
where ##(A)\,g(B)## means ##(A)## multiplied by ##g(B)##, where ##B## is the variable(s) of the function ##g##.
 
Thanks for the discussion. It was very helpful.
 
I think I found a solution to the dilemma. A caution was posted earlier not to confuse variables and values, but the point didn't really hit me until now. I was using ##\beta_k## as a variable but ##x_i## as a specific value of ##x##. What resolves the inconsistent terminology is to define a set of functions ##f_i = f_i(x_i; \vec{\beta})##, where ##x_i## and ##\beta_k## are variables, ##\vec{\beta} = (\beta_1, \beta_2, \dots, \beta_n)##, and ##\vec{x} = (x_1, x_2, \dots, x_m)##. Letting subscripts ##p## and ##p+1## denote iterates of specific values of the variables, the truncated Taylor Series expansions of each ##f_i## are:
$$\begin{align*}
f_1 &\approx f_1(x_{1p};\vec{\beta}_p) + \Delta x_1 \cdot \frac{\partial f_1}{\partial x_1} (x_{1p};\vec{\beta}_p)
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f_1}{\partial \beta_k} (x_{1p};\vec{\beta}_p)
\\
f_2 &\approx f_2 (x_{2p};\vec{\beta}_p) + \Delta x_2 \cdot \frac{\partial f_2}{\partial x_2} (x_{2p};\vec{\beta}_p)
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f_2}{\partial \beta_k} (x_{2p};\vec{\beta}_p)
\\ &\;\;\vdots \\
f_m &\approx f_m (x_{mp};\vec{\beta}_p) + \Delta x_m \cdot \frac{\partial f_m}{\partial x_m} (x_{mp};\vec{\beta}_p)
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f_m}{\partial \beta_k} (x_{mp};\vec{\beta}_p)
\end{align*}$$
Also:
$$\begin{align*}
\vec{x}_p &= (x_1, x_2, \dots, x_m)_p = (x_{1p}, x_{2p}, \dots, x_{mp} ) \\
\vec{\beta}_p &= (\beta_1, \beta_2, \dots, \beta_n)_p = (\beta_{1p}, \beta_{2p}, \dots, \beta_{np} ) \\
\Delta x_i &= (x_i)_{p+1} - (x_i)_p \\
\Delta \beta_k &= (\beta_k)_{p+1} - (\beta_k)_p \\
\Delta \vec{x} &= (\Delta x_1, \Delta x_2, \dots, \Delta x_m) \\
\Delta \vec{\beta} &= (\Delta \beta_1, \Delta \beta_2, \dots, \Delta \beta_n) \\
\vec{x}_{p+1} &= \vec{x}_{p} + \Delta \vec{x} \\
\vec{\beta}_{p+1} &= \vec{\beta}_{p} + \Delta \vec{\beta}
\end{align*}$$
The function ##L## then becomes:
$$\begin{equation*}
L = \sum_{i=1}^m \left[ f_i (x_{ip};\vec{\beta}_p) + \Delta x_i \cdot \frac{\partial f_i}{\partial x_i} (x_{ip};\vec{\beta}_p)
+ \sum_{k=1}^n \Delta \beta_k \cdot \frac{\partial f_i}{\partial \beta_k} (x_{ip};\vec{\beta}_p) - \widetilde{y}_i \right]^2
\end{equation*}$$
where ##\widetilde{y}_i## are constants. I plan to work out a way to get rid of the double subscripts. Per the last reply on my other post, I will also consider ##x_i^{(p)}## and ##\beta_k^{(p)}## to denote the values.
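To illustrate how this final form of ##L## might be evaluated, here is a sketch; the model ##f(x;\beta_1,\beta_2)=\beta_1 e^{\beta_2 x}##, its partials, and all the sample numbers are assumptions for illustration, not part of the derivation:

```python
import math

# Sketch of L = sum_i [ f_i(x_ip; beta_p) + dx_i * df_i/dx_i
#                       + sum_k dbeta_k * df_i/dbeta_k - y_i ]^2
# with an assumed model f(x; b1, b2) = b1 * exp(b2 * x) and exact partials.
def f(x, b1, b2):    return b1 * math.exp(b2 * x)
def f_x(x, b1, b2):  return b1 * b2 * math.exp(b2 * x)  # df/dx
def f_b1(x, b1, b2): return math.exp(b2 * x)            # df/db1
def f_b2(x, b1, b2): return b1 * x * math.exp(b2 * x)   # df/db2

def L(x_p, beta_p, dx, dbeta, y):
    """Evaluate L at iterate p: every f and partial is evaluated at
    (x_ip; beta_p), then combined with the steps dx_i and dbeta_k."""
    b1, b2 = beta_p
    total = 0.0
    for xi, dxi, yi in zip(x_p, dx, y):
        lin = (f(xi, b1, b2)
               + dxi * f_x(xi, b1, b2)
               + dbeta[0] * f_b1(xi, b1, b2)
               + dbeta[1] * f_b2(xi, b1, b2)
               - yi)
        total += lin ** 2
    return total

x_p    = [0.0, 0.5, 1.0]     # values x_ip at iterate p
beta_p = (1.0, 0.3)          # values beta_kp at iterate p
y      = [1.00, 1.20, 1.40]  # constants y_i
value  = L(x_p, beta_p, dx=[0.0, 0.0, 0.0], dbeta=[0.0, 0.0], y=y)
```

With all steps set to zero, ##L## reduces to the plain sum of squared residuals ##\sum_i (f_i - \widetilde{y}_i)^2##, which is a convenient consistency check on the implementation.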
 