OmegaKV said:
Consider this equation:
f(x(t),y(t))=2(x(t))^2+x(t)y(t)+y(t)
One way to calculate df/dt is directly using the chain rule:
\frac{df}{dt}=4x(t)\frac{dx}{dt}+\frac{dx}{dt}y(t)+\frac{dy}{dt}x(t)+\frac{dy}{dt}
\frac{df}{dt}=(4x(t)+y(t))\frac{dx}{dt}+(x(t)+1)\frac{dy}{dt}
Another way is by using the formula for the total derivative:
\frac{df}{dt}=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}
\frac{\partial f}{\partial x} = 4x+y
\frac{\partial f}{\partial y} = x+1
\frac{df}{dt} = (4x+y)\frac{dx}{dt}+(x+1)\frac{dy}{dt}
I see how the formula for total derivatives should work since it is the multivariable analog of the derivative, but is there any way to logically derive the formula for the total derivative from single variable calculus (just using chain rule, product rule, etc.), without having to visualize things in "3D"? It seems like you should be able to since f(x(t),y(t))=f(t) is really just a single variable function.
Not quite. The definition of the derivative of a multivariable function is slightly different from the standard definition of the derivative for a single-variable function. The two coincide when the multivariable function in question is actually a single-variable function, of course. But the formula you want comes from differentiating the multivariable expression for the function's value, not from taking the ordinary derivative of the single-variable expression, so you will need to use the multivariable definition of the derivative.
Informally, the derivative of ##f## at a point is the unique linear function of displacement vectors attached to that point which approximates the change in ##f## so well that the error vanishes faster than the displacement itself. That is vague and probably ambiguous, so it is preferable to have a strict mathematical expression that has only a single interpretation. We define the derivative at a point ##(a, b)## in the domain of ##f## to be the unique linear function ##D##, whose domain is the tangent space at ##(a, b)## (a fancy name for the space of all possible displacement vectors from ##(a, b)##), which satisfies the following limit:
\lim_{(h_1 , h_2 )\to (0, 0)} \frac{|f(a+h_1 , b+h_2 ) - f(a, b) - D(h_1 , h_2 )|}{|(h_1 , h_2)|} = 0
which is what we mean by the error vanishing faster than the displacement.
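As a quick sanity check of this definition, here is a minimal Python sketch for the f in the question, ##f(x, y) = 2x^2 + xy + y##. The base point (1, 2), the direction (0.6, 0.8), and the helper names are my own choices for illustration, not anything from the question. It takes the candidate ##D(h_1, h_2) = (4a + b)h_1 + (a + 1)h_2## and evaluates the quotient from the limit above for shrinking displacements.

```python
import math

def f(x, y):
    # The function from the question.
    return 2 * x**2 + x * y + y

def D(a, b, h1, h2):
    # Candidate derivative at (a, b): linear in the displacement (h1, h2).
    return (4 * a + b) * h1 + (a + 1) * h2

def quotient(a, b, h1, h2):
    # |f(a+h1, b+h2) - f(a, b) - D(h1, h2)| / |(h1, h2)|
    error = abs(f(a + h1, b + h2) - f(a, b) - D(a, b, h1, h2))
    return error / math.hypot(h1, h2)

a, b = 1.0, 2.0            # arbitrary base point (assumption)
u1, u2 = 0.6, 0.8          # arbitrary unit direction (assumption)
scales = (1e-1, 1e-2, 1e-3)
ratios = [quotient(a, b, s * u1, s * u2) for s in scales]

for s, r in zip(scales, ratios):
    print(s, r)
```

For this f the error term is exactly ##2h_1^2 + h_1 h_2##, so along the chosen direction the quotient is 1.2 times the length of the displacement and shrinks to zero with it.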
From linear algebra, we know that if D is a linear transformation between two finite dimensional vector spaces, and we choose a basis for each vector space, then the action of D is equivalent to matrix multiplication by a particular matrix of numbers.
If you use the standard basis for ##R^2##, the domain of f, and the standard basis for R, the codomain of f, then the matrix form of D at the point ##(a, b)##, if the derivative exists there, is the ##1 \times 2## row matrix:
\left[ \begin{array}{cc} \left.\frac{\partial f}{\partial x}\right|_{(a, b)} & \left.\frac{\partial f}{\partial y}\right|_{(a, b)} \end{array}\right]
When you look at the multivariable definition of the derivative, and consider your question, you will find that what you want is the chain rule for a multivariable function. If you go through the definition, you will find that the chain rule takes the same form as for single-variable functions. If f is a function of g and g is a function of t, then ##D[f \circ g](a) = D[f](g(a)) \cdot D[g](a)##. We can read ##\cdot## as matrix multiplication provided we write each derivative as a matrix with respect to compatible bases, that is, the basis used for the codomain of g is the same one used for the domain of f.
In your case, f is a function of g, where g(t) = (x(t), y(t)). To apply the chain rule when t = a, we therefore need D[f](g(a)) and D[g](a). D[f](g(a))= D[f](x(a), y(a)) is, with respect to the standard bases, equivalent to multiplication by the row matrix
\left[ \begin{array}{cc} \left.\frac{\partial f}{\partial x}\right|_{(x(a), y(a))} & \left.\frac{\partial f}{\partial y}\right|_{(x(a), y(a))} \end{array}\right]
Likewise, D[g](a) is equivalent to multiplication by the column matrix
\left[ \begin{array}{c} \left.\frac{\partial x}{\partial t}\right|_{a} \\ \left.\frac{\partial y}{\partial t}\right|_{a} \end{array}\right]
The different matrix dimensions are due to the fact that the domain of g is R and the codomain is ##R^2##, so the derivative is a linear transformation taking displacement vectors at a in R to displacement vectors at g(a) in ##R^2## (the tangent space at g(a)).
Therefore, the matrix form of our derivative ##D[f \circ g](a) = D[f](g(a)) \cdot D[g](a)## with respect to the standard bases is:
\left[ \begin{array}{cc} \left.\frac{\partial f}{\partial x}\right|_{(x(a), y(a))} & \left.\frac{\partial f}{\partial y}\right|_{(x(a), y(a))} \end{array}\right]\cdot \left[ \begin{array}{c} \left.\frac{\partial x}{\partial t}\right|_{a} \\ \left.\frac{\partial y}{\partial t}\right|_{a} \end{array}\right] = \left.\frac{\partial f}{\partial x}\right|_{(x(a), y(a))}\left.\frac{\partial x}{\partial t}\right|_{a} + \left.\frac{\partial f}{\partial y}\right|_{(x(a), y(a))}\left.\frac{\partial y}{\partial t}\right|_{a}
Since x(t) and y(t) are actually single variable functions, the partial derivatives are equivalent to the ordinary derivatives, as you have in your expression.
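To see the formula at work numerically, here is a small Python sketch. The curves x(t) = cos t and y(t) = t^2 and the point t = 0.7 are assumptions picked purely for illustration; they are not from the question. It compares the product ##D[f](g(a)) \cdot D[g](a)## with a central-difference estimate of the derivative of the single-variable function ##t \mapsto f(x(t), y(t))##.

```python
import math

def f(x, y):
    # The function from the question.
    return 2 * x**2 + x * y + y

# Hypothetical curves, chosen only for this example.
def x(t):
    return math.cos(t)

def y(t):
    return t**2

def dx(t):
    return -math.sin(t)

def dy(t):
    return 2 * t

def chain_rule(t):
    # Partials of f evaluated at (x(t), y(t)), paired with
    # the derivatives of x and y at t.
    fx = 4 * x(t) + y(t)
    fy = x(t) + 1
    return fx * dx(t) + fy * dy(t)

def finite_difference(t, h=1e-6):
    # Central difference of the composite t -> f(x(t), y(t)).
    return (f(x(t + h), y(t + h)) - f(x(t - h), y(t - h))) / (2 * h)

t0 = 0.7
print(chain_rule(t0), finite_difference(t0))
```

The two printed values should agree to several decimal places; the small remaining gap comes from the finite-difference approximation, not from the chain rule.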