I don't follow this identity except in the special case where the partial derivative of f with respect to x is a directional derivative along the x axis: the other components of the gradient vector drop out of the dot product with the unit vector in the x direction.

For example, think of ##\frac{\partial f(x,y,z)}{\partial((3,6,1))} ## as ## D_t f(x+3t,y+6t,z+1t)|_{t=0} ##.

## = ( \frac{\partial f(x,y,z)}{\partial x})(3) + ( \frac{\partial f(x,y,z)}{\partial y})(6) + ( \frac{\partial f(x,y,z)}{\partial z})(1) ## where the partial derivatives are evaluated at ##(x,y,z)##, since after taking the partial derivatives with the variable ##t## present in their arguments, we set ##t = 0##.

It may help to do the work using a specific function like ## f(x,y,z) = x + xy + z^2##.
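Carrying that suggestion through for the direction ##(3,6,1)## used above (a sketch, worth checking by hand):

## f(x+3t,y+6t,z+t) = (x+3t) + (x+3t)(y+6t) + (z+t)^2 ##

## D_t f(x+3t,y+6t,z+t) = 3 + 3(y+6t) + 6(x+3t) + 2(z+t) ##

## D_t f(x+3t,y+6t,z+t)|_{t=0} = 3 + 3y + 6x + 2z ##

On the other side, ## \frac{\partial f}{\partial x} = 1 + y ##, ## \frac{\partial f}{\partial y} = x ##, ## \frac{\partial f}{\partial z} = 2z ##, so

## (1+y)(3) + (x)(6) + (2z)(1) = 3 + 3y + 6x + 2z ##

which agrees with the derivative with respect to ##t## at ##t = 0##.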

I think they're just saying that these are different notations for the same concept. There is no proof involved in a notation, it's just a convention.

The idea behind a directional derivative in terms of tangents to a parametrized path is just this: Suppose you have a function [itex]f(x,y,z)[/itex] defined at different points in space that returns a real number. Suppose you have a parametrized path giving a location [itex](x(s), y(s), z(s))[/itex] as a function of a real-valued parameter [itex]s[/itex] that increases as you move along the path. Then you can combine the two to get a function from reals to reals: [itex]F(s) \equiv f(x(s), y(s), z(s))[/itex]. Since [itex]F(s)[/itex] is just an ordinary function, you can take an ordinary derivative. Using the chain rule,

[itex]\frac{dF}{ds} = \frac{\partial f}{\partial x} \frac{dx}{ds} + \frac{\partial f}{\partial y} \frac{dy}{ds} + \frac{\partial f}{\partial z} \frac{dz}{ds} = \vec{V} \cdot \nabla f[/itex]

where [itex]\vec{V}[/itex] is the vector with components [itex]V^x = \frac{dx}{ds}, V^y = \frac{dy}{ds}, V^z = \frac{dz}{ds}[/itex], and [itex]\nabla f[/itex] is the "covector" with components [itex](\nabla f)_x = \frac{\partial f}{\partial x}, (\nabla f)_y = \frac{\partial f}{\partial y}, (\nabla f)_z = \frac{\partial f}{\partial z}[/itex].

So the directional derivative [itex]\vec{V} \cdot \nabla[/itex] applied to a scalar field (real-valued function of position, [itex]f[/itex]) can be understood as the result of the following computation:

Find some parametrized path [itex](x(s), y(s), z(s))[/itex] such that [itex]\vec{V}[/itex] is the corresponding "tangent vector".

Compute the rate of change of [itex]f[/itex] as [itex]s[/itex] increases along the path.
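If it helps to see those two steps with numbers, here is a minimal Python sketch. The helix path and the function names are my own choices for illustration, not anything from the posts above; the scalar field is the example [itex]f = x + xy + z^2[/itex] suggested earlier. It compares a finite-difference estimate of [itex]dF/ds[/itex] with [itex]\vec{V} \cdot \nabla f[/itex].

```python
import math

def f(x, y, z):
    # Example scalar field: f = x + x*y + z^2
    return x + x * y + z ** 2

def grad_f(x, y, z):
    # Analytic partial derivatives of f with respect to x, y, z
    return (1.0 + y, x, 2.0 * z)

def path(s):
    # Hypothetical parametrized path (a helix), chosen only for illustration
    return (math.cos(s), math.sin(s), s)

def tangent(s):
    # Tangent vector V = (dx/ds, dy/ds, dz/ds) of the helix
    return (-math.sin(s), math.cos(s), 1.0)

def dF_ds_numeric(s, h=1e-6):
    # Central-difference estimate of the ordinary derivative of F(s) = f(path(s))
    return (f(*path(s + h)) - f(*path(s - h))) / (2.0 * h)

def dF_ds_chain_rule(s):
    # Dot product of the tangent vector with the gradient, evaluated on the path
    g = grad_f(*path(s))
    v = tangent(s)
    return sum(gi * vi for gi, vi in zip(g, v))

s0 = 0.7
print(dF_ds_numeric(s0), dF_ds_chain_rule(s0))
```

The two printed numbers should agree up to the finite-difference error, which is small for this step size.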

"I think they're just saying that these are different notations for the same concept. There is no proof involved in a notation, it's just a convention."

No, it is a theorem (though not at all a difficult one), which is in fact proved in the same post that the above quote appears in.

At first, we can speak of the derivative of a function, say f(x, y, z), along a path A(s), meaning the derivative d/ds f(A(s)) evaluated at s=0.

At this point, for all we know this depends on more than just the derivative A'(0) at s=0 of the path A(s).

But then by doing the calculation in #3 ("using the chain rule"), we find out that this derivative depends only on A'(0) (and of course on f(x, y, z) near A(0)). So it is the dot product of the gradient ∇f and the tangent vector A'(0) to the path A(s) at s=0.
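A quick numerical illustration of that point, as a sketch only: the two paths below are ones I made up, and they share the same A(0) = (1, 2, 0.5) and the same A'(0) = (3, 6, 1) but differ in their s² terms. The function f is the example x + xy + z² from earlier in the thread.

```python
def f(x, y, z):
    # Example scalar field: f = x + x*y + z^2
    return x + x * y + z ** 2

def straight(s):
    # Straight line through (1, 2, 0.5) with velocity (3, 6, 1)
    return (1.0 + 3.0 * s, 2.0 + 6.0 * s, 0.5 + 1.0 * s)

def curved(s):
    # Same A(0) and A'(0) as the straight line, plus arbitrary s**2 terms
    return (1.0 + 3.0 * s + 5.0 * s ** 2,
            2.0 + 6.0 * s - 4.0 * s ** 2,
            0.5 + 1.0 * s + 7.0 * s ** 2)

def derivative_at_zero(path, h=1e-5):
    # Central-difference estimate of d/ds f(path(s)) at s = 0
    return (f(*path(h)) - f(*path(-h))) / (2.0 * h)

# Gradient of f at A(0) = (1, 2, 0.5) is (1 + y, x, 2z) = (3, 1, 1)
grad_at_A0 = (1.0 + 2.0, 1.0, 2.0 * 0.5)
tangent_at_0 = (3.0, 6.0, 1.0)
dot = sum(g * v for g, v in zip(grad_at_A0, tangent_at_0))

print(derivative_at_zero(straight), derivative_at_zero(curved), dot)
```

Both finite-difference estimates should come out close to the dot product, 16, even though the paths differ away from s=0.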