aphirst
Gold Member
I wasn't sure into which category I should post this, so feel free to move it into a more appropriate place.
As part of my work I'm solving a system of nonlinear equations of the usual form:
$$\vec{F}(\vec{X})=\begin{pmatrix}F_1(X_1, X_2, \cdots, X_N) \\ F_2(\cdots) \\ \vdots \\ F_N(\cdots)\end{pmatrix}=\vec{0}$$
There's a lot of literature on solving systems of nonlinear equations, and plenty of discussion of how Newton-Raphson (N-R) methods can fail and how to counter this; the usual remedy is to combine the N-R step with a gradient-descent step on a scalarised merit function, usually something like
$$f(\vec{X}) = ||\vec{F}(\vec{X})||^2$$
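To fix notation, here's a minimal sketch (Python/NumPy) of what I mean by the combined scheme. The 2-variable ##\vec{F}## is a made-up stand-in for my actual system, and the finite-difference Jacobian is only there to keep the sketch self-contained:

```python
# Minimal sketch of a damped Newton-Raphson iteration for F(x) = 0, using the
# scalar merit function f(x) = ||F(x)||^2 for a backtracking line search, and
# falling back on the gradient-descent direction -grad f = -2 J^T F when the
# Jacobian is singular. The F below is a hypothetical 2D test system.
import numpy as np

def F(x):
    # Made-up example: a circle of radius 2 and an exponential curve.
    return np.array([x[0]**2 + x[1]**2 - 4.0,
                     np.exp(x[0]) + x[1] - 1.0])

def jacobian(x, h=1e-7):
    # Forward-difference Jacobian; an analytic J is preferable if available.
    n = len(x)
    J = np.empty((n, n))
    Fx = F(x)
    for j in range(n):
        xp = x.copy()
        xp[j] += h
        J[:, j] = (F(xp) - Fx) / h
    return J

def merit(x):
    Fx = F(x)
    return Fx @ Fx  # f(x) = ||F(x)||^2

def damped_newton(x, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        J = jacobian(x)
        try:
            dx = np.linalg.solve(J, -Fx)   # full Newton step
        except np.linalg.LinAlgError:
            dx = -2.0 * J.T @ Fx           # singular J: steepest descent on f
        # Backtracking line search on the merit function.
        t, f0 = 1.0, merit(x)
        while merit(x + t * dx) >= f0 and t > 1e-12:
            t *= 0.5
        x = x + t * dx
    return x
```

With the made-up ##\vec{F}## above, something like `damped_newton(np.array([1.0, 1.0]))` should home in on an intersection of the two curves; the point is only to show where the merit function enters the scheme.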
However, I'm struggling to find any literature which makes it explicit (either with proofs or with worked examples; 2D examples would probably be the most useful) that you can't simply minimise ##f(\vec{X})## in the first place, and that it becomes necessary to move to methods like N-R which explicitly handle the dimensionality and simultaneity. I know from experience that direct minimisation doesn't tend to work, and I would usually hand-wave this away by referring to the extra structure of ##\vec{F}(\vec{X})## which is lost by "flattening" things down.
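For what it's worth, the standard argument as I understand it (my own sketch, not a citation) goes like this: since
$$\nabla f(\vec{X}) = 2\,J(\vec{X})^{\mathsf{T}}\,\vec{F}(\vec{X}), \qquad J_{ij} = \frac{\partial F_i}{\partial X_j},$$
the gradient of ##f## vanishes not only at roots but also wherever a nonzero ##\vec{F}(\vec{X})## lies in the null space of ##J^{\mathsf{T}}##, i.e. wherever the Jacobian is singular. Gradient descent on ##f## will happily converge to such spurious local minima. A one-variable instance: for ##F(x) = x^3 - 2x + 2##, ##f'(x) = 2F(x)F'(x)## vanishes at ##x = \sqrt{2/3} \approx 0.816##, where ##F \approx 0.91 \neq 0##, and ##f## has a genuine local minimum there, well away from the actual root at ##x \approx -1.769##.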
If I have some free time this week, I intend to construct an explicit pair of functions of 2 variables which each definitely cross zero somewhere in the region of interest, produce contour plots of both functions together with their L2-norm combination, and then graph some N-R and gradient-descent directions at a few sample (x, y) values. I hope that doing so will make it more explicit that in some cases the ##f(\vec{X})## approach will lead you astray even where the full ##\vec{F}(\vec{X})## treatment behaves well.
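In case anyone wants to play along before then, here's roughly the script I have in mind (Python/matplotlib; the test pair below is the same hypothetical one as in the ##x^3 - 2x + 2## example above, not my real system):

```python
# Rough sketch of the planned 2D experiment: contour the merit function
# f = F1^2 + F2^2, overlay the zero curves of F1 and F2, and compare the
# Newton direction -J^{-1} F with the gradient-descent direction -grad f
# at a few sample points. F is a made-up pair with a spurious stationary
# point of f at (sqrt(2/3), 0), where F != 0 but J^T F = 0.
import numpy as np
import matplotlib.pyplot as plt

def F(x, y):
    return np.array([x**3 - 2*x + 2, y])

def J(x, y):
    return np.array([[3*x**2 - 2, 0.0],
                     [0.0,        1.0]])

X, Y = np.meshgrid(np.linspace(-3, 2, 400), np.linspace(-2, 2, 400))
F1, F2 = X**3 - 2*X + 2, Y
f = F1**2 + F2**2

fig, ax = plt.subplots()
ax.contour(X, Y, f, levels=30, cmap="viridis")    # merit-function contours
ax.contour(X, Y, F1, levels=[0.0], colors="red")  # F1 = 0 curve
ax.contour(X, Y, F2, levels=[0.0], colors="blue") # F2 = 0 curve

for x0, y0 in [(1.0, 0.5), (0.5, -0.5), (-1.0, 1.0)]:
    Fv, Jv = F(x0, y0), J(x0, y0)
    grad_dir = -2.0 * Jv.T @ Fv                    # -grad f
    newt_dir = np.linalg.solve(Jv, -Fv)            # Newton step
    for d, c in [(grad_dir, "green"), (newt_dir, "black")]:
        d = 0.4 * d / np.linalg.norm(d)            # normalise for display
        ax.arrow(x0, y0, d[0], d[1], head_width=0.05, color=c)

ax.set_xlabel("x")
ax.set_ylabel("y")
plt.show()
```

The true root sits where the red and blue zero curves intersect (near ##x \approx -1.769##, ##y = 0##), while the merit contours should show a separate basin around the spurious stationary point; the arrows should then make the disagreement between the two step directions visible.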
But until then, I'd really appreciate it if anyone here who happens to know of relevant literature could provide some citations or links.