The standard proof is based on two things: the Banach "fixed point" theorem and thinking of the set of functions as a "complete metric space".
A "contraction map" is, roughly, a function f(x), from A to itself such that, for any pair, x and y, in A, the distance from f(x) to f(y) is less than the distance from x to y. In specific terms, using d(a,b) as the distance between a and b, [itex]d(f(x),f(y))\le c d(x,y)[/itex] where c is a number strictly less than 1. Applying the function f to points "contracts" the distance between them and so "contracts" all of A. One way of thinking about it is this: Apply the function to every point in A and the result is a slightly smaller subset of A. Apply again and you get a still smaller set. Keep doing that and, in the limit, A is reduced to a single point. You can then show that, for that point, x, f(x)= x: the "fixed point".
Now think of picking any single point x in A. Applying f repeatedly gives f(x), f(f(x)), etc., each inside the corresponding one of those decreasing sets f(A), f(f(A)), etc. Since the sets reduce down to a single point, the sequence x, f(x), f(f(x)), etc. must converge to that point. That's the idea of the standard proof of Banach's fixed point theorem. Choose any point x in A, and form the sequence x, f(x), f(f(x)), f(f(f(x))), etc. Use the "contraction" property of f to show that this is a Cauchy sequence. Since we are in a "complete" space, all Cauchy sequences converge: this sequence converges to some a. Since you have already proved that contraction maps are continuous, f(a) is the limit of f applied to each point in that sequence. But that just gives us f(x), f(f(x)), f(f(f(x))), f(f(f(f(x))))... the same sequence without its first term: it still converges to the same limit, so f(a)= a. The fixed point is also unique: if f(a)= a and f(b)= b then [itex]d(a,b)= d(f(a),f(b))\le c d(a,b)[/itex] with c< 1, which forces d(a,b)= 0.
That's the Banach fixed point theorem.
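To see the iteration in action, here is a minimal sketch. The choice of f(x)= cos(x) on A= [0,1] is my own example, not part of the theorem: cos maps [0,1] into itself and, by the mean value theorem, contracts distances there by a factor of at most sin(1), which is less than 1. The helper name `iterate_to_fixed_point` is made up for the illustration.

```python
import math

def iterate_to_fixed_point(f, x, tol=1e-12, max_steps=1000):
    """Apply f repeatedly until two successive iterates differ by less than tol."""
    for step in range(max_steps):
        fx = f(x)
        if abs(fx - x) < tol:
            return fx, step + 1
        x = fx
    raise RuntimeError("no convergence within max_steps")

# Two different starting points in A = [0, 1] (assumed example, see above).
a, steps_a = iterate_to_fixed_point(math.cos, 0.0)
b, steps_b = iterate_to_fixed_point(math.cos, 1.0)

print("limit from x = 0:", a, "after", steps_a, "steps")
print("limit from x = 1:", b, "after", steps_b, "steps")
print("difference between the two limits:", abs(a - b))
```

Both starting points end up at the same number, and that number satisfies cos(a)= a to within rounding, which is what the theorem promises.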
Now suppose we have the differential equation problem dy/dx= f(x,y), y(x0)= y0. If we KNEW y, we could integrate both sides and get
[tex]y= \int_{x_0}^x f(t,y(t))dt+ y_0[/tex]. Of course we don't know y, but the point is that any y(x) that satisfies one must satisfy the other. The solutions to the differential equation are exactly the same as the solutions to the <b>integral equation</b>. We can prove that the solution to the differential equation problem exists and is unique by showing that the solution to the integral equation exists and is unique.<br />
<br />
That's important because integrals are "better behaved" than derivatives. If I take a differentiable function and differentiate it, the result may not be differentiable. (Example: y= x<sup>2</sup> if [itex]x\ge 0[/itex], y= -x<sup>2</sup> if x< 0. The derivative of that exists for all x and is y'= 2|x|. But of course, |x| is not differentiable at x=0.) <br />
<br />
On the other hand, if f is an integrable function, its integral is also integrable (in fact, it's "smoother": f may not be continuous but its integral is). That means we can start with a set of functions and apply the integral over and over again.<br />
<br />
Lipschitz property: A function, f, is said to satisfy a Lipschitz property on a set A if and only if [itex]|f(x)- f(y)|\le C|x-y|[/itex] for all x and y in A, for some positive number C (not necessarily less than 1). It is easy to show that if f is "Lipschitz" on a set it is continuous at each point in the set. Also, if f is differentiable at each point of an interval and its derivative is bounded there (for example, if f is continuously differentiable on a closed, bounded interval), you can use the mean value theorem to show it is "Lipschitz" on that interval. However, there exist Lipschitz functions that are not differentiable and continuous functions that are not Lipschitz. <br />
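<br />
Two standard examples make that last sentence concrete: |x| is Lipschitz on [-1,1] with C= 1 but is not differentiable at 0, while [itex]\sqrt{x}[/itex] is continuous on [0,1] but not Lipschitz, because its difference quotients blow up near 0. A rough numerical sketch of the difference (the sample grids and the helper names `grid` and `largest_difference_quotient` are my own choices for illustration):

```python
import math

def grid(a, b, n):
    """n + 1 equally spaced sample points on [a, b]."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def largest_difference_quotient(f, points):
    """Largest |f(x) - f(y)| / |x - y| over all pairs of distinct sample points."""
    return max(
        abs(f(x) - f(y)) / abs(x - y)
        for i, x in enumerate(points)
        for y in points[i + 1:]
    )

# |x| on [-1, 1]: the quotient never exceeds 1, however fine the grid.
abs_quotients = {n: largest_difference_quotient(abs, grid(-1.0, 1.0, n))
                 for n in (10, 100, 400)}

# sqrt(x) on [0, 1]: the largest quotient keeps growing as the grid is refined.
sqrt_quotients = {n: largest_difference_quotient(math.sqrt, grid(0.0, 1.0, n))
                  for n in (10, 100, 400)}

print("|x|     :", abs_quotients)
print("sqrt(x) :", sqrt_quotients)
```

For the square root the worst pair is always 0 and its nearest neighbor 1/n, where the quotient equals [itex]\sqrt{n}[/itex], so no single constant C can work on all of [0,1].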
<br />
Now suppose f(x,y) in the differential equation above is continuous and "Lipschitz in y" for some neighborhood of (x<sub>0</sub>,y<sub>0</sub>). We convert from the differential equation dy/dx= f(x,y), y(x<sub>0</sub>)= y<sub>0</sub>, to the corresponding integral equation <br />
[tex]y= y_0+ \int_{x_0}^x f(t,y(t))dt[/tex] <br />
and use that to define the "operator" <br />
[tex]F(y)= y_0+ \int_{x_0}^x f(t,y(t))dt[/tex]<br />
For each function y, F(y) gives a function. We reduce from "neighborhood of (x<sub>0</sub>,y<sub>0</sub>)" to a rectangle containing that point (we can always do that). Use the continuity of f to show that F maps some set of functions on the rectangle to itself (we may need to reduce the size of the rectangle to do that). Use the "Lipschitz" property to show that F is a "contraction map": if [itex]|x- x_0|\le h[/itex] on the rectangle and C is the Lipschitz constant, then [itex]|F(y_1)(x)- F(y_2)(x)|\le Ch\max|y_1- y_2|[/itex], so F is a contraction as soon as h< 1/C. Again, we may need to reduce the rectangle to do that, but still we have F a contraction map from that set of functions to itself. <br />
<br />
Now, it is known that the set of continuous functions on a closed interval whose graphs lie in the rectangle, with the distance between two functions taken to be the maximum of |y<sub>1</sub>(x)- y<sub>2</sub>(x)|, forms a complete metric space. We apply the Banach fixed point theorem to show that there exists a unique function, y, such that F(y)= y. That is, there exists a unique y such that<br />
[tex]y(x)= y_0+ \int_{x_0}^x f(t,y(t))dt[/tex]<br />
Since the integral equation has a unique solution (in some, perhaps small, interval about x<sub>0</sub>), it follows that the differential equation has a unique solution.
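<br />
<br />
Applying F over and over to a starting guess is usually called Picard iteration, and it can be carried out by hand or by machine in simple cases. The sketch below assumes the particular problem dy/dx= y, y(0)= 1 (my choice of example, so f(x,y)= y, x<sub>0</sub>= 0, y<sub>0</sub>= 1), where every iterate is a polynomial and the integral can be done exactly on the coefficients. The function names are made up for the illustration.

```python
import math
from fractions import Fraction

def picard_step(y, y0):
    """One application of F for the assumed example f(x, y) = y, x0 = 0.

    y is a list of polynomial coefficients [c0, c1, ...] for c0 + c1*x + ...
    Integrating c_k * t**k from 0 to x gives c_k * x**(k+1) / (k+1).
    """
    result = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(y)]
    result[0] += y0          # F(y) = y0 + integral
    return result

def evaluate(y, x):
    """Value of the polynomial with coefficient list y at the point x."""
    return sum(c * x**k for k, c in enumerate(y))

y0 = Fraction(1)
y = [y0]                     # starting guess: the constant function y(x) = 1
for _ in range(10):
    y = picard_step(y, y0)

print("first coefficients:", y[:5])
print("iterate at x = 1/2:", float(evaluate(y, Fraction(1, 2))))
print("exp(1/2)          :", math.exp(0.5))
```

Each step adds one more term of the Taylor series of e<sup>x</sup>, so the iterates converge to the known solution y= e<sup>x</sup>, just as the fixed point argument says they must.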