I'm really confused about the idea of convolution and could really use some help understanding it. Wikipedia says:

Emphasis added.

It seems to me that since the two functions are being multiplied together and then integrated that the integral should give the product of the areas of the two functions where the two functions overlap. My interpretation is significantly different than the rest of the world's, so I guess I'm wrong?

I assume you are learning this for a class? I learned it in a Signals Processing class and we used it to find responses of linear systems to inputs using the impulse response h(t), and some input f(t).

The integral has a graphical interpretation that makes it easy to compute by hand for many inputs and impulse responses. This leads to the "multiplication of areas" interpretation, but that somewhat disguises what's going on. The convolution integral is directly proportional to the area under one of the two functions only when the other function is flat over the overlap (i.e. a constant c, or 0, over some time intervals). And indeed, the result is an area that involves a product of the two functions, as you say, but it is not a product of their areas. It's the area under one function after it has been weighted, point by point, by the other function, and vice versa. I hope this helps.
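For reference, here is the integral written out in the notation above (input f(t), impulse response h(t)). The integration variable \tau and the interval [a, b] are just names chosen for this sketch:

$$
(f * h)(t) = \int_{-\infty}^{\infty} f(\tau)\, h(t - \tau)\, d\tau
$$

If h(t - \tau) happens to equal a constant c wherever f(\tau) is nonzero, say for \tau in [a, b], the weighting drops out of the integral and only then is the result proportional to a plain area:

$$
(f * h)(t) = c \int_{a}^{b} f(\tau)\, d\tau
$$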

Consider an example. A radar unit sends out a pulse that is reflected off a target. The transmitted pulse has a time duration. If the target has depth, the length of the reflected pulse may be the sum of the transmitted pulse length and the target depth. If the transmitted pulse is not square and the target presents multiple edges, the returned pulse may have a complex shape. For imaging, it would be important to remove the transmitted pulse shape from the reflected pulse shape, and this would be done with deconvolution.

I believe you are wrong. Imagine each area as a series of thin vertical strips (each covering the same "time segment"). Each strip of the foreground function is weighted by the corresponding strip of the background function, and these weighted results are then added (summed) by the integral. So the integral is the foreground area with each of its strips ("time segments") weighted by the matching value of the background function, rather than, as you present it, the entire foreground area weighted by the entire background area.

I'm quite confident that if you draw two functions (one triangle facing left and one triangle with a different slope facing right) and calculate their convolution, the difference between weighted strips and weighted areas will stand out.
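Here is a rough numerical sketch of that comparison, in case it helps. The two triangle shapes, the integration limits, and the helper names (`f`, `g`, `area`, `convolve_at`) are all made up for illustration. It approximates the convolution integral with thin strips and compares a few output values with the product of the two areas.

```python
def f(t):
    # Rising triangle on [0, 1], zero elsewhere (hypothetical example shape).
    return t if 0.0 <= t <= 1.0 else 0.0

def g(t):
    # Falling triangle on [0, 2] with a different slope, zero elsewhere.
    return 1.0 - t / 2.0 if 0.0 <= t <= 2.0 else 0.0

def area(func, lo, hi, n=20000):
    # Midpoint Riemann sum of func over [lo, hi].
    dt = (hi - lo) / n
    return sum(func(lo + (k + 0.5) * dt) for k in range(n)) * dt

def convolve_at(t, lo=-1.0, hi=4.0, n=20000):
    # (f * g)(t): each strip f(tau) * dtau is weighted by g(t - tau), then summed.
    dtau = (hi - lo) / n
    total = 0.0
    for k in range(n):
        tau = lo + (k + 0.5) * dtau
        total += f(tau) * g(t - tau) * dtau
    return total

# Product of the two whole areas (the "weighted areas" reading).
product_of_areas = area(f, 0.0, 1.0) * area(g, 0.0, 2.0)

# Convolution output at a few shifts (the "weighted strips" reading).
samples = {t: convolve_at(t) for t in (0.5, 1.0, 2.0)}

print("product of areas:", product_of_areas)
for t, value in samples.items():
    print("convolution at t =", t, "->", value)
```

The product of the areas is a single number (0.5 for these shapes), while the convolution changes with the shift t. For example, working the integral by hand at t = 1 gives 5/12, which is not the product of the areas.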

I found it easier to understand convolution with discrete-time functions, i.e. functions that have a value only every, say, 1 second. Draw one function (the "filter") with only about 5 different values, draw the other function (the "signal") with about 10 different values, flip the signal function around the y-axis as convolution requires, then start sliding one over the other and calculate a few values of the output function. You'll see why I recommended creating "time segments" or "time slices": each value of the signal at each time is weighted by the corresponding value of the filter, and the weighted values are then added up to create one resulting value for that given time shift (addition is the discrete-time equivalent of the integral for continuous-time functions).
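If you'd rather check your hand calculation against a program, here is a minimal sketch of that flip-and-slide procedure in plain Python. The function name `convolve` and the sample values in `signal` and `filt` are invented for the example. The flip shows up as the index `n - k`.

```python
def convolve(signal, filt):
    # Direct discrete convolution: y[n] = sum over k of filt[k] * signal[n - k].
    n_out = len(signal) + len(filt) - 1
    out = [0] * n_out
    for n in range(n_out):              # n is the current shift ("time difference")
        for k, weight in enumerate(filt):
            i = n - k                   # index into the flipped, shifted signal
            if 0 <= i < len(signal):    # outside this range the signal is taken as 0
                out[n] += weight * signal[i]
    return out

signal = [1, 2, 3, 4, 5, 4, 3, 2, 1, 0]   # hypothetical 10-value signal
filt = [1, 0, -1, 2, 1]                   # hypothetical 5-value filter
y = convolve(signal, filt)
print(y)
```

Each output value is one "slide" position: the overlapping samples are multiplied pairwise and summed. Swapping the two arguments gives the same output, so it doesn't matter which function you flip.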

Don't feel bad if you can't understand convolution. I think I only really understood what convolution is about 10 years after graduating from university, and my studies were heavily based on signal processing. It's just that some professors completely drown basic concepts in heavy mathematics.

Convolution as explained this way is how it works in the time domain. In the frequency domain you just multiply the two functions (which are the Fourier/Laplace transforms of the time-domain functions) and you are done. That's why convolution is so often calculated in the frequency domain.
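To see the frequency-domain shortcut on a tiny example, here is a sketch that uses a slow, naive discrete Fourier transform written out by hand, so it needs nothing beyond the standard library. The helper names `dft` and `convolve_via_dft` are made up for this post, and a real program would use an FFT library instead. The inputs are zero-padded first, because multiplying DFTs corresponds to circular convolution.

```python
import cmath

def dft(x, inverse=False):
    # Naive O(n^2) discrete Fourier transform (or its inverse).
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out

def convolve_via_dft(a, b):
    # Zero-pad both inputs to the full output length so that the circular
    # convolution implied by the DFT matches ordinary (linear) convolution.
    n = len(a) + len(b) - 1
    fa = dft(list(a) + [0] * (n - len(a)))
    fb = dft(list(b) + [0] * (n - len(b)))
    product = [p * q for p, q in zip(fa, fb)]   # pointwise multiply in frequency
    return [v.real for v in dft(product, inverse=True)]

result = convolve_via_dft([1, 2, 3], [1, 1])
print([round(v, 6) for v in result])
```

Up to rounding error, this matches what you get by sliding and summing the two short sequences by hand.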

Check external links on Wikipedia for other tutorials on convolution. Maybe you'll find there something that speaks more to you.

Basically I think convolution is the summation of signal values over a range of indices, for example from i = 0 to 2: $\sum_{i=0}^{2} v[n-i] = v[n] + v[n-1] + v[n-2]$. The brackets mean that we are using discrete time, I think.

Also, when looking for tutorials, try to avoid examples that use constant functions ("rectangles") and linear-slope triangles. I believe they may give you the wrong idea about certain properties of convolution or its true meaning.