However, I haven't got a gut feeling for it. I need these questions answered before I can accept it:

Where is the definition derived from?

Why does adding the partial derivatives tell us the direction of maximum gradient? I know this sounds stupid, but if a function has gradients of 4, 5, 6 (in x, y, z), what exactly does that mean? And why does adding them up point in the direction of the greatest rate of increase?
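To make the question concrete, here is a rough numerical sketch of the claim as I understand it. It assumes a made-up function f(x, y, z) = 4x + 5y + 6z, chosen only because its partial derivatives are 4, 5 and 6. The helper name `directional_rate` is my own. It measures how fast f changes per unit step in a given direction, so the rates along the axes can be compared with the rate along the vector (4, 5, 6).

```python
import math

# Hypothetical example function: its partial derivatives are 4, 5 and 6.
def f(x, y, z):
    return 4 * x + 5 * y + 6 * z

def directional_rate(direction, h=1e-6):
    """Rate of change of f at the origin per unit step along `direction`."""
    norm = math.sqrt(sum(c * c for c in direction))
    ux, uy, uz = (c / norm for c in direction)  # scale to a unit step
    return (f(h * ux, h * uy, h * uz) - f(0.0, 0.0, 0.0)) / h

candidates = {
    "x axis": (1, 0, 0),
    "y axis": (0, 1, 0),
    "z axis": (0, 0, 1),
    "diagonal (1, 1, 1)": (1, 1, 1),
    "gradient (4, 5, 6)": (4, 5, 6),
}

for name, direction in candidates.items():
    print(f"{name}: {directional_rate(direction):.4f}")
```

If the claim is right, the x, y and z axes should give rates of about 4, 5 and 6, and the (4, 5, 6) direction should give the largest rate of all, about sqrt(4² + 5² + 6²). What I am missing is the reason why.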

Can someone explain this to me in simple layman's terms, preferably using an example?

Thank you in advance, guys and gals.