I am taking an online course on linear regression. It discusses the sum-of-squared-differences cost function, and one of the points it makes is that this cost function is always convex, i.e. it has only one optimum. Reading a bit more, it seems that non-linear functions tend to give rise to non-convex cost functions, and I am trying to develop some intuition for why.

Suppose I take an arbitrary model like:

$$ f(x) = w_0 x^2 + w_1 \exp(x) + \epsilon $$

and the cost function I choose is the same, i.e. I want to minimise:

$$ J(w_0, w_1) = \left(f(x) - \left(w_0 x^2 + w_1 \exp(x)\right)\right)^2 $$

What is the intuition for why the squared term and the exponential term would give rise to non-convexities?
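
One way I could probe this numerically is to evaluate the cost on some data and test the midpoint inequality $J\left(\tfrac{a+b}{2}\right) \le \tfrac{1}{2}\left(J(a) + J(b)\right)$ for random pairs of weight vectors $a$ and $b$. The sketch below is only illustrative: the synthetic data, the "true" weights, the noise level, and the helper names `cost` and `midpoint_violations` are all my own assumptions, and the cost is summed over the sample points rather than taken at a single $x$.

```python
import numpy as np

def cost(w, x, y):
    """Sum of squared differences between y and w0*x^2 + w1*exp(x)."""
    w0, w1 = w
    residual = y - (w0 * x**2 + w1 * np.exp(x))
    return float(np.sum(residual**2))

def midpoint_violations(x, y, trials=1000, seed=0):
    """Count random weight pairs where J(midpoint) exceeds the mean of J at the endpoints."""
    rng = np.random.default_rng(seed)
    violations = 0
    for _ in range(trials):
        a = rng.normal(scale=5.0, size=2)
        b = rng.normal(scale=5.0, size=2)
        mid = cost((a + b) / 2, x, y)
        avg = (cost(a, x, y) + cost(b, x, y)) / 2
        # Small tolerance so floating-point rounding is not counted as a violation.
        if mid > avg * (1 + 1e-9) + 1e-9:
            violations += 1
    return violations

# Synthetic data: the weights 1.5 and -0.7 and the noise scale are made up for illustration.
rng = np.random.default_rng(42)
x = np.linspace(-2.0, 2.0, 50)
y = 1.5 * x**2 - 0.7 * np.exp(x) + rng.normal(scale=0.1, size=x.shape)

print("midpoint convexity violations:", midpoint_violations(x, y))
```

A count of zero would be consistent with the cost being convex in $(w_0, w_1)$, while any positive count would point to a genuine non-convexity. A random test like this can only suggest convexity, not prove it.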