Which is indeed ##u(x)=x^2+1##, that is, a function that depends on the variable ##x##.
Which is due to the chain rule. We have ##[f(g(x))]' = f\,'(g(x))\cdot g'(x)##. Now write ##u(x)=u## instead of ##g(x)##.
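As a sketch of how the chain rule turns into the substitution rule: integrate both sides of it and read ##du = u'(x)\,dx##. The integrand in the second line is an assumed example built around ##u(x)=x^2+1##, not one taken from the question.
$$
\int f\,'(u(x))\cdot u'(x)\,dx = f(u(x)) + C = \int f\,'(u)\,du \quad \text{with } u=u(x),\; du=u'(x)\,dx
$$
$$
\int 2x\,\cos(x^2+1)\,dx = \int \cos u\,du = \sin u + C = \sin(x^2+1) + C
$$
Differentiating ##\sin(x^2+1)## by the chain rule returns ##2x\cos(x^2+1)##, which is a quick way to check the result.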
No. We just changed horses and adjusted the way we ride to the kind of horse we have. In both cases they signal the integration variable, but since we didn't simply write ##u## for ##x## but replaced an entire expression in ##x## by a single ##u##, we had to adjust ##dx## as well. Maybe the better comparison is that we changed the scaling on our ##x##-axis, and this affects the measurement of our width, which had been in ##x##-units and is now in ##u##-units. This is different from just renaming it. Afterwards, we end up with an expression where ##du## denotes the integration variable again, now along the ##u##-axis.
Edit: Let's take an example. We want to calculate the distance traveled from time ##0## to ##60\,sec## at a constant velocity of ##2\,\frac{m}{sec}##. Then we have
$$
\text{distance} \,= \int_{t=0}^{t=60} v(t)\,dt = \int_{0\,sec}^{60\,sec} 2\,\frac{m}{sec}\,dt = 2\,\frac{m}{sec} \cdot (60 - 0)\,sec = 120\,m
$$
Now change the seconds to minutes. You cannot go ahead with ##dt## measured in seconds anymore; you have to rescale it to minutes, too, because otherwise the units wouldn't fit.
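Written out, the rescaling could look like the following sketch. The symbol ##\tau## for time measured in minutes is my own label, with ##t = 60\,\frac{sec}{min}\cdot\tau## and therefore ##dt = 60\,\frac{sec}{min}\,d\tau##; the limits ##0\,sec## and ##60\,sec## become ##0\,min## and ##1\,min##.
$$
\int_{0\,sec}^{60\,sec} 2\,\frac{m}{sec}\,dt = \int_{0\,min}^{1\,min} 2\,\frac{m}{sec}\cdot 60\,\frac{sec}{min}\,d\tau = \int_{0\,min}^{1\,min} 120\,\frac{m}{min}\,d\tau = 120\,\frac{m}{min}\cdot (1-0)\,min = 120\,m
$$
The factor ##60\,\frac{sec}{min}## plays the same role here as ##u'(x)## does in a substitution: it converts widths measured along the old axis into widths measured along the new one, and the distance comes out the same.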