You mean historically?
It's a common procedure in differential equations to use some sort of variation of parameters in order to find solutions. If you ask me, after studying these mathematical objects for a while, it seems quite natural to suggest such a solution.
When you have a second-order ODE with constant coefficients, if the characteristic equation has a repeated root (resonance), then you are one solution short, so you propose a solution of the form xy_1(x). It is a natural step to generalize this thinking when your coefficients aren't constant, by proposing a solution of the form a(x)y_1(x) and seeing what the function a must satisfy for this to be a second, linearly independent solution.
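To make that concrete, here is a sketch of the computation, assuming the equation is written in the normalized form y'' + p(x)y' + q(x)y = 0 (the names p, q and w are mine, introduced only for illustration) and that y_1 is a known solution that does not vanish on the interval of interest. Substituting y = a(x)y_1(x):

$$
\begin{aligned}
y' &= a'y_1 + a\,y_1', \qquad y'' = a''y_1 + 2a'y_1' + a\,y_1'',\\
0 &= y'' + p\,y' + q\,y
   = a\,\underbrace{(y_1'' + p\,y_1' + q\,y_1)}_{=\,0} + a''y_1 + a'\,(2y_1' + p\,y_1).
\end{aligned}
$$

The term in a itself drops out because y_1 is a solution, so what remains is a first-order equation for w = a':

$$
w'\,y_1 + w\,(2y_1' + p\,y_1) = 0
\quad\Longrightarrow\quad
a'(x) = w(x) = \frac{C\,e^{-\int p(x)\,dx}}{y_1(x)^2}.
$$

As a check against the constant-coefficient case: for y'' - 2ry' + r^2 y = 0 we have p = -2r and y_1 = e^{rx}, so a' = C e^{2rx}/e^{2rx} = C and a(x) = Cx + D, which gives back the familiar xy_1(x).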