I think I should have explained this better. I blame the late hour when I was posting last night.

Continuing with my previous notation, a general linear system may look like
$$L(y) = M(x)$$
where ##L## and ##M## are linear operators on ##S##. In your special case, ##M## happens to be the zero operator (which is of course trivially linear).
Now as long as ##L## and ##M## are both linear, we can solve the system. We might naively like to write ##y = L^{-1} M(x)##. But ##L## may not be invertible. Indeed, in your case ##L## is not invertible, because ##L(y) = 0## has solutions other than ##y = 0##.
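If it helps to see this concretely, here's a tiny finite-dimensional analogue in Python/numpy (my own toy example, not your actual system): a singular matrix plays the role of ##L##, and the non-invertibility shows up as a nonzero solution of ##L(y) = 0##.

```python
import numpy as np

# Toy stand-in for L: a singular 2x2 matrix.
L = np.array([[1.0, 1.0],
              [2.0, 2.0]])

# The vector (1, -1) is a nonzero solution of L(y) = 0,
# so L cannot be invertible.
y = np.array([1.0, -1.0])
print(L @ y)             # the zero vector
print(np.linalg.det(L))  # determinant is 0, confirming L is singular
```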
Fortunately, we can overcome this problem. You were hoping for a math/EE hybrid explanation. Let me see if I can provide one.
Let ##N## denote the set of all ##y## such that ##L(y) = 0##. We call ##N## the null space of ##L##. We may now partition ##S## into equivalence classes: two elements ##y_1, y_2 \in S## are considered equivalent if they differ by an element of ##N##. We denote the set of all such equivalence classes by ##S/N##. This is a standard mathematical object called a quotient space. An element of ##S/N## is written as ##s + N##, where ##s## is some element of ##S## and ##N## is the null space. By ##s + N## we simply mean the set of all elements of the form ##s + n## where ##n \in N##.
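The point of the equivalence classes is that ##L## cannot distinguish two elements of the same coset ##y + N##. Continuing the toy numpy example from above (again, my own illustration, not your system):

```python
import numpy as np

# Same toy stand-in for L as before.
L = np.array([[1.0, 1.0],
              [2.0, 2.0]])

n = np.array([1.0, -1.0])   # an element of the null space N
y1 = np.array([3.0, 4.0])
y2 = y1 + 5.0 * n           # y2 differs from y1 by an element of N

# y1 and y2 lie in the same equivalence class y1 + N,
# and L maps them to the same output:
print(np.allclose(L @ y1, L @ y2))  # True
```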
OK, so what do we do with this quotient space? There's a wonderful theorem of linear algebra (and of abstract algebra in general) called the first isomorphism theorem. It lets us convert our non-invertible linear map ##L : S \to S## into an injective linear map ##\bar{L} : S/N \to S##. Instead of writing ##L(y)##, we may write ##\bar{L}(y+N)##, and these give the same answer, but ##\bar{L}##, being injective, has a left inverse. So now we may solve our system:
$$L(y) = M(x)$$
becomes
$$\bar{L}(y + N) = M(x)$$
and we may invert ##\bar{L}## to obtain
$$y + N = \bar{L}^{-1} M(x)$$
Note that the solution is of the form ##y + N## instead of ##y##. This simply means that we may add any element of ##N## to ##y## and obtain another solution. If we want a unique solution, then in general we must apply constraints (initial conditions). This is completely analogous to what we learn in elementary differential equations: the general solution consists of a particular solution plus any solution to the "homogeneous" equation ##L(y) = 0##.
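In the finite-dimensional toy example, the pseudoinverse acts like ##\bar{L}^{-1}## on the range of ##L##: it picks out one particular solution, and adding any null-space element gives the rest of the coset ##y + N##. (Hedging again: this is a sketch under my own choice of matrix and right-hand side, with ##b## chosen to lie in the range of ##L##.)

```python
import numpy as np

# Same toy stand-in for L as before.
L = np.array([[1.0, 1.0],
              [2.0, 2.0]])
b = np.array([3.0, 6.0])    # plays the role of M(x); chosen in the range of L

# A particular solution via the pseudoinverse:
y_p = np.linalg.pinv(L) @ b
print(np.allclose(L @ y_p, b))  # True

# Adding any element of the null space N gives another solution,
# i.e. the full solution set is the coset y_p + N:
n = np.array([1.0, -1.0])
print(np.allclose(L @ (y_p + 7.0 * n), b))  # True
```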
Note that since ##\bar{L}^{-1}## and ##M## are both linear, so is ##\bar{L}^{-1} M##, so we have a linear relation between the input ##x## and the output ##y##.
Computing ##\bar{L}^{-1} M## is another story, since the operators are usually infinite-dimensional, so they aren't representable by matrices. Often in signal processing, we work with linear operators which are also time invariant, i.e. if we shift the input then the output is shifted by the same amount but is otherwise unchanged. The action of a time-invariant linear operator on ##x## has a nice representation as the convolution of ##x## with a kernel ##h##, often called the impulse response in signal processing. With this representation, we have many standard techniques available for solution, for example Fourier analysis or more general power series representations.
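A quick discrete-time sketch of those last two points, with an impulse response ##h## of my own choosing (a 3-tap moving average): convolution implements the LTI operator, shifting the input shifts the output, and the DFT turns convolution into pointwise multiplication.

```python
import numpy as np

h = np.array([1/3, 1/3, 1/3])            # impulse response (toy example)
x = np.array([0.0, 3.0, 6.0, 3.0, 0.0])  # input signal

# Time domain: the operator acts as y = x * h (convolution).
y = np.convolve(x, h)

# Time invariance: delaying the input by one sample delays the output
# by one sample but otherwise leaves it unchanged.
x_shifted = np.concatenate(([0.0], x))
y_shifted = np.convolve(x_shifted, h)
print(np.allclose(y_shifted[1:], y))  # True

# Frequency domain: convolution becomes pointwise multiplication of DFTs
# (zero-padded so circular convolution matches linear convolution).
N = len(x) + len(h) - 1
Y = np.fft.fft(x, N) * np.fft.fft(h, N)
print(np.allclose(np.fft.ifft(Y).real, y))  # True
```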