Basically, "the larger the quotient, the smaller the kernel", and vice-versa.
Since we know that, at least for SOME $R$ and $R$-module $N$, and some extension $S$ and $S$-module $M$, we can't injectively map $N$ to $M$ (several examples have been shown), it is a legitimate question to ask:
"What characterizes the largest possible quotient of $N$ that can possibly work (given $R, S, N, M$)?"
So we are looking for an $R$-submodule $P$ of $N$ such that we can embed:
$N/P \to M$.
Let's call such a possible embedding $f$. What is $f$? It is an $R$-module monomorphism.
So how and why do we involve the tensor product?
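Well, the module axioms force the scalar multiplication in $M$ to be $R$-bilinear as a mapping: $S \times N \to M,\ (s,n) \mapsto sn$ (here $sn$ means $s$ times the image of $n$ in $M$, under whatever $R$-linear map we are using to send $N$ into $M$).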
And tensor products have (as a DEFINING property) that any such bilinear map factors through the tensor product.
The tensor product is two things:
1) A bilinear map $\otimes: S \times N \to S \otimes_R N$
which takes $(s,n) \mapsto s\otimes n$
2) the $S$-module that is the "target" of this bilinear map.
For the time being, let's write $B(s,n)$ for $sn$. By the defining property of tensor products, we are guaranteed that there is an $S$-module homomorphism: $\phi: S \otimes_R N \to M$, with:
$B = \phi \circ \otimes$
Now, I claim:
$f: N \to M,\ f(n) = B(1,n)$ is an $R$-module homomorphism.
Let's check this:
$f(n + n') = B(1,n+n') = (\phi \circ \otimes)(1,n+n')$
$=\phi(1\otimes(n+n')) = \phi(1\otimes n + 1\otimes n')$
$= \phi(1\otimes n) + \phi(1\otimes n')$ (since $\phi$ is a homomorphism)
$=\phi(\otimes(1,n)) + \phi(\otimes(1,n')) = B(1,n) + B(1,n')$
$= f(n) + f(n')$, so $f$ is additive (a shorter proof uses the bilinearity of $B$).
$f(rn) = B(1,rn) = (\phi \circ \otimes)(1,rn)$
$= \phi(1\otimes rn) = \phi(r(1 \otimes n)) = r(\phi(1\otimes n))$
(since $r \in R \subseteq S$ and $\phi$ is an $S$-module homomorphism)
$=r((\phi \circ \otimes)(1,n)) = r(B(1,n)) = rf(n)$
(this also follows from bilinearity of $B$, but I wanted to show the "nuts and bolts").
So $f$ is $R$-linear.
To turn this into an injection (embedding), we take the quotient $R$-module:
$N/\text{ker }f$, which gives us one possible embedding:
$f_{\ast}: N/\text{ker }f \to M$.
We want to show that this embedding $f_{\ast}$ is "smaller" than the one given by the Corollary in D&F, that is, that the kernel of $\iota$ is contained in $\text{ker }f$.
Alright, let's go back to the $R$-module homomorphism $\iota$ now:
$\iota: N \to S \otimes_R N$ given by $\iota(n) = 1\otimes n$.
Suppose $n_0 \in \text{ker }\iota$. This means that:
$1 \otimes n_0 = 0$ (the 0-element of the tensor product).
Since $f(n_0) = B(1,n_0) = \phi(1\otimes n_0) = \phi(0) = 0$ (here the last 0 is in $M$), we see that the kernel of $\iota$ is indeed contained in $\text{ker }f$.
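To see that $\text{ker }\iota$ can really be nonzero, here is a standard small case (my own choice of rings and modules, not one taken from the earlier posts): $R = \Bbb Z$, $S = \Bbb Q$, $N = \Bbb Z_2$. For any $n \in \Bbb Z_2$:
$1 \otimes n = (\frac{1}{2}\cdot 2)\otimes n = \frac{1}{2}\otimes 2n = \frac{1}{2}\otimes 0 = 0$
so $\text{ker }\iota = N$. By the containment just shown, $f$ must then be the zero map: no nonzero element of $\Bbb Z_2$ survives in any $\Bbb Q$-module $M$.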
***********
It may not be clear that every embedding arises in this way. What *is* clear is that any such embedding defines an $R$-bilinear map $S \times N \to M$ (namely the $S$-multiplication in $M$).
If $g:N \to M$ is an $R$-module homomorphism, the map:
$B'(s,n) = sg(n)$ has to be ($R$-) bilinear.
The defining property then gives us a homomorphism:
$\phi': S \otimes_R N \to M$ with:
$sg(n) = \phi'(s\otimes n) = \phi'(s(1\otimes n)) = s\phi'(1\otimes n)$
$= s\phi'(\iota(n))$. Taking $s = 1$ gives $g = \phi' \circ \iota$, so $\iota(n) = 0$ forces $g(n) = 0$; that is, $\text{ker }\iota \subseteq \text{ker }g$.
If the kernel of $g$ is trivial (that is, if $g$ is injective), the kernel of $\iota$ has to be trivial as well.
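Here is a sketch of how this plays out in a mixed case (again an example of my own choosing, with $R = \Bbb Z$ and $S = M = \Bbb Q$): take $N = \Bbb Z \oplus \Bbb Z_2$ and $g(a,b) = a$, so that $\text{ker }g = 0 \oplus \Bbb Z_2$. On the tensor side:
$1 \otimes (a,b) = 1\otimes(a,0) + 1\otimes(0,b) = a(1 \otimes (1,0)) + \frac{1}{2}\otimes(0,2b) = a(1\otimes(1,0))$
Since $\phi'(1\otimes(1,0)) = g(1,0) = 1 \neq 0$, the element $1\otimes(1,0)$ is nonzero, and because $S \otimes_R N$ is a $\Bbb Q$-vector space, $a(1\otimes(1,0)) = 0$ only when $a = 0$. So $\text{ker }\iota = 0 \oplus \Bbb Z_2 = \text{ker }g$: here the containment is an equality, and the largest quotient of $N$ that embeds in $M$ is $N/\text{ker }\iota \cong \Bbb Z$.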
By the same token, if instead we have an $R$-module homomorphism:
$g_{\ast}: N/P \to M$ which is injective, we can define (lift this to) a homomorphism:
$g: N \to M$ by letting $g(n) = g_{\ast}(n + P)$
What is $\text{ker }g$?
(here is a good exercise for you to do:
Let $f: \Bbb Z_m \to \Bbb Z_n$ be an abelian group homomorphism. Define an abelian group homomorphism
$g:\Bbb Z \to \Bbb Z_n$ using $f$. If you want a specific $m$ and $n$ use $m = 8, n= 12$).