Well, if you approach it from the perspective of vector spaces (as you probably are in your linalg course), then you usually define an inner product space to be a vector space V over \mathbb{R} with an operation ( , ): V\times V \longrightarrow \mathbb{R} such that \forall \ a, b, c \in V and x \in \mathbb{R},
1. (a, b) = (b, a)
2. (b + c, a) = (b, a) + (c, a)
3. (xa, b) = x(a, b)
4. (a, a) is positive definite, i.e. (a, a)\geq 0 with equality iff a = 0.
We then define
\|a\| = \sqrt{(a, a)}
which we call the magnitude or length or modulus of a.
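To make the axioms concrete, here is a minimal sketch that assumes the standard dot product on \mathbb{R}^2 as the inner product. The names `inner`, `norm`, `add` and `scale` and the sample vectors are made up for illustration. Checking a few samples is of course not a proof of properties 1 to 4.

```python
import math

def inner(a, b):
    """Standard dot product on R^n, one example of an inner product."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length induced by the inner product: sqrt((a, a))."""
    return math.sqrt(inner(a, a))

def add(a, b):
    """Componentwise vector sum."""
    return [x + y for x, y in zip(a, b)]

def scale(x, a):
    """Scalar multiple x * a."""
    return [x * y for y in a]

# Arbitrary integer samples, so the comparisons below are exact.
a, b, c, x = [3, 4], [1, -2], [0, 5], 7

symmetric = inner(a, b) == inner(b, a)                        # property 1
additive = inner(add(b, c), a) == inner(b, a) + inner(c, a)   # property 2
homogeneous = inner(scale(x, a), b) == x * inner(a, b)        # property 3
positive = inner(a, a) > 0 and inner([0, 0], [0, 0]) == 0     # property 4 on these samples

length_of_a = norm(a)  # sqrt(3^2 + 4^2)
```

Any other operation satisfying properties 1 to 4 could be dropped in for `inner` without changing `norm`.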
Now say we choose a, b such that \|a\| =\|b\| = 1. Then
(a-b, a-b) = (a, a) - 2(a, b) + (b, b) = \|a\|^2 - 2(a, b) + \|b\|^2 \geq 0
with the last inequality by 4., so
-2(a, b) \geq -\|a\|^2 - \|b\|^2 = -1^2 - 1^2 = -2 = -2\|a\|\|b\|
\Longrightarrow (a, b) \leq \|a\|\|b\|
and thus for any nonzero c, d we note that
\biggl\|\frac{c}{\|c\|}\biggr\| = \biggl\| \frac{d}{\|d\|} \biggr\| = 1
and so by the previous argument,
\frac{(c, d)}{\|c\|\|d\|} = \left(\frac{c}{\|c\|}, \frac{d}{\|d\|}\right) \leq \biggl\|\frac{c}{\|c\|}\biggr\|\biggl\|\frac{d}{\|d\|}\biggr\| = 1
\Longrightarrow (c, d) \leq \|c\| \|d\| \; \forall c, d \in V
(if c = 0 or d = 0, both sides are 0 and there is nothing to prove). Replacing c by -c gives -(c, d) \leq \|c\|\|d\| as well, so |(c, d)| \leq \|c\|\|d\|, which is the well-known <b>Cauchy-Schwarz inequality</b>.
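The argument above never uses a specific formula for ( , ), so the bound should hold for any inner product. As a rough sanity check, here is a sketch using a weighted inner product on \mathbb{R}^3. The weights, the names and the random samples are all assumptions made for illustration. It tests the bound in absolute value, which implies the inequality as stated.

```python
import math
import random

WEIGHTS = [2.0, 0.5, 3.0]  # hypothetical positive weights, chosen for illustration

def inner(c, d):
    """Weighted inner product sum(w_i * c_i * d_i); positive weights keep property 4."""
    return sum(w * x * y for w, x, y in zip(WEIGHTS, c, d))

def norm(c):
    """Length induced by the weighted inner product."""
    return math.sqrt(inner(c, c))

def cauchy_schwarz_holds(c, d, tol=1e-9):
    """Check |(c, d)| <= ||c|| ||d|| up to a small floating-point tolerance."""
    return abs(inner(c, d)) <= norm(c) * norm(d) + tol

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [
    ([rng.uniform(-10, 10) for _ in WEIGHTS],
     [rng.uniform(-10, 10) for _ in WEIGHTS])
    for _ in range(1000)
]
all_hold = all(cauchy_schwarz_holds(c, d) for c, d in samples)

# Equality case: d is a scalar multiple of c, so the gap should vanish.
c = [1.0, 2.0, -1.0]
d = [-3.0 * x for x in c]
gap = norm(c) * norm(d) - abs(inner(c, d))
```

The tolerance is only there because of rounding; the inequality itself is exact.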
From here, we very simply <i>define</i> \theta, for nonzero c and d, by
(c, d) = \|c\|\|d\| \cos{\theta}
This makes sense because Cauchy-Schwarz keeps (c, d)/(\|c\|\|d\|) between -1 and 1. It's just like any other definition you will ever encounter. You can then derive all the properties of the \cos function.
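For a numerical feel, here is a small sketch that solves the defining equation for \theta. It assumes the standard dot product, and it assumes we pick \theta in [0, \pi], which is what `math.acos` returns. The function names are made up for illustration.

```python
import math

def inner(c, d):
    """Standard dot product, used here as the inner product."""
    return sum(x * y for x, y in zip(c, d))

def angle(c, d):
    """Angle theta in [0, pi] with (c, d) = ||c|| ||d|| cos(theta); c, d must be nonzero."""
    ratio = inner(c, d) / math.sqrt(inner(c, c) * inner(d, d))
    # Cauchy-Schwarz puts ratio in [-1, 1]; clamp to absorb rounding error.
    return math.acos(max(-1.0, min(1.0, ratio)))

right = math.degrees(angle([1, 0], [0, 1]))       # perpendicular vectors
opposite = math.degrees(angle([2, 0], [-5, 0]))   # vectors pointing in opposite directions
```

With these choices the perpendicular pair comes out at a right angle and the opposite pair at a straight angle, matching the usual geometric picture.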