It's not too hard to prove it directly from the definition. The only thing that's tricky is the notation. The definition can be expressed as
\det A=\sum_P(-1)^P A_{1,P1}\cdots A_{n,Pn}
where the sum is over all permutations of the ordered set (1,2,...,n). The factor (-1)P is interpreted as +1 when the permutation is even, and -1 when the permutation is odd. Pk for k=1,2,...,n is interpreted as the number that k is mapped to by the permutation P.
We have
\det A^T=\sum_P(-1)^P (A^T)_{1,P1}\cdots (A^T)_{n,Pn}=\sum_P(-1)^P A_{P1,1}\cdots A_{Pn,n}
The idea is to prove that any given term from the right-hand side of the first equation occurs exactly once on the right-hand side of the second equation. This is sufficient to prove that the sums are equal because the number of terms is the same in both.
So let's write the given term as
(-1)^P A_{1,P1}\cdots A_{n,Pn}
and let's rearrange the order of the factors in all the terms of the second equation so that the column indices appear in the same order as in the given term from the first equation. For example, if Pn=3, we rearrange the factors of the term
(-1)^Q A_{Q1,1}\cdots A_{Qn,n}
(where Q is some permutation) so that the factor we write last is AQ3,3. Note that all factors are of the form AQm,m for some m. Our rearrangement makes sure that the column index in the kth factor is Pk, and that means that the kth factor is AQPk,Pk. So each term of the right-hand side of the second equation can be expressed as
(-1)^Q A_{QP1,P1}\cdots A_{QPn,Pn}
but the sum is over all permutations, so exactly one of those terms has Q=P-1, and that term is equal to
(-1)^{P^{-1}} A_{1,P1}\cdots A_{n,Pn}
P-1 is of course even if and only P is even, so this is equal to the given term we started with, and we're done.
You can prove all of the properties (that are mentioned in an introductory text) of the determinant directly from the definition, using the notation above, but the result \det AB=\det A \det B is of course much harder to prove than the others.