To summarize what I have learned, the basic tool in finite dimensional linear algebra and matrix theory is row and column operations.  This is a computational way of changing your space by isomorphisms until the structure of your operator or matrix has been simplified enough to be visibly understandable.
In particular, two matrices are row equivalent, or left equivalent, if and only if the linear operators they represent become equal after changing the target space by an isomorphism, if and only if they have the same kernel, if and only if the matrices have the same row space, iff one can be transformed into the other by row operations, or iff they become equal after left multiplication by an invertible matrix, and finally iff they have the same row reduced echelon form.
The reduced row echelon form of a matrix is a matrix whose rows consist of a particularly nice basis for the common row space of matrices in the equivalence class; this is the unique basis that projects to the standard basis of the coordinate subspace spanned by the “pivot” coordinates.  This provides a canonical representative for the left equivalence class.  Finding solutions to the equation AX=0, i.e. finding a basis of the kernel of A, is easily done by row reducing A, since the reduced form has the same kernel, and one which is more easily found.
Row reduced r by n echelon matrices of rank r allow one to decompose the Grassmannian manifold of all r dimensional subspaces of k^n into “n choose r” cells, where each cell corresponds to the location of the r pivot columns.  The largest cell is the one with the first r columns as pivots, hence the manifold has dimension r.(n-r), the number of free entries in such a row reduced matrix.  They also allow one to put local coordinate charts on this manifold if we relax the definition of reduced to allow each of the n-r  non pivot columns to have r entries, even if they are not the last n-r columns.  Unlike the cells, these charts of course overlap.
Two matrices are right equivalent iff they have the same column space, iff the linear operators they represent have the same image, iff they can be transformed into one another by column operations, iff they become equal after right multiplication by an invertible matrix, i.e. they become equal as linear operators after an isomorphism of the source space.  A canonical representative of this class is obtained by row reducing the transpose and then transposing it back.  This has as columns a natural basis of the column space, analogous to the case above.
Two matrices A,B are (2 - sided) equivalent iff they can be transformed into one another by a combination of row and column operations, iff they become equal after (possibly) different isomorphisms of both source and target space, iff B = QAP where Q,P are invertible, iff A,B have the same rank r.  A canonical representative for this class is the diagonal matrix whose first r diagonal entries are ones and the rest zeroes.
Two square matrices A,B are similar iff they become the same after performing conjugation by some invertible matrix, i.e. iff B = CAC^(-1) for some invertible C, i.e. they become equal as operators after performing a single isomorphism of the common source and target space.  This equivalence can be determined by row and column operations performed on the associated “characteristic matrix”.  If A is a square matrix, its associated characteristic matrix is the matrix [Id.X-A] with polynomial entries.  This matrix can be diagonalized by row and column operations in the ring of polynomials, using the Euclidean algorithm.  This can be done in a unique way so that the diagonal entries successively divide one another.  Two square matrices A,B of the same size, are similar iff their characteristic matrices have the same diagonalized form.  The non constant entries on the diagonal, which characterize the similarity class, are called the invariant factors of the (similarity class of the) matrix. Thus two n by n matrices are similar iff they have the same invariant factors.
If the invariant factors of A are f1,...,fn, then the linear operator represented by the original matrix is similar to the operation of multiplication by X on the product space k[X]/(f1) x ... x k[X]/(fn).  The matrix of that multiplication operator, in the standard bases {1, X, X^2,...} for these factor spaces, is called the rational canonical form of the original matrix A.
Since multiplication by X satisfies the minimal polynomial f on the factor space k[X]/(f), it follows that the largest of the invariant factors of A is the minimal polynomial of the matrix A.  In case one can factor this polynomial into irreducible factors over the field k, one can decompose the product decomposition further into a product of space of form k[X]/(g) where each polynomial g is a power of an irreducible factor of the minimal polynomial.  This decomposition then gives rise to the jordan canonical form, after a slight tweak of the usual choice of basis.  Since multiplication by X carries each basis vector in the standard basis {1,X,X^2,...} into the next one, except for the last, a decomposition into a product of spaces like k[X]/(f) is called a “cyclic” decomposition.  The rational canonical decomposition is the cyclic decomposition with the fewest number of factors, while the Jordan decomposition is the one with the largest number of factors.
The nicest jordan form occurs when the irreducible factors of the minimal polynomial are all linear, and all occur to the first power in the minimal polynomial.  Then the jordan form is diagonal. Even though one may not be able to compute this diagonal form, when working over the real number field this case always occurs when the original matrix A equals its transpose.  Moreover in this nice case, the basis vectors making the matrix diagonal can even be chosen as mutually orthogonal, which is nice for doing geometry.
One can deduce from all this that the characteristic polynomial of A, which equals det[Id.X-A], is the product of the invariant factors of A, and its constant term is the determinant of A, and that this term is non zero if and only if A is invertible.  One can actually compute the inverse of A by row reducing the matrix [A , Id].
that’s all folks.  I guess the main difference between my old and my new point of view is that I like to focus now more on actually computable techniques, rather than the ideally simplest types of matrices (diagonal) which are impractical to compute,