Here is the thing I want you to remember, and take to heart:
A similarity transform, and a change-of-basis, are the same thing (essentially).
We have two ways of looking at this: the linear transformation view, and the matrix view. One is "abstract", and one is "concrete".
Suppose $\rho \in \text{Hom}_{\ F}(V,V)$ is an isomorphism (or linear automorphism).
This means it is invertible, that is, there exists $\rho^{-1} \in \text{Hom}_{\ F}(V,V)$ such that:
$\rho \circ \rho^{-1} = \rho^{-1} \circ \rho = 1_V$
Now suppose $\phi \in \text{Hom}_{\ F}(V,V)$ is any linear endomorphism. Clearly:
$\rho^{-1}\phi\rho$ is also a linear endomorphism. What might this do?
Some things you will have to prove, before you are fully prepared to really comprehend this:
1) $\phi \in \text{Hom}_{\ F}(V,V)$ is injective if and only if for every linearly independent subset $S \subseteq V,\ \phi(S)$ is linearly independent.
2) $\phi \in \text{Hom}_{\ F}(V,V)$ is surjective if and only if for every set $T$ with $\text{span}(T) = V$, we have that $\text{span}(\phi(T)) = V$ as well.
Taken together, these two statements imply:
3) $\phi \in \text{Hom}_{\ F}(V,V)$ is an isomorphism (of vector spaces) if and only if $\phi$ maps a basis to a basis.
Note that these conditions reduce "total behavior" of $\phi$ (on all of $V$) to behavior on certain kinds of subsets (which in most of the simpler cases, are FINITE). So what we have is "labor-saving criteria". We can test for injectivity, surjectivity, or bijectivity on certain "well-chosen" subsets.
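These criteria are also easy to test mechanically. As a small illustration (the matrix $A$ below is an arbitrary choice, purely for demonstration, and I am identifying the map with its matrix in the standard basis), here is a minimal numpy sketch of criterion (3): apply the map to the standard basis and check whether the images are still linearly independent.

```python
import numpy as np

# An arbitrary illustrative endomorphism of R^2, as a matrix in the standard basis.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Images of the standard basis vectors e1, e2 (these are just the columns of A).
images = np.column_stack([A @ np.array([1.0, 0.0]),
                          A @ np.array([0.0, 1.0])])

# Criterion (3): the map is an isomorphism iff it sends a basis to a basis,
# i.e. iff the image vectors are linearly independent.
print(np.linalg.matrix_rank(images) == 2)   # True here, since det(A) = -2 != 0
```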
It's condition (3) that matters, here. Essentially, $\rho$ replaces one basis with another. Then $\phi$ "does its thing" (whatever linear transform it does), on the "new basis". Finally, $\rho^{-1}$ "returns us to our original basis". So, in a sense, $\phi$ and $\rho^{-1}\phi\rho$ represent "the same transformation" (hence the name "similar"), just "different bases" (which can be thought of, naively, as "coordinate systems" or "terminologies", since an isomorphism is essentially a "re-naming scheme").
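Before the worked example below, here is a minimal numpy sketch of this "same transformation, different names" idea (the matrices are random, purely for illustration): conjugating $\phi$ by $\rho$ gives a map that does to the "renamed" vector $\rho^{-1}x$ exactly what $\phi$ does to $x$, up to the renaming.

```python
import numpy as np

rng = np.random.default_rng(0)
phi = rng.standard_normal((3, 3))      # any endomorphism of R^3, as a matrix
rho = rng.standard_normal((3, 3))      # a random matrix; almost surely invertible
rho_inv = np.linalg.inv(rho)

x = rng.standard_normal(3)             # an arbitrary vector

# (rho^{-1} phi rho) applied to the "renamed" vector rho^{-1} x
lhs = (rho_inv @ phi @ rho) @ (rho_inv @ x)
# ...equals the "renamed" image rho^{-1}(phi(x))
rhs = rho_inv @ (phi @ x)
print(np.allclose(lhs, rhs))           # True
```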
Now let's look at the concrete side of things. Realize that, abstractly, "vectors are vectors", they don't care how we label them. When we attach NUMBERS (that is, field element entries) to a vector, what those numbers MEAN is "up to us". Imagine the Euclidean plane as a blank piece of paper, with just the one dot marking the 0-vector (0,0). The coordinate axes we draw, and the unit lengths we assign on them, are OUR CHOICES, they don't come with the space. We USUALLY draw them perpendicular, and "scaled the same", but this is a bit arbitrary, on our part.
As a pair, (2,3) is just a pair of numbers. As a VECTOR, we usually mean:
$(2,3) = 2v_1 + 3v_2$, where $\{v_1,v_2\}$ is a basis. WE HAVE TO SAY what $v_1,v_2$ ARE.
For example, in the polynomial space $P_1(\Bbb R) = \{a_0 + a_1t: a_0,a_1 \in \Bbb R\}$, if our basis is $\{1,t\}$, then:
$(2,3)$ means $2 + 3t$.
If our basis is $\{1-t,1+t\}$, then $(2,3) = 2(1-t) + 3(1+t) = 2 - 2t + 3 + 3t = 5 + t$, which is a different polynomial.
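Here is a tiny numpy sketch of this (I am representing the polynomial $a_0 + a_1t$ by the coefficient array $[a_0, a_1]$, which is just a bookkeeping device for the illustration): the same coordinate pair $(2,3)$ names different polynomials relative to the two bases.

```python
import numpy as np

# Represent a_0 + a_1*t by the array [a_0, a_1].
basis_1 = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]   # the basis {1, t}
basis_2 = [np.array([1.0, -1.0]), np.array([1.0, 1.0])]  # the basis {1 - t, 1 + t}

c1, c2 = 2.0, 3.0   # the coordinate pair (2, 3)

print(c1 * basis_1[0] + c2 * basis_1[1])   # [2. 3.]  -> the polynomial 2 + 3t
print(c1 * basis_2[0] + c2 * basis_2[1])   # [5. 1.]  -> the polynomial 5 + t
```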
A matrix in one basis, may have a totally different "appearance" in another basis. For example, it may be upper-triangular in one basis (everything below the main diagonal is 0), and not so in a different basis. Some bases may be easier to work with than others, depending on what kinds of calculations we are doing.
I am going to give you an example of how this works. Study it well.
Suppose we have the linear transformation $T:\Bbb R^2 \to \Bbb R^2$ given by:
$T(x,y) = (2x+4y,3x+3y)$
In the basis $\mathcal{B} = \{(1,0),(0,1)\} = \{e_1,e_2\}$ (this is called the standard basis), we have the matrix:
$[T]_{\mathcal{B}}^{\mathcal{B}} = \begin{bmatrix}2&4\\3&3 \end{bmatrix}$ (verify this!).
Note that the first column of this matrix is $[T(e_1)]_{\mathcal{B}}$, and the second column is $[T(e_2)]_{\mathcal{B}}$. This is no accident, the way the standard basis vectors (expressed IN that basis) "pick out columns" is a function of how matrix multiplication works (if we "hit them on the other side", as row-vectors, they "pick out rows").
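If you want to verify the matrix mechanically, here is a short numpy sketch: apply $T$ to $e_1$ and $e_2$ and stack the results as columns.

```python
import numpy as np

def T(v):
    x, y = v
    return np.array([2*x + 4*y, 3*x + 3*y])

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# The columns of [T]_B are T(e1) and T(e2), written in the standard basis.
T_B = np.column_stack([T(e1), T(e2)])
print(T_B)   # [[2. 4.]
             #  [3. 3.]]
```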
Now $\mathcal{E} = \{(-4,3),(1,1)\} = \{v_1,v_2\}$ is ALSO a basis for $\Bbb R^2$:
It is linearly independent:
If $c_1v_1 + c_2 v_2 = 0$ that is, if: $c_1(-4,3) + c_2(1,1) = (0,0)$, so that:
$c_2 - 4c_1 = 0$
$3c_1 + c_2 = 0$
Then, negating the first equation and adding the second, $4c_1 - c_2 + 3c_1 + c_2 = 0 + 0 = 0$, that is: $7c_1 = 0\implies c_1 = 0$, and the first equation then forces $c_2 = 4c_1 = 0$ as well, so $\{v_1,v_2\}$ is linearly independent. What this means is, neither $v_1$ nor $v_2$ (the possible non-empty proper subsets of $\mathcal{E}$) is expressible in terms of the other: we need them BOTH to describe linear combinations of the two.
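Equivalently (this is a computational shortcut, not a replacement for the argument above): two vectors in $\Bbb R^2$ are linearly independent exactly when the matrix having them as columns has nonzero determinant. A quick numpy check:

```python
import numpy as np

v1, v2 = np.array([-4.0, 3.0]), np.array([1.0, 1.0])

# Nonzero determinant <=> the columns are linearly independent.
det = np.linalg.det(np.column_stack([v1, v2]))
print(det)               # -7.0 (up to floating-point error)
print(abs(det) > 1e-12)  # True: {v1, v2} is linearly independent
```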
It spans $\Bbb R^2$:
Given $(a,b) \in \Bbb R^2$ we have:
$(a,b) = \frac{1}{7}(7a,7b) = \frac{1}{7}[(4a-4b,3b-3a) + (3a+4b,3a+4b)]$
$= \dfrac{b-a}{7}(-4,3) + \dfrac{3a+4b}{7}(1,1)$
so any point in $\Bbb R^2$ is expressible as a linear combination of $v_1,v_2$.
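Those coefficients $\dfrac{b-a}{7}$ and $\dfrac{3a+4b}{7}$ can also be found by solving a linear system, since finding the $\mathcal{E}$-coordinates of $(a,b)$ means solving $c_1v_1 + c_2v_2 = (a,b)$. A numpy sketch with an arbitrary test point:

```python
import numpy as np

v1, v2 = np.array([-4.0, 3.0]), np.array([1.0, 1.0])
E = np.column_stack([v1, v2])

a, b = 5.0, -2.0                          # an arbitrary test point
c = np.linalg.solve(E, np.array([a, b]))  # its E-coordinates

print(c)                                               # [-1.  1.]
print(np.allclose(c, [(b - a)/7, (3*a + 4*b)/7]))      # True: matches the formula
```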
Why would we choose to use such an unusual basis?
Let us calculate the matrix $[T]_{\mathcal{E}}^{\mathcal{E}}$. We will do this 2 ways.
First, we calculate $T(v_1),T(v_2)$.
$T(v_1) = T((-4,3)) = (2(-4) + 4(3),3(-4) + 3(3)) = (4,-3) = -v_1$.
In the basis $\mathcal{E}$ this is the linear combination:
$(-1)v_1 + 0v_2$ so $[T(v_1)]_{\mathcal{E}} = (-1,0)$.
$T(v_2) = T((1,1)) = (2(1) + 4(1),3(1) + 3(1)) = (6,6) = 6v_2$. So $[T(v_2)]_{\mathcal{E}} = (0,6)$ and by our definition of $[T]_{\mathcal{E}}^{\mathcal{E}}$ we have:
$[T]_{\mathcal{E}}^{\mathcal{E}} = \begin{bmatrix}-1&0\\0&6\end{bmatrix}$.
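In code, this first computation amounts to checking that $T(v_1) = -v_1$ and $T(v_2) = 6v_2$:

```python
import numpy as np

def T(v):
    x, y = v
    return np.array([2*x + 4*y, 3*x + 3*y])

v1, v2 = np.array([-4.0, 3.0]), np.array([1.0, 1.0])

print(np.allclose(T(v1), -1 * v1))   # True: T(v1) = -v1
print(np.allclose(T(v2),  6 * v2))   # True: T(v2) = 6*v2
# So [T]_E is the diagonal matrix [[-1, 0], [0, 6]].
```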
Next, we take "the long way around". First, we need to find a matrix $P$ that takes $\mathcal{E}$-coordinates to $\mathcal{B}$-coordinates. Such a matrix would take $[v_1]_{\mathcal{E}}$ to $[v_1]_{\mathcal{B}}$, that is, when multiplied by $(1,0)^T$, it would yield $(-4,3)^T$, and similarly, for $v_2$, it would take $(0,1)^T$ to $(1,1)^T$.
It doesn't take much thought to see that this matrix is:
$P = \begin{bmatrix}-4&1\\3&1 \end{bmatrix}$.
Having applied $P$ to our $\mathcal{E}$-coordinates, we are now in $\mathcal{B}$-coordinates, and may just multiply by our "old matrix" for $T$, to get $T(v)$ in $\mathcal{B}$-coordinates:
$[T]_{\mathcal{B}}^{\mathcal{B}}P([v]_{\mathcal{E}}) = [T]_{\mathcal{B}}^{\mathcal{B}}([v]_{\mathcal{B}}) = [T(v)]_{\mathcal{B}}$
Now the inverse coordinate transformation matrix is just going to be the inverse matrix $P^{-1}$ (why?). This is:
$P^{-1} = \frac{-1}{7}\begin{bmatrix}1&-1\\-3&-4 \end{bmatrix}$, and we have:
$P^{-1}[T]_{\mathcal{B}}^{\mathcal{B}}P([v]_\mathcal{E}) = P^{-1}([T(v)]_{\mathcal{B}}) = [T(v)]_{\mathcal{E}}$
that is:
$P^{-1}[T]_{\mathcal{B}}^{\mathcal{B}}P = [T]_{\mathcal{E}}^{\mathcal{E}}$
since that IS the matrix which takes $[v]_{\mathcal{E}}$ to $[T(v)]_{\mathcal{E}}$
Seeing is believing:
$P^{-1}[T]_{\mathcal{B}}^{\mathcal{B}}P = \frac{-1}{7}\begin{bmatrix}1&-1\\-3&-4 \end{bmatrix}\begin{bmatrix}2&4\\3&3 \end{bmatrix}\begin{bmatrix}-4&1\\3&1 \end{bmatrix}$
$= \frac{-1}{7}\begin{bmatrix}1&-1\\-3&-4 \end{bmatrix}\begin{bmatrix}4&6\\-3&6 \end{bmatrix}$
$=\frac{-1}{7}\begin{bmatrix}7&0\\0&-42\end{bmatrix} = \begin{bmatrix}-1&0\\0&6\end{bmatrix}$
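Or, letting numpy do the multiplication (using np.linalg.inv for $P^{-1}$, just as a convenience):

```python
import numpy as np

T_B = np.array([[2.0, 4.0],
                [3.0, 3.0]])
P = np.array([[-4.0, 1.0],
              [ 3.0, 1.0]])

# The change-of-basis conjugation: [T]_E = P^{-1} [T]_B P
T_E = np.linalg.inv(P) @ T_B @ P
print(np.round(T_E, 10))   # [[-1.  0.]
                           #  [ 0.  6.]]
```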
In this "unusual basis" we see that what $T$ does, is change the sign of the $v_1$ coordinate, and magnify the $v_2$ coordinate by a factor of $6$, that is, it is the composition of an axis flip, and an axis stretch, something that is not at all apparent when using the "standard axes".