Sorry, I was being a bit terse; I'll try to explain what I'm doing more fully.
We start with a set of two by two matrices satisfying the algebra (I'm not going to require them to be hermitian, but you could). What we want to prove is that there is a similarity transformation S (again, not required to be unitary) such that S\alpha_i S^{-1} are in the standard form of the Pauli matrices. The vital point is that the algebraic relations are unchanged under similarity transformations.
The way we do this is by slowly building up the Pauli matrices one at a time, and constructing S as a product.
At the first stage, we pick a similarity transform S_1 such that S_1\alpha_3 S_1^{-1} is the diagonal Pauli matrix \sigma_3. (This is possible because \alpha_3 squares to the identity but is not \pm the identity, so it has eigenvalues +1 and -1.) This choice is not unique: we can follow S_1 with any S_2 such that S_2\sigma_3 S_2^{-1}=\sigma_3.
Now at the second stage, we want to put a second matrix into standard form. But any further similarity transformation we do must not disturb the matrix we have already constructed, so it must be of the form S_2, i.e. it must commute with \sigma_3. This is what I meant when I said 'residual transformation'.
Specifically, we find that S_2 must be diagonal. Since the overall scale of the similarity transformation is irrelevant (it cancels between S_2 and S_2^{-1}), we can pick it to have the form \mathrm{diag}(\lambda,1) for some nonzero \lambda. The algebra, as you have said, constrains \alpha_1 to be of the form
\begin{pmatrix}
0 & a \\
a^{-1} & 0
\end{pmatrix}
and a direct calculation shows that applying the similarity transformation S_2 to this just rescales a\to\lambda a. So pick \lambda=a^{-1} and you get the Pauli matrix \sigma_1.
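In case it helps, here are the two small calculations behind this step written out. The entries p, q, r, s of a general S_2 are just my labels for illustration; everything else is as above.

```latex
% Commuting with \sigma_3 forces S_2 to be diagonal:
\[
\begin{pmatrix} p & q \\ r & s \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
-
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} p & q \\ r & s \end{pmatrix}
=
\begin{pmatrix} 0 & -2q \\ 2r & 0 \end{pmatrix}
= 0
\quad\Longrightarrow\quad q = r = 0 .
\]

% Conjugating by S_2 = \mathrm{diag}(\lambda, 1) rescales a \to \lambda a:
\[
\begin{pmatrix} \lambda & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & a \\ a^{-1} & 0 \end{pmatrix}
\begin{pmatrix} \lambda^{-1} & 0 \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 0 & \lambda a \\ (\lambda a)^{-1} & 0 \end{pmatrix} .
\]
```

Setting \lambda=a^{-1} in the second line makes both off-diagonal entries equal to 1.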
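Now the only similarity transformations that fix both the matrices we have constructed are proportional to the identity, so we have no freedom left. But the algebra constrains the third matrix to be \sigma_2, so we've proved what we set out to. (If the algebra consists of the anticommutation relations alone, this holds only up to an overall sign; a product relation such as \alpha_1\alpha_2=i\alpha_3 fixes it.) The full transformation is S=S_2S_1.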
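If you want to see the construction run on numbers, here is a rough sketch in plain Python. The helper names (`eigvec`, `standardize`) and the test matrix `T` are my own choices for illustration, and I am assuming the \alpha_i are obtained by conjugating the Pauli matrices with an arbitrary invertible, non-unitary `T`, so they satisfy the full algebra but are not hermitian.

```python
# 2x2 complex matrices as nested tuples; pure Python, no external libraries.

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def inv(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def conj(S, A):
    """Similarity transformation S A S^{-1}."""
    return mul(mul(S, A), inv(S))

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(2) for j in range(2))

I2 = ((1, 0), (0, 1))
SIGMA1 = ((0, 1), (1, 0))
SIGMA2 = ((0, -1j), (1j, 0))
SIGMA3 = ((1, 0), (0, -1))

def eigvec(A, ev):
    """Eigenvector of A (assuming A^2 = 1) for eigenvalue ev = +1 or -1.
    Any nonzero column of A + ev*1 works, since (A - ev)(A + ev) = A^2 - 1 = 0."""
    M = tuple(tuple(A[i][j] + ev * I2[i][j] for j in range(2)) for i in range(2))
    cols = [(M[0][j], M[1][j]) for j in range(2)]
    return max(cols, key=lambda v: abs(v[0]) + abs(v[1]))

def standardize(alpha1, alpha3):
    """Build S = S_2 S_1 following the two stages described in the post."""
    # Stage 1: S_1 diagonalizes alpha3 to sigma3.
    vp, vm = eigvec(alpha3, +1), eigvec(alpha3, -1)
    P = ((vp[0], vm[0]), (vp[1], vm[1]))  # columns are the eigenvectors
    S1 = inv(P)
    # Stage 2: alpha1 is now off-diagonal with entries a and 1/a;
    # the residual transformation diag(1/a, 1) rescales a to 1.
    a = conj(S1, alpha1)[0][1]
    S2 = ((1 / a, 0), (0, 1))
    return mul(S2, S1)

# Hypothetical test data: an arbitrary invertible, non-unitary T.
T = ((1, 2), (3j, 1))
alphas = [conj(T, s) for s in (SIGMA1, SIGMA2, SIGMA3)]
S = standardize(alphas[0], alphas[2])
results = [conj(S, al) for al in alphas]
assert all(close(r, s) for r, s in zip(results, (SIGMA1, SIGMA2, SIGMA3)))
```

Note that `standardize` never looks at \alpha_2: once the first two matrices are in standard form, the third comes out right on its own, which is the point of the last step.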