The $i$-th coordinate of $Bx$ is:
$\sum\limits_j b_{ij}x_j$ <----this sum is a scalar, so $Bx$ has one scalar entry for each value of $i$.
Notice that for this to even make sense, the "$j$"s must match up: $B$ must have as many COLUMNS as $x$ has ROWS ($x$ is taken to be a column-vector, so, considered as a matrix, it has only 1 column).
Now we "hit this with $A$". We write $A = (a_{ki})$, since, again, for this to be well-defined, $A$ must have as many columns as $Bx$ has entries (rows), so the column index of $A$ runs over the same values of $i$.
Now, $A(Bx)$ has one entry for each value of $k$, and the $k$-th entry is:
$[A(Bx)]_k = \sum\limits_i a_{ki}(Bx)_i = \sum\limits_i a_{ki}\left(\sum\limits_j b_{ij}x_j \right)$.
Using the distributive law, we can rewrite the above as:
$= \sum\limits_i \sum\limits_j a_{ki}b_{ij}x_j$
and swapping the order of the two (finite) sums, then using the distributive law AGAIN to factor out $x_j$, for each $j$:
$= \sum\limits_j \left(\sum\limits_i a_{ki}b_{ij}\right)x_j$
Now, the stuff in the parentheses is the $k,j$-th entry of $AB$ (by the definition of matrix multiplication), so we have:
$[A(Bx)]_k = [(AB)x]_k$, for each $k$.
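The index formulas above can be checked numerically. The following is only a minimal sketch in plain Python: the helper names `mat_vec` and `mat_mul` and the sample numbers are my own choices for illustration, and matrices are assumed to be stored as lists of rows.

```python
def mat_vec(M, v):
    """Return Mv: entry r is the sum over c of M[r][c] * v[c]."""
    # The columns of M must match the entries (rows) of v.
    assert all(len(row) == len(v) for row in M)
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


def mat_mul(A, B):
    """Return AB: entry (k, j) is the sum over i of A[k][i] * B[i][j]."""
    # The columns of A must match the rows of B.
    assert all(len(row) == len(B) for row in A)
    return [
        [sum(A[k][i] * B[i][j] for i in range(len(B))) for j in range(len(B[0]))]
        for k in range(len(A))
    ]


# Illustrative values only (not taken from the text above).
A = [[1, 2], [3, 4], [5, 6]]  # 3x2
B = [[0, 1], [2, 3]]          # 2x2
x = [7, -1]                   # 2-vector

left = mat_vec(A, mat_vec(B, x))   # A(Bx): apply B first, then A
right = mat_vec(mat_mul(A, B), x)  # (AB)x: form AB first, then apply it
print(left, right)                 # both are [21, 41, 61]
```

A check like this on one example is of course not a proof; it only illustrates the identity that the index computation above establishes in general.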
It is easier to see what is going on with specific sizes for the ranges of $k,i,j$: so let's say $A$ is a $3 \times 2$ matrix, and $B$ is a $2 \times 2$ matrix.
So we start with a 2-vector: $(x_1,x_2)$.
Then $Bx = (b_{11}x_1 + b_{12}x_2, b_{21}x_1 + b_{22}x_2)$.<---a different 2-vector now.
Now $A(Bx) =
(a_{11}(b_{11}x_1 + b_{12}x_2) + a_{12}(b_{21}x_1 + b_{22}x_2),
a_{21}(b_{11}x_1 + b_{12}x_2) +a_{22}(b_{21}x_1 + b_{22}x_2),
a_{31}(b_{11}x_1 + b_{12}x_2) + a_{32}(b_{21}x_1 + b_{22}x_2))$
$=((a_{11}b_{11} + a_{12}b_{21})x_1 + (a_{11}b_{12} + a_{12}b_{22})x_2,
(a_{21}b_{11} + a_{22}b_{21})x_1 + (a_{21}b_{12} + a_{22}b_{22})x_2,
(a_{31}b_{11} + a_{32}b_{21})x_1 + (a_{31}b_{12} + a_{32}b_{22})x_2)$
Note how all the "grouped terms" are entries of the matrix $AB$, for example:
$a_{11}b_{11} + a_{12}b_{21}$ is the $1,1$-entry of $AB$, and $a_{11}b_{12} + a_{12}b_{22}$ is the $1,2$-entry of $AB$, and so on,
so the vector above is exactly $(AB)x$.
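To make this concrete, here is a small numeric check; the particular numbers are my own, chosen only for illustration. Take
$A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \\ 3 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$, and $x = (1,2)$.
Then $Bx = (2\cdot 1 + 1\cdot 2,\ 1\cdot 1 + 3\cdot 2) = (4,7)$, and
$A(Bx) = (1\cdot 4 + 2\cdot 7,\ 0\cdot 4 + 1\cdot 7,\ 3\cdot 4 + 1\cdot 7) = (18,7,19)$.
On the other hand, $AB = \begin{pmatrix} 4 & 7 \\ 1 & 3 \\ 7 & 6 \end{pmatrix}$, so
$(AB)x = (4\cdot 1 + 7\cdot 2,\ 1\cdot 1 + 3\cdot 2,\ 7\cdot 1 + 6\cdot 2) = (18,7,19)$, which is the same vector.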