If you look at section 4 of the 1905 paper, he specifically assumes a clock which is at rest in the inertial frame k, and which elapses a time of tau in that frame. So, we are considering two events on the clock's worldline which have a spatial coordinate separation of 0 in the k frame (earlier he referred to the first position coordinate of this frame with the Greek letter xi, not the Roman letter x) and a time separation of tau in that frame, and then figuring out how this relates to the time separation t between the same pair of events in a different frame. x=vt is an equation of motion for the clock that's supposed to apply in the separate "stationary" frame K; you can tell it's a different frame because it uses t instead of tau for the time coordinate.
The point is, you can only use the time dilation equation when you are considering two events that have a spatial separation of zero in one of the two frames you're considering; if the first spatial coordinates of the two frames are referred to as xi and x, then the separation that is zero can be either the one in x or the one in xi, it doesn't matter (likewise if the coordinates are x and x', then it can be either the separation in x' or the separation in x that is zero). Whichever frame is the one where the two events are colocal, the time dilation equation always takes the form:
t_noncolocal = t_colocal / sqrt(1 - v^2/c^2)
or equivalently:
t_colocal = t_noncolocal * sqrt(1 - v^2/c^2)
The second form of the equation is the one that appears in section 4 of Einstein's 1905 paper, with tau as the time separation in the frame where the events are colocal and t as the time separation in the frame where they aren't.
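If it helps to see this with numbers, here is a quick Python check. It is my own sketch, not anything from the paper: the function names, the choice of v = 0.6c, and the sample event coordinates are all just assumptions for illustration. It takes two events that are colocal in the k frame (both at xi = 0), transforms them to the K frame with the standard Lorentz transformation, and confirms that the time separations obey the equation above and that the clock follows x = vt in K. It then does the same for a pair of events that are not colocal in either frame, where the time dilation equation no longer holds.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    """Lorentz factor 1/sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def k_to_K(xi, tau, v):
    """Map an event's (xi, tau) in the moving frame k to (x, t) in the
    stationary frame K, where k moves at +v along K's x axis.
    (Hypothetical helper name; this is the standard inverse Lorentz transform.)"""
    g = gamma(v)
    x = g * (xi + v * tau)
    t = g * (tau + v * xi / C ** 2)
    return x, t


v = 0.6 * C  # assumed relative velocity for the example

# Case 1: two events on the worldline of a clock at rest at xi = 0 in k,
# separated by tau = 1 second in k (colocal in k).
x1, t1 = k_to_K(0.0, 0.0, v)
x2, t2 = k_to_K(0.0, 1.0, v)
t_colocal = 1.0
t_noncolocal = t2 - t1
dx = x2 - x1

print("colocal pair: t in K =", t_noncolocal)
print("t * sqrt(1 - v^2/c^2) =", t_noncolocal * math.sqrt(1 - (v / C) ** 2))
print("dx in K =", dx, " v*t =", v * t_noncolocal)

# Case 2: two events that are NOT colocal in k (xi differs by 1e8 m)
# and not colocal in K either, with the same tau separation of 1 second.
x3, t3 = k_to_K(0.0, 0.0, v)
x4, t4 = k_to_K(1.0e8, 1.0, v)
t_other = t4 - t3

print("non-colocal pair: t in K =", t_other)
print("naive tau/sqrt(1 - v^2/c^2) =", 1.0 * gamma(v))
```

In the first case the K-frame time separation comes out as tau/sqrt(1 - v^2/c^2) and the K-frame position separation equals v times that time, which is just x = vt. In the second case the two numbers printed at the end disagree, because the extra v*xi/c^2 term in the transformation does not vanish when the events are separated in xi.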