You have to distinguish between the solution of the discrete equations u_d and the exact solution of the continuous equations u.
When using an iterative solver, you start with an estimate of u_d and you update this estimate at every iteration, let's call this u_d^{i}. So at every iteration, you compute a new estimate of your discrete solution, and the error is e_d = |u_d^{i} - u_d|. The residual is what you get when you substitute your current estimate u_d^{i} into your discretized PDE. If your numerical method converges, your residual goes to zero. The error and the residual are related: |e| < C\cdot|R|, and a lot of study has gone into getting accurate estimates of this constant C. So when the residual is zero, you can be pretty sure that you have found the solution of the discretized PDE (within machine precision). What remains is the error e = |u_d - u|. In general, you do not know this error, although we do know that the error goes to zero when the mesh element size goes to zero, and we also know that the rate of convergence is equal to the order of the numerical method. What you can do is compute a second solution on a finer mesh and compare this to the first solution.
You now have 2 solutions, one on the coarse mesh 1: u_{d_1} on the fine mesh 2: u_{d_2}. An important observation is that if you interpolate the solution of mesh 1 (with residual R=0) to the finer mesh 2 and recompute the residual, the residual on the new mesh is not zero anymore.
If you have 3 meshes with 3 solutions, you can use Richardson extrapolation to get an estimate of the error.