The point is that classical electrodynamics is a coarse-grained version of the full quantum description of macroscopic phenomena. The latter is very complicated and very often unnecessary. Thus we make some idealizing assumptions, simplifying the description of matter to its macroscopic essence, so to speak.
In the process of this idealization, singular charge-density and current-density distributions are used. Let's stick to electrostatics. The Maxwell equations reduce to
$$\vec{\nabla} \times \vec{E}=0, \quad \vec{\nabla} \cdot \vec{D}=\rho, \quad \vec{D}=\epsilon \vec{E},$$
where the medium is described by one scalar function ##\epsilon=\epsilon(\vec{x})##, the dielectric constant at each point of space, whether filled with a medium or with vacuum (where there is no medium and ##\epsilon=1## in the Heaviside-Lorentz units used here).
The charge density ##\rho## describes the charges treated as "external", i.e., not bound in the medium. The response of the medium itself is absorbed into the constitutive equation via the dielectric function. This density can be a continuous function, giving the charge per volume around each point of space.
Now, if you have surfaces between two media, or between a medium and the vacuum, another kind of charge density occurs. You describe the surface as smooth, and then there can be surface charges. Using a local coordinate system around a point, i.e., with the ##x,y## axes in the tangent plane of the surface and the ##z## axis perpendicular to the surface, you describe the surface-charge distribution by a function ##\sigma=\sigma(x,y)## along the surface. It gives the charge per area at this point. You consider this charge as sharply localized on the surface, i.e., a little distance away (no matter whether inside or outside the medium) you assume the charge density to be 0. The only way to describe this is to use a Dirac ##\delta## distribution:
$$\rho(\vec{x})=\sigma(x,y) \delta(z).$$
Note that the dimensions are also correct: ##\sigma## has the dimension charge/area and the ##\delta## distribution the dimension 1/length, so the singular ##\rho## describing a surface-charge distribution has the correct dimension charge/volume.
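To make the idealization a bit more concrete, here is a minimal numerical sketch. It assumes (my choice, not part of the argument above) that the thin charged layer is modeled by a normalized Gaussian in ##z##, which tends to ##\delta(z)## as its width goes to zero. The value of `sigma` and the widths are hypothetical. The integral of ##\rho## over ##z## should give ##\sigma## no matter how thin the layer is.

```python
import math

def gaussian_delta(z, width):
    """Normalized Gaussian of standard deviation `width`; tends to delta(z) as width -> 0."""
    return math.exp(-0.5 * (z / width) ** 2) / (width * math.sqrt(2.0 * math.pi))

def charge_per_area(sigma, width, half_range=10.0, steps=20001):
    """Integrate rho = sigma * delta_width(z) over z with the trapezoidal rule."""
    z_max = half_range * width          # integrate over +-10 standard deviations
    dz = 2.0 * z_max / (steps - 1)
    total = 0.0
    for i in range(steps):
        z = -z_max + i * dz
        weight = 0.5 if i in (0, steps - 1) else 1.0
        total += weight * sigma * gaussian_delta(z, width) * dz
    return total

sigma = 2.5  # hypothetical surface charge density (charge per area)
results = {width: charge_per_area(sigma, width) for width in (1.0, 0.1, 0.001)}
for width, value in results.items():
    peak = sigma * gaussian_delta(0.0, width)
    print(f"width={width:g}: charge/area={value:.6f}, peak density={peak:.3f}")
```

The peak volume density grows like 1/width while the integrated charge per area stays fixed, which is just the dimensional statement above in numbers.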
You can also consider line distributions. These occur when you idealize thin objects, such as wires, as simple lines, along which you describe the charge distribution by a density ##\lambda##. Let's use local coordinates around a point such that the ##x,y## axes span the plane perpendicular to the wire and the ##z## axis is tangent to the wire at this point. Then ##\lambda=\lambda(z)##, and the charge density is described by the even "more singular" expression
$$\rho(\vec{x})=\lambda(z) \delta(x) \delta(y).$$
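As a hedged illustration of what the two ##\delta## factors buy you: they collapse the volume integral over Coulomb contributions to a one-dimensional integral along the wire. The sketch below assumes a straight wire on the ##z## axis with constant ##\lambda##, vacuum (##\epsilon=1##), and Heaviside-Lorentz units. The function names and numbers are my own, chosen for illustration only. It compares the numerically summed radial field at distance `s` with the closed-form result for a finite segment and with the infinite-wire value ##\lambda/(2\pi s)##.

```python
import math

def radial_field_of_segment(lam, s, half_length, n=200000):
    """Midpoint-rule sum of Coulomb contributions from a uniform line charge on the z axis.

    Heaviside-Lorentz units: dE_s = lam * s * dz / (4 pi (s^2 + z^2)^(3/2)).
    """
    h = 2.0 * half_length / n
    total = 0.0
    for i in range(n):
        z = -half_length + (i + 0.5) * h
        total += s / (s * s + z * z) ** 1.5 * h
    return lam * total / (4.0 * math.pi)

def radial_field_closed_form(lam, s, half_length):
    """Analytic value of the same integral for a segment from -half_length to +half_length."""
    return lam * half_length / (2.0 * math.pi * s * math.hypot(s, half_length))

lam, s = 3.0, 1.0  # hypothetical line density and distance from the wire
numeric = radial_field_of_segment(lam, s, half_length=100.0)
exact_segment = radial_field_closed_form(lam, s, half_length=100.0)
infinite_wire = lam / (2.0 * math.pi * s)
print(numeric, exact_segment, infinite_wire)
```

For a segment much longer than the distance `s`, the numerical value approaches the infinite-wire result, as expected.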
Finally, we have the case of a charge distribution located in a very small volume (small compared to the distance at which the electric field is measured), which we often idealize as a "point charge". Then you simply have a point with a total charge ##Q## at, say, the origin of a coordinate system. The corresponding charge distribution is a pure ##\delta## distribution,
$$\rho(\vec{x})=Q \delta^{(3)}(\vec{x}).$$
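A quick numerical check of what this ##\delta## distribution means in practice, under the assumption of vacuum (##\epsilon=1##) and Heaviside-Lorentz units, where the Coulomb field is ##\vec{E}=Q\vec{r}/(4\pi r^3)##: the flux of ##\vec{E}## through any closed surface should equal the enclosed charge, and vanish if the charge sits outside. The sketch uses a cube and a simple midpoint rule. The function names, the cube, and the charge positions are hypothetical choices for illustration.

```python
import math

def coulomb_field(q, source, point):
    """Field of a point charge in Heaviside-Lorentz units: E = q r / (4 pi |r|^3)."""
    r = [p - s for p, s in zip(point, source)]
    dist = math.sqrt(sum(c * c for c in r))
    return [q * c / (4.0 * math.pi * dist ** 3) for c in r]

def flux_through_cube(q, source, half_side=1.0, n=100):
    """Midpoint-rule estimate of the outward flux of E through a cube centred at the origin."""
    h = 2.0 * half_side / n
    flux = 0.0
    for axis in range(3):
        for sign in (-1.0, 1.0):          # the two faces perpendicular to this axis
            for i in range(n):
                u = -half_side + (i + 0.5) * h
                for j in range(n):
                    v = -half_side + (j + 0.5) * h
                    point = [0.0, 0.0, 0.0]
                    point[axis] = sign * half_side
                    point[(axis + 1) % 3] = u
                    point[(axis + 2) % 3] = v
                    # outward normal is sign * e_axis
                    flux += sign * coulomb_field(q, source, point)[axis] * h * h
    return flux

q = 1.5  # hypothetical total charge
flux_inside = flux_through_cube(q, source=(0.3, -0.2, 0.1))
flux_outside = flux_through_cube(q, source=(3.0, 0.0, 0.0))
print(flux_inside, flux_outside)
```

The first number should come out close to `q` and the second close to zero, independent of where exactly inside (or outside) the cube the charge is placed.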
Now, it's important not to forget that all these singular charge distributions are idealizations, which just help to solve a problem more easily (sometimes even analytically), and that these idealizations have their limitations.
In particular, the idea of a classical point charge is flawed. The Maxwell equations ultimately break down for this idealization, and you cannot find a fully self-consistent description of the mechanics and electrodynamics of a classical point charge. Only approximations hold, and they are quite successful in describing complicated situations where they are applicable.
Also, the very simple constitutive equation given above, with a scalar dielectric function, is not the whole truth in all circumstances. E.g., for an anisotropic material like a crystal, ##\epsilon## becomes a tensor ##\hat{\epsilon}##, or if you have long-range correlations you have a more complicated, nonlocal constitutive equation like
$$\vec{D}(\vec{x})=\int \mathrm{d}^3 \vec{x}' \, \hat{\epsilon}(\vec{x}-\vec{x}')\vec{E}(\vec{x}').$$
Or for strong external fields, i.e., fields that come close to the strength of the internal fields holding the atoms in the material together, the linear-response theory applied here breaks down, and the relation between ##\vec{E}## and ##\vec{D}## becomes a complicated non-linear function (or functional).
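To illustrate the nonlocal (convolution-type) constitutive relation mentioned above, here is a minimal one-dimensional sketch. Everything specific in it is an assumption for illustration: a scalar kernel, taken as `eps` times a normalized Gaussian whose width plays the role of a correlation length, and a cosine-shaped input field. The code is agnostic about which of the two fields is regarded as input and which as response. It only shows that a short-ranged kernel reproduces the simple local proportionality, while a long-ranged one does not.

```python
import math

def make_gaussian_kernel(eps, width):
    """Hypothetical kernel: eps times a normalized Gaussian of the given correlation length."""
    norm = eps / (width * math.sqrt(2.0 * math.pi))
    return lambda u: norm * math.exp(-0.5 * (u / width) ** 2)

def nonlocal_response_at(i, source, kernel, dx):
    """Discrete version of response(x_i) = integral kernel(x_i - x') source(x') dx'."""
    return sum(kernel((i - j) * dx) * source[j] for j in range(len(source))) * dx

dx = 0.02
xs = [-5.0 + i * dx for i in range(501)]
k = 1.0                                   # wave number of the hypothetical input field
field = [math.cos(k * x) for x in xs]
eps = 2.0
mid = len(xs) // 2                        # grid point at x = 0, far from the edges

short_range = nonlocal_response_at(mid, field, make_gaussian_kernel(eps, 0.1), dx)
long_range = nonlocal_response_at(mid, field, make_gaussian_kernel(eps, 1.0), dx)
local_value = eps * field[mid]
print(short_range, long_range, local_value)
```

For this particular kernel the response to ##\cos(kx)## is ##\epsilon \exp(-k^2 w^2/2) \cos(kx)## with ##w## the kernel width, so the local relation is recovered when ##w## is small compared to the scale ##1/k## on which the field varies.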