1, 2, and 4 are valid intuitive ideas about the meaning. I've no clue what 3 should even mean.
The best physically intuitive argument of the definition of the operator "divergence on a vector field" is to consider a fluid flow. Now put any volume ##V## into the flowing fluid. It's boundary surface is denoted by ##\partial V##. At any point of an infinitesimal surface area at this boundary surface you can define the surface-normal vector ##\mathrm{d}^2 \vec{f}##, which has the direction perpendicular to the surface pointing by definition out of the volume ##V##. It's magnitude is the area of the surface element.
If the boundary surface is parametrized by two parameters ##u## and ##v## via the function ##\vec{r}(u,v)## then
$$\mathrm{d}^2 \vec{f} =\mathrm{d} u \, \mathrm{d} v \partial_{u} \vec{r} \times \partial_{v} \vec{r},$$
provided the order of ##u## and ##v## is chosen such as to make this surface-normal vectors point out of the volume (if not, simply interchange ##u## with ##v##).
Now we like to know, how much fluid flows into or out of the volume ##V##. Let's define ##\rho(t,\vec{x})## as the mass density of the fluid and ##\vec{v}(t,\vec{x})## the fluid-velocity field, which tells you how the fluid element at position ##\vec{x}## momentarily moves at time ##t##. Then, in an infinitesimal time ##\mathrm{d} t## there'll flow a mass
$$\mathrm{d} m = \mathrm{d}^2 \vec{f} \cdot rho(t,\vec{x}) \vec{v}(t,\vec{v}) \mathrm{d} t$$
out of the volume through the surface element since the volume of this outgoing fluid is given by ##\mathrm{d} V=\mathrm{d}^2 \vec{f} \cdot \vec{v} \mathrm{d} t##, and the corresponding mass is ##\mathrm{d} m = \rho \mathrm{d} V##. Note that ##\mathrm{d} m>0## if the angle between ##\vec{v}## and ##\mathrm{d}^2 \vec{f}## is in ##[0,\pi/2]##, meaning that then the flow through the surface element is indeed outgoing; otherwise the fluid element flows into the volume and is thus counted negative.
Now the total mass per unit time is
$$\dot{m}=\int_{\partial V} \mathrm{d}^2 \vec{f} \cdot \vec{j}.$$
Here I've interduced the current density ##\vec{j}= \rho \vec{v}##. Which is defined such that for any surface element ##\mathrm{d}^2 \vec{f}## the amount of mass flowing through this surface element per unit time is ##\mathrm{d}^2 \vec{f} \cdot \vec{j}##.
If you want to know how much mass per unit time flows through the surface of an infinitesimal volume at a given point you must just shrink the volume ##V## to this point. Of course this amount will go to 0. So a better measure is to ask for how much mass will run out of the smaller and smaller volume at this point per unit of the volume. This leads to the definition of the divergence of a vector field.
$$\text{div}\, \vec{j}(t,\vec{x}) = \lim_{V \rightarrow \{\vec{x} \}} \int_{\partial V} \mathrm{d}^2 \vec{f} \cdot \vec{j}.$$
In this definition you can choose a little cube with its edges parallel to the coordinates axes of a cartesian coordinate system. Then analysing the surface integral and taking the limit leads to
$$\text{div} \, \vec{j}=\vec{\nabla} \cdot \vec{j}=\partial_1 j_1 + \partial_2 j_2 + \partial_3 j_3,$$
where
$$\partial_j=\frac{\partial}{\partial x_j}$$
denotes the partial derivative wrt. the cartesian coordinate ##x_j##.
From the definition it's also clear that Gauß's integral theorem holds:
$$\int_{\partial V} \mathrm{d}^3 r \mathrm{div} \, \vec{j} = \int_{\partial V} \mathrm{d}^2 \vec{f} \cdot \vec{j}.$$
This is easily seen using the definition of the operator ##\mathrm{div}## and then evaluating the volume integral by putting a finer and finer spatial grid to evaluate the volume integral. In using the definition on each tiny volume you'll see that the corresponding surface integrals over the inner surfaces of the grids cancel pairwise from the contributions of two adjecent grid cells, and only the surface integral over the boundary surface of the volume ##V##, i.e., ##\partial V## remains.