So let's start from the beginning with Yang-Mills theory, which, once quantised, is the basis for the Standard Model. Here we will keep things classical.
Yang-Mills theory is a classical gauge theory based on some Lie group ##\mathcal G##. In many cases, it will be useful to go back to the Abelian case ##\mathcal G = U(1)##, which is just classical electromagnetism, to try to figure out what is going on. For simplicity, we will only work with a single scalar field ##\Phi## and the gauge field ##A_\mu##.
We start by assuming that ##\Phi## is a section of a fiber bundle whose base space is spacetime and whose fiber is a vector space carrying a representation of the gauge group ##\mathcal G## (by "scalar" we are referring to the coordinate transformation properties, not the gauge group transformation properties). This is typically referred to as ##\Phi## "being in the ##x## representation of ##\mathcal G##", with ##x## being something that identifies the representation. The naive Lagrangian for a scalar field that is not in some general vector space, but only complex-valued, is
$$
\mathcal L_\Phi = |\partial_\mu \Phi|^2 - m^2 |\Phi|^2 + V(\Phi),
$$
where ##V(\Phi)## is some potential. What we are really interested in here is the kinetic term (the one with the spacetime derivatives). There is now the typical problem of interpreting what ##\partial_\mu \Phi## means: the standard definition of a derivative takes the difference between the values of ##\Phi## at two different points and then the limit as the points approach each other, but those values live in different fibers and cannot be subtracted without a rule for comparing them. To resolve this, we need a connection field ##A_\mu## that takes values in the Lie algebra of ##\mathcal G## and acts on the fiber at each point as a linear map from the fiber to itself. The corresponding covariant derivative is ##D_\mu = \partial_\mu + \rho(A_\mu)##, where ##\rho## is the representation of the Lie algebra of the gauge group on the representation space. In essence, the first term (the derivative) represents the change in the components of ##\Phi##, whereas the connection term represents how the bases of the fibers at nearby points are related to each other. It is quite common not to write out ##\rho## and instead just write ##A_\mu##, which I think is what has caused some confusion in this case.
The commutator of the covariant derivatives ##F_{\mu\nu} = [D_\mu,D_\nu]## represents the curvature form and, being a commutator of linear operators on the fiber, is a linear operator on the fiber with the additional property of not depending explicitly on derivatives of the scalar field components.
Let us consider a few special cases. First up, the Abelian case ##\mathcal G = U(1)##, i.e., electromagnetism. Since it is Abelian, ##U(1)## only has one-dimensional irreps, and any scalar field ##\Phi## is just a section of the fiber bundle with the fiber being the complex numbers ##\mathbb C##. A linear map from ##\mathbb C## to ##\mathbb C## is multiplication by a number, so ##A_\mu## is just a number (for each ##x##), and since numbers commute and the partial derivatives commute,
$$
[D_\mu,D_\nu]\Phi = [\partial_\mu + A_\mu,\partial_\nu + A_\nu]\Phi = (\partial_\mu A_\nu - \partial_\nu A_\mu)\Phi
$$
for all ##\Phi## (note that ##[\partial_\mu, A_\nu]\Phi = \partial_\mu(A_\nu\Phi) - A_\nu \partial_\mu \Phi = (\partial_\mu A_\nu)\Phi##).
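If you want to see the Abelian result with actual numbers, here is a minimal sketch in plain Python that checks ##[D_x,D_y]\Phi = (\partial_x A_y - \partial_y A_x)\Phi## by finite differences in two dimensions. The particular fields, the constant `B`, and the step `H` are my own illustrative choices (not anything fixed by the discussion), and I take ##A_\mu## to be purely imaginary, which is what the convention ##D_\mu = \partial_\mu + A_\mu## without an explicit ##i## suggests for ##U(1)##.

```python
# Finite-difference check of [D_x, D_y] Phi = (d_x A_y - d_y A_x) Phi for U(1).
# All fields below are hypothetical examples chosen for illustration only.

B = 0.7   # assumed constant; with the fields below the curvature is 1j * B
H = 1e-3  # finite-difference step


def A_x(x, y):
    # Purely imaginary gauge field component (assumed convention D = d + A).
    return -0.5j * B * y


def A_y(x, y):
    return 0.5j * B * x


def phi(x, y):
    # A complex scalar field that is linear in x and y.
    return 1.0 + 2.0 * x + 3.0j * y


def d_x(f):
    # Central difference in x; exact up to rounding for polynomials of degree <= 2.
    return lambda x, y: (f(x + H, y) - f(x - H, y)) / (2 * H)


def d_y(f):
    return lambda x, y: (f(x, y + H) - f(x, y - H)) / (2 * H)


def D_x(f):
    # Covariant derivative: partial derivative plus multiplication by A_x.
    df = d_x(f)
    return lambda x, y: df(x, y) + A_x(x, y) * f(x, y)


def D_y(f):
    df = d_y(f)
    return lambda x, y: df(x, y) + A_y(x, y) * f(x, y)


def commutator_on_phi(x, y):
    # [D_x, D_y] acting on phi, evaluated at the point (x, y).
    return D_x(D_y(phi))(x, y) - D_y(D_x(phi))(x, y)


def curvature_times_phi(x, y):
    # (d_x A_y - d_y A_x) * phi, evaluated at the point (x, y).
    F_xy = d_x(A_y)(x, y) - d_y(A_x)(x, y)
    return F_xy * phi(x, y)


if __name__ == "__main__":
    print(commutator_on_phi(0.3, -0.2))
    print(curvature_times_phi(0.3, -0.2))
```

The two printed numbers should agree up to rounding error. The derivatives of ##\Phi## itself drop out of the commutator, which is exactly the statement that the curvature does not depend explicitly on derivatives of the field.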
Now to the non-Abelian case. Let us look first at the case where ##\Phi## is in the fundamental representation of some matrix group ##\mathcal G##. The covariant derivative is then actually ##D_\mu = \partial_\mu + A_\mu## with ##A_\mu## being a matrix in the Lie algebra of ##\mathcal G## (i.e., ##\rho## is the identity map and each Lie algebra element is represented by itself). The commutator of the covariant derivatives is then (again, the partial derivatives commute)
$$
[D_\mu, D_\nu]\Phi = ([\partial_\mu, A_\nu]+[A_\mu,\partial_\nu]+[A_\mu,A_\nu])\Phi.
$$
The commutators between ##\partial_\mu## and ##A_\nu## follow in the same way as before and this is therefore
$$
[D_\mu,D_\nu] = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu,A_\nu],
$$
which is going to be in the Lie algebra of ##\mathcal G## because each individual term is.
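To make the new ##[A_\mu,A_\nu]## term concrete, here is a small sketch for ##\mathcal G = SU(2)## with constant gauge fields, so that the derivative terms vanish and ##F_{\mu\nu} = [A_\mu,A_\nu]##. The specific choice ##A_\mu = i\sigma_1##, ##A_\nu = i\sigma_2## is just an assumption for illustration, as is the convention that the Lie algebra consists of anti-Hermitian traceless matrices (which fits ##D_\mu = \partial_\mu + A_\mu## with no explicit ##i##).

```python
# Commutator term of the field strength for constant su(2)-valued gauge fields.
# The matrices are hypothetical examples; the helper names are my own.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


def is_anti_hermitian(M):
    n = len(M)
    return all(M[i][j] == -complex(M[j][i]).conjugate()
               for i in range(n) for j in range(n))


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


# i * sigma_1 and i * sigma_2, both anti-Hermitian and traceless.
A_mu = [[0, 1j], [1j, 0]]
A_nu = [[0, 1], [-1, 0]]

# With constant fields the derivative terms vanish, so F = [A_mu, A_nu].
F = commutator(A_mu, A_nu)

if __name__ == "__main__":
    print(F)
    print(is_anti_hermitian(F), trace(F) == 0)
```

Unlike in the Abelian case, the curvature is non-zero even though the gauge field is constant, and the result is again anti-Hermitian and traceless, i.e., it is still in the Lie algebra.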
Finally, let us look at the case when ##\Phi## is in the adjoint representation, i.e., it takes values in the Lie algebra of ##\mathcal G##. For clarity, let us write ##\Phi## when we refer to the matrix and ##\tilde \Phi## when we refer to its representation as a column vector given some basis of the Lie algebra. The covariant derivative is now
$$
\widetilde{D_\mu\Phi} = D_\mu \tilde\Phi = \partial_\mu \tilde \Phi + \operatorname{ad}(A_\mu) \tilde \Phi
$$
or, equivalently,
$$
D_\mu \Phi = \partial_\mu \Phi + [A_\mu,\Phi].
$$
Using the latter, we find that
$$
D_\mu D_\nu\Phi = \partial_\mu D_\nu \Phi + [A_\mu,D_\nu\Phi]
=\partial_\mu\partial_\nu \Phi + \partial_\mu [A_\nu, \Phi] + [A_\mu,\partial_\nu\Phi] + [A_\mu,[A_\nu,\Phi]].
$$
Noting that, by the product rule for derivatives, ##\partial_\mu[A_\nu,\Phi] = [\partial_\mu A_\nu,\Phi] + [A_\nu,\partial_\mu\Phi]##, we therefore end up with
$$
D_\mu D_\nu \Phi =
\partial_\mu\partial_\nu \Phi + [\partial_\mu A_\nu,\Phi] + [A_\nu,\partial_\mu\Phi] + [A_\mu,\partial_\nu\Phi] + [A_\mu,[A_\nu,\Phi]].
$$
The terms ##\partial_\mu\partial_\nu \Phi + [A_\nu,\partial_\mu\Phi] + [A_\mu,\partial_\nu\Phi]## are symmetric in ##\mu \leftrightarrow \nu## and will therefore disappear when we take the commutator between the covariant derivatives. We find that
$$
[D_\mu,D_\nu]\Phi = [\partial_\mu A_\nu,\Phi]+[A_\mu,[A_\nu,\Phi]] - [\partial_\nu A_\mu,\Phi] - [A_\nu,[A_\mu,\Phi]],
$$
which by the Jacobi identity for the commutators can be rewritten
$$
[D_\mu,D_\nu]\Phi = [\partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu,A_\nu],\Phi] = [F_{\mu\nu},\Phi].
$$
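The Jacobi step is the one people most often get wrong by a sign, so here is a quick sanity check with explicit matrices. It verifies ##[A_\mu,[A_\nu,\Phi]] - [A_\nu,[A_\mu,\Phi]] = [[A_\mu,A_\nu],\Phi]## for some integer ##2\times 2## matrices. The matrices are arbitrary examples that I picked; the identity itself holds for any square matrices.

```python
# Check of the Jacobi-identity step used above with hypothetical integer matrices.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


def subtract(X, Y):
    n = len(X)
    return [[X[i][j] - Y[i][j] for j in range(n)] for i in range(n)]


# Arbitrary traceless example matrices (values of the fields at one point).
A_mu = [[1, 2], [3, -1]]
A_nu = [[0, -1], [4, 0]]
Phi = [[2, 5], [-3, -2]]

# Left-hand side: the two nested commutators that survive in [D_mu, D_nu] Phi.
lhs = subtract(commutator(A_mu, commutator(A_nu, Phi)),
               commutator(A_nu, commutator(A_mu, Phi)))

# Right-hand side: a single commutator with [A_mu, A_nu].
rhs = commutator(commutator(A_mu, A_nu), Phi)

if __name__ == "__main__":
    print(lhs == rhs)
```

Since the entries are integers the comparison is exact, so there is no tolerance to worry about.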
If we were to write this for ##\tilde\Phi## instead we would have
$$
[D_\mu,D_\nu]\tilde\Phi = \operatorname{ad}(\partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu,A_\nu]) \tilde\Phi = \operatorname{ad}(F_{\mu\nu}) \tilde \Phi,
$$
where ##\operatorname{ad}(F_{\mu\nu})## is a matrix and ##\tilde\Phi## a column vector. Now, physicists usually do not bother to write out the ##\operatorname{ad}## or to make a notational distinction between ##\Phi## and ##\tilde\Phi## and write the last step in both of those equations as ##F_{\mu\nu}\Phi##. The reason for this is that in the more general case ##F_{\mu\nu}## is seen as a matrix acting on the column vector that represents the fiber regardless of what the representation is.
In both cases, ##F_{\mu\nu}\Phi## represents the action of an element of the Lie algebra on ##\Phi##.
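To see that the "matrix times column vector" and the "commutator" pictures really are the same thing, here is a rough sketch that builds ##\operatorname{ad}(F)## as an explicit matrix and compares ##\operatorname{ad}(F)\tilde\Phi## with the components of ##[F,\Phi]##. For simplicity I assume the Lie algebra of all ##2\times 2## matrices with the elementary matrices ##E_{ij}## as basis, so that ##\tilde\Phi## is just the list of entries of ##\Phi##. The basis, the example matrices, and the helper names are my own choices.

```python
# ad(F) as a 4x4 matrix acting on the column-vector form of a 2x2 matrix Phi.
# Basis assumption: E_11, E_12, E_21, E_22 (all 2x2 matrices), for illustration.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


def to_column(M):
    # Components of M in the basis E_11, E_12, E_21, E_22.
    return [M[0][0], M[0][1], M[1][0], M[1][1]]


BASIS = [
    [[1, 0], [0, 0]],
    [[0, 1], [0, 0]],
    [[0, 0], [1, 0]],
    [[0, 0], [0, 1]],
]


def ad_matrix(X):
    # Column k of ad(X) holds the components of [X, E_k].
    columns = [to_column(commutator(X, E)) for E in BASIS]
    return [[columns[k][i] for k in range(4)] for i in range(4)]


def apply(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


# Hypothetical values of F_{mu nu} and Phi at a single spacetime point.
F = [[1, 2], [3, -1]]
Phi = [[2, 5], [-3, -2]]

matrix_picture = apply(ad_matrix(F), to_column(Phi))
commutator_picture = to_column(commutator(F, Phi))

if __name__ == "__main__":
    print(matrix_picture == commutator_picture)
```

The agreement is just linearity of the commutator in ##\Phi##, which is why dropping the ##\operatorname{ad}## from the notation is harmless once you know which representation you are in.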
Does this clear things up a bit?
Edit: Fixed mistakenly writing ##\mathbb G## instead of keeping consistently to ##\mathcal G##.