Whenever we have n-1 vectors in n-space, they will "generally" (i.e. if and only if they are independent) span a hyperplane, i.e. a subspace of codimension one (dimension n-1). Thus there will generally be a complementary orthogonal space of dimension one, and often we would like to find a basis of this complementary line. This is the same as finding a nontrivial solution to a system of n-1 homogeneous linear equations in n variables. One way to express such a solution is by Cramer's rule, which gives a solution in terms of determinants. One can thus write this solution itself as a formal determinant of an n by n matrix whose first row is the basis vectors e1, e2, ..., en, with the given n-1 vectors as the next n-1 rows. Expanding along the first row, the determinant is a linear combination of the basis vectors, whose coefficients are the n cofactors, i.e. the signed determinants of the (n-1) by (n-1) submatrices obtained from the last n-1 rows by deleting one column. In your case, when n=3, the determinant of the resulting 3 by 3 matrix is the expression you wrote down as your definition of the cross product. Notice also that, by the definition of the dot product, the resulting determinant is the "dot product" of the vector of cofactors with the "vector" whose entries are the n basis vectors. I.e. a linear combination has the same form as a dot product.
This means that if you replace the first row of your matrix with the entries of an actual vector, the resulting determinant will be the dot product of that vector with the cross product of your original n-1 vectors. I.e. the dot product of the cross product of some n-1 vectors with another vector is always the determinant of the n by n matrix whose rows are that other vector followed by the n-1 given vectors. That means it will be zero if the other vector is chosen to be any one of the given n-1 vectors in the cross product, because then you are taking the determinant of a matrix with a repeated row. So it follows immediately from your definition of a cross product as a determinant that the cross product of a collection of vectors is always perpendicular to every one of those given vectors.
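To make the cofactor description concrete, here is a minimal sketch in plain Python. The names det, generalized_cross and dot are my own, chosen for illustration, and the determinant uses naive first-row expansion, which is only sensible for small n. It builds the vector of cofactors from n-1 given vectors and checks the two facts just stated: dotting with any vector u gives the n by n determinant with u as first row, and dotting with any of the given vectors gives zero.

```python
def det(m):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Delete row 0 and column j to get the minor.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def generalized_cross(vectors):
    """Coefficients of e1..en in the formal determinant whose first row is
    e1..en and whose remaining rows are the given n-1 vectors (n >= 2)."""
    n = len(vectors) + 1
    if any(len(v) != n for v in vectors):
        raise ValueError("need n-1 vectors, each with n entries")
    coeffs = []
    for j in range(n):
        # Cofactor of e_j: delete column j from the given rows.
        minor = [list(v[:j]) + list(v[j + 1:]) for v in vectors]
        coeffs.append((-1) ** j * det(minor))
    return coeffs


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Three vectors in 4-space (arbitrary illustrative values).
vs = [[1, 2, 0, 1], [0, 1, 3, 2], [2, 0, 1, 1]]
c = generalized_cross(vs)
u = [3, 1, 4, 1]

print(dot(u, c) == det([u] + vs))      # dot product equals the 4 by 4 determinant
print([dot(v, c) for v in vs])         # repeated-row determinants, all zero
print(generalized_cross([[1, 0, 0], [0, 1, 0]]))  # n = 3: e1 x e2 = [0, 0, 1]
```

With integer entries everything here is exact, so the equalities hold on the nose rather than up to rounding.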
This works in any dimension, but we are used to talking only about the "product" of two things, so it is more usual to restrict attention to the cross product of two vectors in 3-space. In that case the same discussion shows that the cross product of 2 vectors is always perpendicular to both of them. (Of course it is zero when those vectors are dependent, since a determinant with dependent rows vanishes.) When they are independent we don't really care which vector we get as a basis for the complementary line, as long as it is not zero. We do want to know how long the one we choose is, however, and one can compute that this method gives one whose length is |x||y| sin(t), where t is the angle between the two vectors x and y, from which it is also clear that this is zero when t is zero (or pi). Having chosen the one with this length, we get nice relationships between the cross product and certain volumes, all of which follow from the fact that determinants compute volumes. I.e. it seems natural to choose as cross product the vector whose length is determined by the given two vectors in some natural way; in fact its length equals the area of the parallelogram they span. This also shows why it equals zero when they are not independent. We also have to decide whether to choose the vector of that length or its negative. It is natural in a right-hand-dominated world to choose the one that has right-hand orientation. That is also what comes out of this determinant definition.
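The length claim is easy to test numerically. The sketch below uses my own helper names (cross, dot, norm, parallelogram_area) and sample vectors picked arbitrarily. It compares the length of the determinant-formula cross product with |x||y| sin(t), and confirms that dependent vectors give the zero vector.

```python
import math


def cross(x, y):
    """Cross product in 3-space from the 2 by 2 cofactors."""
    return [x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0]]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


def parallelogram_area(x, y):
    """|x||y| sin(t), with t the angle between x and y (both nonzero)."""
    cos_t = dot(x, y) / (norm(x) * norm(y))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding just past +-1
    return norm(x) * norm(y) * math.sin(math.acos(cos_t))


x, y = [1, 2, 3], [4, 5, 6]
print(math.isclose(norm(cross(x, y)), parallelogram_area(x, y)))  # lengths agree
print(cross(x, [2, 4, 6]))  # dependent vectors: [0, 0, 0]
```

The comparison uses math.isclose because the angle route goes through floating point, while the cofactor formula itself is exact on integer input.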
So the cross product is just a succinct way to write down Cramer's rule for a solution of a system of 2 homogeneous equations in 3 unknowns. It is useful for computing an orthogonal complement to the plane spanned by two vectors in 3-space, and also the area of the parallelogram they span.
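Here is one way to see the Cramer's rule connection explicitly, as a rough sketch rather than anything canonical. The function cramer_solution is a hypothetical helper of mine: it fixes the third unknown at z, solves the remaining 2 by 2 system by Cramer's rule, and assumes the 2 by 2 determinant D of the first two columns is nonzero. Scaling that solution by D clears the denominators and reproduces the cross product.

```python
from fractions import Fraction


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def cramer_solution(a, b, z=1):
    """Solve a.(x, y, z) = 0 and b.(x, y, z) = 0 for x, y with z fixed.

    Assumes integer inputs and D = a[0]*b[1] - a[1]*b[0] != 0."""
    D = a[0] * b[1] - a[1] * b[0]
    if D == 0:
        raise ValueError("first two columns are dependent; fix another unknown")
    # Move the z terms to the right-hand side:
    #   a[0] x + a[1] y = -a[2] z
    #   b[0] x + b[1] y = -b[2] z
    rhs_a, rhs_b = -a[2] * z, -b[2] * z
    x = Fraction(rhs_a * b[1] - a[1] * rhs_b, D)
    y = Fraction(a[0] * rhs_b - rhs_a * b[0], D)
    return [x, y, Fraction(z)]


# The system  x + 2y + 3z = 0,  4x + 5y + 6z = 0.
a, b = [1, 2, 3], [4, 5, 6]
sol = cramer_solution(a, b)            # the solution with z = 1
D = a[0] * b[1] - a[1] * b[0]
print([D * s for s in sol] == cross(a, b))  # scaling by D gives the cross product
```

Choosing z = D instead of z = 1 would give the cross product directly; the cross product formula has the advantage of also working when D happens to be zero.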
Another way to define the cross product of two vectors v, w in 3-space is to say it is the unique vector that is 1) perpendicular to both v and w, 2) of length equal to the area of the parallelogram they span, and 3) when that length is nonzero, oriented so that the vectors v, w, vxw, taken in that order, form a right-handed coordinate system.
Then the problem is how to compute the coordinates of vxw in terms of the coordinates of v and w, which is solved by the formula you used as the definition of vxw. This approach is maybe more instructive, but then it is hard to explain how to make this computation, so most authors "cheat" and take the computation as the definition. That makes all the properties easy to prove, but the motivation is totally lost, hence your question. I.e. people who see your definition first understandably have no idea what is going on. That is why math is sometimes taught badly in comparison to physics, where concepts are often respected. Well, to be fair, in my personal opinion physics books usually offer better intuition and math books often provide more precision, so it helps to combine their lessons.
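As a small check that the coordinate formula really does satisfy the orientation condition 3) in the geometric definition, here is a sketch with my own helper names (cross, det3, is_right_handed). It assumes the usual convention that the standard basis e1, e2, e3 is right-handed, so that a triple is right-handed exactly when its determinant is positive. The determinant of the rows v, w, vxw comes out equal to the squared length of vxw, which is positive whenever v and w are independent.

```python
def cross(v, w):
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]


def det3(a, b, c):
    """Determinant of the 3 by 3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def is_right_handed(a, b, c):
    """Assumed convention: positive determinant means right-handed."""
    return det3(a, b, c) > 0


v, w = [1, 2, 3], [4, 5, 6]
c = cross(v, w)

print(det3(v, w, c) == sum(x * x for x in c))  # determinant equals |vxw|^2
print(is_right_handed(v, w, c))                # True
print(is_right_handed(v, w, cross(w, v)))      # False: wxv is the negative choice
```

So of the two vectors with the right direction and length, the formula picks the one that makes v, w, vxw right-handed, and swapping v and w picks the other.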