I was thinking about how to explain this better and avoid getting dragged into the details of tensor notation, and thought maybe the following approach might be more helpful:
1) Coordinates. Any event can be described by its coordinates. The names vary: typically one works in a 4-d space-time with coordinates such as (t,x,y,z), though one might also work in a space-time or space of lower dimension. For this post, I'll mostly assume a 4-d space-time and use t,x,y,z as the coordinate names, except where otherwise mentioned.
2) Vectors. We call the partial derivatives with respect to said coordinates vectors. Graphically, they are typically represented by little arrows, drawn in the tangent space. Thus if our coordinates are (t,x,y,z), then ##\partial_t, \partial_x, \partial_y, \partial_z## will all be vectors. The abstract properties of vectors are basically that they can be added together and multiplied by scalars. We can do both of these with partial derivatives.
3) Covectors. Covectors are the duals of vectors. You'll occasionally see them also called one-forms. The gradient of a coordinate, represented by the symbol d, is a covector. Thus if (t,x,y,z) are coordinates, dt, dx, dy, and dz are covectors. Covectors are represented graphically rather like a contour map, by drawing parallel surfaces of constant coordinate value. Covectors can also be added together and multiplied by scalars.
Below: a graphical depiction of a covector via stacked planes (left) and a vector via an arrow (right).
4) Combining (also sometimes called composing) vectors and covectors
The following can be taken as identities:
##\partial_x dx = \partial_y dy = \partial_z dz = \partial_t dt = 1##
##\partial_x dy = \partial_x dz = \partial_x dt = 0##
##\partial_y dx = \partial_y dz = \partial_y dt = 0##
##\partial_z dx = \partial_z dy = \partial_z dt = 0##
##\partial_t dx = \partial_t dy = \partial_t dz = 0##
These identities may look trivial - hopefully that makes them easy to remember. I'm not going to attempt any detailed explanation of where these identities came from here, other than to suggest that the reader curious about their origins might review their linear algebra textbook on the topic of vector spaces, the duals of vector spaces, and linear functionals.
[add] It might also be helpful to think about "row vectors" and "column vectors", the pre-tensor notation used for covectors and vectors, and to recall how the product of a row vector and a column vector is a simple scalar.
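To make the row-vector / column-vector picture concrete, here is a minimal sketch in plain Python. The helper names `basis` and `pair` are my own inventions for illustration, and representing both the basis vectors and the basis covectors as lists of components in the order (t, x, y, z) is an assumption of the sketch, not part of the definitions above.

```python
# Coordinate order assumed for this sketch: (t, x, y, z).
names = ["t", "x", "y", "z"]

def basis(i, n=4):
    """Components of the i-th basis element: 1 in slot i, 0 elsewhere."""
    return [1 if k == i else 0 for k in range(n)]

def pair(covector, vector):
    """Compose a covector (row) with a vector (column), giving a plain scalar."""
    return sum(c * v for c, v in zip(covector, vector))

# table[("x", "y")] is the pairing of dx with the partial derivative along y.
table = {
    (a, b): pair(basis(i), basis(j))
    for i, a in enumerate(names)
    for j, b in enumerate(names)
}

for a in names:
    print(a, [table[(a, b)] for b in names])
```

The printed table is 1 on the diagonal and 0 everywhere else, which is just the list of identities in section 4 written out as numbers.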
5) The metric tensor
We will denote the metric tensor by the symbol g, using index-free notation. We will use g to compute the lengths of vectors. For this purpose we want to write g as a general linear combination of products of covectors, for reasons that will become apparent. The (t,x,y,z) notation becomes tedious when writing the metric tensor, so it is preferable to number our coordinates, letting t = ##x^0##, x = ##x^1##, y = ##x^2##, z = ##x^3## for instance. Then, using our definition of covectors, we can write
g = ##\sum_{i,j=0..3} g_{ij} dx^i dx^j##
Example: If we go to a space of 2 dimensions for simplicity we can write
g = ##g_{00} dx^0 dx^0 + g_{01} dx^0 dx^1 + g_{10} dx^1 dx^0 + g_{11} dx^1 dx^1##
It turns out that g is symmetric, so that ##g_{ij} = g_{ji}##. Hence, when computing lengths (where the order of the two covector factors makes no difference), the above is equivalent to ##g_{00} dx^0 dx^0 + 2 \, g_{01} dx^0 dx^1 + g_{11} dx^1 dx^1##
Now, how do we use the above definitions to compute the length or magnitude of a vector? The answer is that if A is a vector, in index-free notation we can write the square of its length as g A A
Let's work an example in the simple case of 2 dimensions, where we have some vector ##A = p \, \partial_0 + q \, \partial_1##. Then to get the squared length we write out the expressions for g and A as
##\left[ g_{00} dx^0 dx^0 + g_{01} dx^0 dx^1 + g_{10} dx^1 dx^0 + g_{11} dx^1 dx^1 \right] \left( p \partial_0 + q \partial_1\right) \left( p \partial_0 + q \partial_1\right) ##
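Now we apply the identities that say that ##dx^i \partial_j## equals 0 if i is not equal to j, and 1 if i is equal to j, with the first covector in each product acting on the first copy of A and the second covector on the second copy. For instance, the ##g_{01} dx^0 dx^1## term picks out p from the first copy and q from the second, contributing ##p q \, g_{01}##. Collecting all four terms and using ##g_{10} = g_{01}##, we get the result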
##p^2 g_{00} + 2 p q g_{01} + q^2 g_{11}##
Given that we know the length and length^2 of a vector must be a scalar, we can see why the entity g that we use to find the length of a vector was written as a sum of products of covectors. The key point is that a covector and a vector can be composed to give a scalar quantity.
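As a numerical sanity check of the 2-d example, here is a short sketch in plain Python. The function name `squared_length` and the particular numbers chosen for ##g_{ij}##, p and q are made up purely for illustration; they do not correspond to any specific physical metric. The double sum is the full four-term expression, and it is compared against the collapsed three-term formula.

```python
def squared_length(g, A):
    """Evaluate g A A: sum over i, j of g[i][j] * A[i] * A[j].

    Each dx^i dx^j term of g picks out component i of the first copy
    of A and component j of the second copy.
    """
    n = len(A)
    return sum(g[i][j] * A[i] * A[j] for i in range(n) for j in range(n))

# Hypothetical symmetric metric components (g[0][1] == g[1][0]).
g = [[2.0, 0.5],
     [0.5, 3.0]]

# Hypothetical components of A = p * d_0 + q * d_1.
p, q = 1.0, 2.0

full = squared_length(g, [p, q])
closed = p**2 * g[0][0] + 2 * p * q * g[0][1] + q**2 * g[1][1]

print(full, closed)
```

Both numbers agree, as they should for any symmetric choice of g. If ##g_{01}## and ##g_{10}## were given different values, the three-term formula would no longer match the double sum, which is where the symmetry of g is being used.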