Question A)

My understanding is that if we take a composite system AB and perform the partial trace over system B, we essentially remove any dependence of system A on system B, including any correlations between them. For example, I have seen references in the literature to 'performing a partial trace over the environment' in order to essentially remove the influence of the environment on the system being analyzed.

Is this intuitive understanding correct?
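To make the operation I am asking about concrete, here is a minimal NumPy sketch of what I understand 'tracing over B' to mean for two qubits. The helper name `partial_trace_B` and the choice of a Bell state as the example are my own, purely for illustration, and the sketch assumes the basis of AB is ordered as |a⟩⊗|b⟩.

```python
import numpy as np

def partial_trace_B(rho_AB, dim_A, dim_B):
    """Trace out subsystem B from a density matrix on A (x) B.

    Hypothetical helper for illustration; assumes the basis is ordered |a>|b>.
    """
    # View the matrix as a 4-index tensor rho[a, b, a', b'] ...
    rho = rho_AB.reshape(dim_A, dim_B, dim_A, dim_B)
    # ... and sum over the B indices with b = b'.
    return np.einsum("abcb->ac", rho)

# Example state: the Bell state (|00> + |11>) / sqrt(2)
psi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_AB = np.outer(psi, psi.conj())

rho_A = partial_trace_B(rho_AB, 2, 2)
print(rho_A)
```

What I would like to understand is how to interpret the resulting `rho_A` physically.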

Can anyone provide a more concrete understanding of why the trace is interesting? Since it is simply the sum of the diagonal elements, can you explain why it is interesting/useful in the following two cases, and what it represents?

1) A diagonalized matrix (the diagonal entries are the eigenvalues of the matrix)

2) A Hermitian but non-diagonalized matrix

Question B)

Furthermore, let's say we have a density operator, which may have nonzero off-diagonal elements but has trace equal to 1 (because it is normalized). (Am I correct so far?)

1) What is the meaning of taking the trace, and disregarding the off-diagonal elements? Why are we allowed to disregard the rest of the matrix?

2) Will the trace change if we change the basis of the density operator?
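For the basis-change question, a quick numerical check along the following lines could be run. This is only a sketch under my own assumptions: that 'changing the basis' means conjugating the density matrix by a unitary, and that a random 3x3 example is representative. It is a sanity check, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random 3x3 density matrix: Hermitian, positive semidefinite, trace 1.
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = M @ M.conj().T
rho = rho / np.trace(rho).real

# Build a random unitary from the QR decomposition of a random complex matrix.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))

# Assumed meaning of "change of basis": rho -> Q rho Q^dagger.
rho_rot = Q @ rho @ Q.conj().T

trace_before = np.trace(rho).real
trace_after = np.trace(rho_rot).real
print(trace_before, trace_after)
print(np.diag(rho).real, np.diag(rho_rot).real)
```

The second `print` shows the individual diagonal entries in each basis, so they can be compared alongside the two traces.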

You may find that my questions reveal misunderstandings about the nature of some of these concepts, which is why I appreciate any help that can be provided!