Do higher order moments mean more attention to local features?

Wenlong
Dear all,

Sorry to post this question in this section again.

I am currently looking into a few statistical analysis algorithms. I noticed that they use moments or cumulants of different orders to analyse the data. I guess this is because the algorithms focus on different aspects of the data itself.

So far as I know, the 1st moment (mean) and the 2nd moment (variance) describe the data as a whole (its location and dispersion), the 3rd moment (skewness) looks into the tail area of the distribution, and the 4th moment (kurtosis) concentrates on the peak.
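
For example, here is a quick numerical check of my understanding (just a rough sketch in Python with NumPy/SciPy and an arbitrary simulated sample):

Code:
import numpy as np
from scipy import stats

# Arbitrary heavy-tailed sample (Student's t) purely for illustration.
rng = np.random.default_rng(0)
x = rng.standard_t(df=5, size=100_000)

print("mean (1st moment):", np.mean(x))
print("variance (2nd central moment):", np.var(x))
print("skewness (3rd standardised moment):", stats.skew(x))
print("excess kurtosis (4th standardised moment):", stats.kurtosis(x))
# A heavy-tailed sample gives excess kurtosis > 0; a normal sample gives roughly 0.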

Can I then deduce that higher moments mean the algorithm pays more attention to local statistical properties?

Can anyone give me an explicit answer to help me out of this headache? Or can you recommend some books or papers? I would greatly appreciate your help.

Best wishes
Wenlong
 
The odd and even moments tend to behave differently. Odd moments will naturally tell you things about lopsidedness (because an odd power of a negative number is negative). The mean is a measure of lopsidedness compared with a distribution more evenly placed about 0. Even moments treat both sides equally, so say more about spread.
Higher order moments put a heavier weight on the outliers. A distribution with a sharp peak and long tails will have a higher kurtosis than one with the same variance but a broader centre that then falls off quickly.
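
As a quick numerical illustration of that last point (a minimal sketch in Python, comparing a Laplace distribution with a uniform distribution, both scaled to unit variance):

Code:
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 200_000

# Laplace: sharp peak and long tails; scale 1/sqrt(2) gives unit variance.
laplace = rng.laplace(loc=0.0, scale=1.0 / np.sqrt(2), size=n)

# Uniform: broad centre that then cuts off abruptly; half-width sqrt(3) gives unit variance.
uniform = rng.uniform(low=-np.sqrt(3), high=np.sqrt(3), size=n)

print("variances:", np.var(laplace), np.var(uniform))        # both close to 1
print("excess kurtosis, Laplace:", stats.kurtosis(laplace))  # about +3
print("excess kurtosis, uniform:", stats.kurtosis(uniform))  # about -1.2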
 
haruspex said:
The odd and even moments tend to behave differently. Odd moments will naturally tell you things about lopsidedness (because an odd power of a negative number is negative). The mean is a measure of lopsidedness compared with a distribution more evenly placed about 0. Even moments treat both sides equally, so say more about spread.
Higher order moments put a heavier weight on the outliers. A distribution with a sharp peak and long tails will have a higher kurtosis than one with the same variance but a broader centre that then falls off quickly.

Hi, Haruspex

Thank you very much for your reply. It helps a lot.

Then may I ask a further question based on this? Take PCA and ICA (independent component analysis) for example: PCA computes principal components from the covariance matrix (2nd order moments), while ICA computes independent components by maximising negentropy (approximated by kurtosis or other higher order moments).

Comparing the principal components and the independent components of the same set of observations, I find that the independent components represent local features better, while the principal components represent global trends better.

Is this because of the different orders of moments they use, or is it just a coincidence?

Many thanks in advance.

Best wishes
Wenlong

BTW, how can I reply to a respondent directly in this forum?
 
Wenlong said:
Dear all,

Sorry to post this question in this section again.

I am currently looking into a few statistical analysis algorithms. I noticed that they use moments or cumulants of different orders to analyse the data. I guess this is because the algorithms focus on different aspects of the data itself.

So far as I know, the 1st moment (mean) and the 2nd moment (variance) describe the data as a whole (its location and dispersion), the 3rd moment (skewness) looks into the tail area of the distribution, and the 4th moment (kurtosis) concentrates on the peak.

Can I then deduce that higher moments mean the algorithm pays more attention to local statistical properties?

Can anyone give me an explicit answer to help me out of this headache? Or can you recommend some books or papers? I would greatly appreciate your help.

Best wishes
Wenlong

Hey Wenlong.

Are you aware of the relationship between the moments and the characteristic function of a distribution, and of how the Fourier and inverse Fourier transform are interpreted with respect to frequency information?

This will help you understand the relationship between the various moments (raw moments, not central moments) and the frequency content of the PDF itself.
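
In symbols (assuming the relevant moments exist, and up to the usual sign convention for the Fourier transform):

$$\varphi_X(t) = \mathbb{E}\left[e^{itX}\right] = \int_{-\infty}^{\infty} e^{itx} f_X(x)\,dx, \qquad \mathbb{E}[X^n] = (-i)^n \left.\frac{d^n}{dt^n}\varphi_X(t)\right|_{t=0},$$

so the characteristic function is essentially the Fourier transform of the PDF, and the raw moments are its derivatives at the origin.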
 
Wenlong said:
Take PCA and ICA (independent component analysis) for example: PCA computes principal components from the covariance matrix (2nd order moments), while ICA computes independent components by maximising negentropy (approximated by kurtosis or other higher order moments).

Comparing the principal components and the independent components of the same set of observations, I find that the independent components represent local features better, while the principal components represent global trends better.

Is this because of the different orders of moments they use, or is it just a coincidence?
You've gone beyond my limits of expertise with that one.
As far as I've been able to discern:
- PCA is often used as a preliminary (whitening) step for ICA anyway;
- ICA requires non-Gaussianity in (all but one of) the sources, whereas PCA does not;
- ICA doesn't rank the components.
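
For what it's worth, here is a minimal numerical sketch of that whitening-then-ICA pipeline (made-up mixing matrix and sources, plain NumPy, and a single kurtosis-based FastICA-style unit rather than a full ICA implementation):

Code:
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Two made-up non-Gaussian sources, mixed linearly.
s = np.vstack([rng.laplace(size=n), rng.uniform(-1, 1, size=n)])
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
x = A @ s                                      # observed mixtures, shape (2, n)

# --- PCA / whitening: uses only the covariance (2nd-order statistics) ---
x = x - x.mean(axis=1, keepdims=True)
eigval, eigvec = np.linalg.eigh(np.cov(x))
z = np.diag(eigval ** -0.5) @ eigvec.T @ x     # whitened data: identity covariance

# --- One ICA direction: FastICA fixed-point step with g(u) = u^3 ---
# This maximises |kurtosis| of w^T z, i.e. it relies on 4th-order statistics.
w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(200):
    u = w @ z
    w_new = (z * u ** 3).mean(axis=1) - 3.0 * w  # E[z g(u)] - E[g'(u)] w, with E[u^2] = 1
    w_new /= np.linalg.norm(w_new)
    converged = abs(abs(w_new @ w) - 1.0) < 1e-9
    w = w_new
    if converged:
        break

print("recovered direction in whitened space:", w)

The whitening step depends only on second-order statistics, so it can only determine the components up to a rotation; the kurtosis-based iteration then picks the particular rotation in which the projections look maximally non-Gaussian, which is exactly where the higher order moments come in.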
 