Why is the variance of the Parzen density estimator infinite?

In summary: The thread asks why the variance associated with a Parzen estimator built from Dirac delta functions is said to be infinite. The poster's guess is that the integral involved in the variance calculation does not converge. The reply notes that the variance of a random variable with density p(x) would be infinite only if the variance of X itself is infinite. In practical terms, infinite variance means that the sample variance does not settle down as more data arrive, because rare extreme values dominate it. Heavy-tailed distributions of this kind are useful for modeling extreme events, where outliers are expected, but methods that assume a finite variance can give unreliable results on such data.
  • #1
crbazevedo
Hello everyone,

I'm new to this forum and I'm glad to have found such a high quality resource where we can have such valuable guidance and discussions.

I've read somewhere that the variance of [tex]p(x) = \frac{1}{n}\sum_{i=1}^{n}\delta(x-x_i), \quad \forall x \in \Re[/tex],

in which [tex]D_n = \left\{x_1, \cdots, x_n\right\}[/tex] are independent realizations of a continuous random variable [tex]X[/tex], and [tex]\delta[/tex] is the Dirac delta function, is infinite, irrespective of [tex]n[/tex] and [tex]D_n[/tex].

My questions are straightforward:

1) Why is that? (My guess is that the integral involved in the variance calculation does not converge.)

2) What does it mean to have infinite variance in practical terms?

3) Are there any practical examples where a distribution with infinite variance would be useful?

Please note that I don't have a strong background in statistics.

Any help will be much appreciated.

Cheers,
Carlos
 
  • #2
crbazevedo said:
I've read somewhere that the variance of [tex]p(x) = \frac{1}{n}\sum_{i=1}^{n}\delta(x-x_i), \quad \forall x \in \Re[/tex],

in which [tex]D_n = \left\{x_1, \cdots, x_n\right\}[/tex] are independent realizations of a continuous random variable [tex]X[/tex], and [tex]\delta[/tex] is the Dirac delta function, is infinite, irrespective of [tex]n[/tex] and [tex]D_n[/tex].

Assuming you mean the variance of the random variable with density p(x), this would be infinite only if the variance of X is infinite (for example, if X has the Cauchy distribution, or a Pareto distribution with shape parameter at most 2).
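To make the reply concrete: integrating against the delta functions, the distribution with density p(x) has mean equal to the sample mean and variance equal to the average squared deviation from it, which is finite for any finite sample. The sketch below is only an illustration; the function name `empirical_variance` and the four data values are made up for this example and do not come from the thread.

```python
import statistics

def empirical_variance(samples):
    """Variance of the distribution that puts mass 1/n on each sample."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((x - mean) ** 2 for x in samples) / n

# Made-up sample, for illustration only.
data = [1.0, 2.0, 4.0, 7.0]

print(empirical_variance(data))
# The standard library's population variance computes the same quantity.
print(statistics.pvariance(data))
```

If X has finite variance, this quantity settles near Var(X) as n grows; if X has infinite variance, it keeps being pushed up by new extreme values.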
 

1. What is the Parzen density estimator?

The Parzen density estimator is a non-parametric method used to estimate the probability density function of a random variable. It is also known as the kernel density estimator and is commonly used in data analysis and machine learning.

2. Why is the variance of the Parzen density estimator infinite?

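The variance is infinite only in the limiting case discussed in this thread, where the kernel is a Dirac delta function, that is, a kernel whose bandwidth has shrunk to zero. For a kernel K with bandwidth h > 0, the variance of the estimate at a point x is approximately [tex]f(x)\int K(u)^2\,du/(nh)[/tex], where f is the true density, and this grows without bound as h goes to 0. The delta function is not square-integrable, so the integral in the variance calculation does not converge. Infinite support of the kernel is not the cause: the Gaussian kernel has infinite support and gives a finite variance for any h > 0.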
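One way to see how the variance of the estimate at a fixed point behaves as the kernel narrows toward a delta function is to work out a case with a closed form. The sketch below assumes, purely for illustration, that X is standard normal and that the kernel is Gaussian with bandwidth h; under those assumptions both moments of the kernel term are Gaussian integrals. The names `normal_pdf` and `kde_pointwise_variance` are hypothetical helpers, not part of any library.

```python
import math

def normal_pdf(x, variance):
    """Density of a zero-mean normal distribution with the given variance."""
    return math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

def kde_pointwise_variance(x, n, h):
    """Exact Var[p_hat(x)] assuming X ~ N(0, 1) and a Gaussian kernel of bandwidth h."""
    # E[K_h(x - X)]: a N(0, h^2) kernel convolved with the N(0, 1) density.
    mean_k = normal_pdf(x, 1.0 + h * h)
    # E[K_h(x - X)^2]: the squared kernel is a scaled N(0, h^2 / 2) density.
    mean_k_sq = normal_pdf(x, 1.0 + h * h / 2.0) / (2.0 * math.sqrt(math.pi) * h)
    # The estimate averages n independent kernel terms.
    return (mean_k_sq - mean_k ** 2) / n

for h in (1.0, 0.1, 0.01, 0.001):
    print(h, kde_pointwise_variance(0.0, n=100, h=h))
```

The printed variance grows roughly like 1/h, so it has no finite limit as h goes to 0, whatever the value of n.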

3. How does the choice of kernel function affect the variance of the Parzen density estimator?

For a fixed sample size, the bandwidth matters far more than the shape of the kernel. A narrow bandwidth gives a larger variance and a smaller bias, while a wide bandwidth gives a smaller variance and a larger bias. The kernel shape enters the leading variance term only through the constant [tex]\int K(u)^2\,du[/tex], which is finite for the usual choices, including the Gaussian kernel despite its infinite support.
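A separate question is the variance of the distribution that the estimate itself describes, which is the reading used in reply #2. With a Gaussian kernel the estimate is an equal-weight mixture of normals centred on the data, so that variance should equal the variance of the data plus h squared. The sketch below checks this numerically; the data values, the bandwidth 0.5, the integration range, and the helper names `gaussian_kde_pdf` and `variance_of_density` are all assumptions made for the example.

```python
import math

def gaussian_kde_pdf(x, samples, h):
    """Parzen estimate at x with a Gaussian kernel of bandwidth h."""
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples) / norm

def variance_of_density(pdf, lo, hi, steps=6000):
    """Variance of the distribution with density pdf, by the trapezoidal rule on [lo, hi]."""
    dx = (hi - lo) / steps
    m0 = m1 = m2 = 0.0
    for k in range(steps + 1):
        x = lo + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        p = weight * pdf(x) * dx
        m0 += p          # total mass, close to 1
        m1 += x * p      # first moment
        m2 += x * x * p  # second moment
    mean = m1 / m0
    return m2 / m0 - mean ** 2

# Made-up sample and bandwidth, for illustration only.
data = [1.0, 2.0, 4.0, 7.0]
h = 0.5
print(variance_of_density(lambda x: gaussian_kde_pdf(x, data, h), -10.0, 20.0))
```

The result is finite even though the Gaussian kernel has infinite support, and letting h shrink to 0 recovers the variance of the data alone.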

4. Can the infinite variance of the Parzen density estimator be a problem in practice?

With any positive bandwidth the variance is finite, so infinite variance as such does not arise in practice. The practical concern is high variance: with a small dataset or a bandwidth that is too narrow, the estimate turns into a set of spikes around the observations and changes substantially from one sample to the next.

5. Are there any ways to address the issue of infinite variance in the Parzen density estimator?

The main remedy is to use a smooth kernel with a positive bandwidth instead of delta functions, and to choose that bandwidth with care, for example by a rule of thumb or by cross-validation. Widening the bandwidth reduces the variance at the cost of added bias. A larger dataset also helps, since the variance at a point decreases roughly as 1/(nh). Switching to a finite-support kernel such as the Epanechnikov or triangular kernel changes the constant in the variance but is not needed to make it finite.
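As one example of choosing the bandwidth from the data, the normal reference rule of thumb for a Gaussian kernel sets h to 1.06 times the sample standard deviation times n to the power -1/5. It assumes the data are roughly Gaussian, so treat it as a starting point rather than a recommendation for every dataset. The function name `normal_reference_bandwidth` and the data values are made up for this sketch.

```python
import math
import statistics

def normal_reference_bandwidth(samples):
    """Rule-of-thumb bandwidth for a Gaussian kernel: 1.06 * sigma * n^(-1/5)."""
    n = len(samples)
    sigma = statistics.stdev(samples)  # sample standard deviation
    return 1.06 * sigma * n ** (-1.0 / 5.0)

# Made-up sample, for illustration only.
data = [1.0, 2.0, 4.0, 7.0]
print(normal_reference_bandwidth(data))
```

The bandwidth shrinks slowly as n grows, so the product nh still increases and the variance of the estimate still falls.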
