Measuring the degree of convergence of a stochastic process

Click For Summary

Discussion Overview

The discussion revolves around determining the sample size necessary to ensure a minimum degree of convergence for estimators of a random variable, particularly in the context of stochastic processes. Participants explore various statistical methods and models to assess how increasing sample size affects the variance of estimators, with a focus on numerical implementation rather than analytical solutions.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant proposes using non-linear regression to fit a curve to the variance data points, suggesting a specific model for the curve.
  • Another participant argues that without additional information about the distribution of the random variable, reliable estimates cannot be made, highlighting potential issues with the independence of observations.
  • A later reply questions the reliability of estimates based on limited sample sizes, emphasizing the need for more data to make confident predictions about future observations.
  • Participants discuss the implications of measuring changes in estimates either as absolute differences or proportional changes, noting that the latter can lead to extreme variability.
  • One participant suggests reformulating the original question to focus on finding a sample size that ensures a specified confidence level regarding the change in estimates.
  • Another participant introduces the concept of constructing confidence intervals for the probability parameter based on observed data, providing a statistical framework for the discussion.

Areas of Agreement / Disagreement

Participants express differing views on the feasibility of making reliable estimates without additional information about the underlying distribution. There is no consensus on the best approach to determine the sample size for ensuring convergence, and various models and methods are proposed and debated.

Contextual Notes

Limitations include the dependence on the assumptions about the distribution of the random variable and the potential for non-independence among observations. The discussion also highlights the challenge of guaranteeing convergence in statistical estimates.

estebanox
Messages
26
Reaction score
0
Consider a sample consisting of {y1,y2,...,yk} realisations of a random variable Y, and let S(k) denote the variance of the sample as a function of its size; that is

S(k)=1/k( ∑ki=1(yi−)2)
for y¯=1/k( ∑ki=1 yi)

I do not know the distribution of Y, but I do know that S(k) tends to zero as k tends to infinity.

Suppose that I can increase the sample size gradually, so that I can calculate a sequence of variances {S(1),S(2),...,S(k)} for any strictly positive integer k.

I would like to determine the sample size that guarantees a minimum (arbitrary) 'degree of convergence'; in other words, I would like to determine the minimum value of k for which we expect that S(k+1)=S(k)±ϵ , for some small ϵ.

Ideas? I'm not interested in the analytical answer to this question, but rather something that I can implement numerically.

NOTE: This question has also been asked here
 
Physics news on Phys.org
If you have observed k realisations you could use non-linear regression (GLM) to fit a curve to the set of data points (1,S(1)), (2,S(2)), ...,(k,S(k)), subject to the constraint that the curve is always positive and asymptotically approaches 0. A very simple curve to fit would be the two-parameter curve ##f(n)=ae^{-bn}##. The linear model for your regression of ##y## against ##n## would then be
$$y=\log a-bn+\epsilon_n$$
where the observed ##y## values are the ##k## values of ##\log S(n)##
If you assume the ##\epsilon_n## error random variables are iid normal with st dev ##\sigma## you can use OLS regression on this to estimate ##a## and ##b##.
 
You cannot make a reliable estimate without more information. Suppose you have k=1000. and S(k) went nicely towards 0 so far. That doesn't tell you anything about k=1100, where it could be much larger, and only get smaller after k=1010 again. To guarantee anything, you need more information about yk. Clearly they are not iid, otherwise the variance would not go down.
 
mfb said:
You cannot make a reliable estimate without more information. Suppose you have k=1000. and S(k) went nicely towards 0 so far. That doesn't tell you anything about k=1100, where it could be much larger, and only get smaller after k=1010 again. To guarantee anything, you need more information about yk. Clearly they are not iid, otherwise the variance would not go down.

@mfb: I think I left relevant information when trying to explain my question. Here is a new attempt

--------
Consider a set of random variables ##(X_1,X_2,X_3,...X_k)## that are i.i.d. ##Bernoulli(p)##

While I do not know ##p## I can estimate it using
$$
Y(k)=\frac{1}{k}\sum_{i=1}^k X_i
$$

Notice that ##Y(k)## is a random variable, and its distribution has mean ##p## and variance ##\frac{p(1-p)}{k}##. So ##Y(k)## is a consistent estimator of ##p##.

**Question:**
How can I determine the sample size that guarantees a minimum (arbitrary) 'degree of convergence' of the estimator? In other words, what is the point after which we can be confident that increasing the sample size from ##k## to ##k+1## will only yield a small change in our estimate of ##p## (for any ##p##)?

One idea I had was to look at the convergence of the variance of the sequence of estimators that is obtained by increasing ##k## gradually; that is
$$S(k)=\frac{1}{k}\sum_{i=1}^k (Y(i)-\bar{Y}(i))^2$$
for ##\bar{Y(k)}=\frac{1}{k}\sum_{i=1}^k Y(i)##

Numerically, I find that ##S(k)## converges to ##0## as expected; so perhaps what I need is a condition on ##S(k)##?
 
Last edited:
In your example, going from k to k+1 has the trivial upper limit of changing the estimate by 1/(k+1). If your p is very close to 0 or 1, those changes will happen to a very good approximation, so the upper bound is already optimal in some way.

If you have a Gaussian distribution, then you can use something like ##\frac{n \sigma}{k}## as (not strict) upper estimate where n depends on the fraction of changes you want to cover. Not sure if the k in the denominator is correct, or needs something like ##\sqrt{k(k+1)}##, but for large k it works.

Similar statements should be possible for all distributions if you have some prior knowledge about the distribution. You need some knowledge, otherwise very long tails in your distribution can ruin every estimate.
 
estebanox said:
@mfb**Question:**
... what is the point after which we can be confident that increasing the sample size from ##k## to ##k+1## will only yield a small change in our estimate of ##p## (for any ##p##)?
It depends on how you measure the change in your estimate. If you measure it as (new estimate) minus (old estimate) then the problem is very easy, using what mfb points out.

But if you measure it as ((new estimate) divided by (old estimate)) minus 1 then you cannot place any limits on the size of the change because, for any k, it will always be possible that the first k observations of X were zero and the (k+1)th observation is 1, in which case the estimate of p will go from 0 to 1(k+1) in that step, which is an infinitely large proportional increase.
 
If you measure it as (new estimate) minus (old estimate) then the problem is very easy, using what mfb points out.
@andrewkirk and @mfb :
I see... but given that I don't know the true value of ##p##, solving the problem assuming the upper bound seems an extreme assumption. My point is that it seems rather wasteful to assume the worst-case scenario to solve for ##k## ex-ante, rather than using information from the interim estimates that I obtain from sequentially increasing sample size...
 
@estebanox It wasn't clear from the reformulated question in post #4 whether the bound was to be calculated ex-ante or ex-post. If it is ex-post then the answer for the differencing case remains the same but the answer for the ratio case changes to ##\frac1k## or ##\frac{1-Y(k)}{Y(k)(k-1)}## according to whether the ##(k+1)##th value is 1 or 0.

What prevents any greater accuracy than that is the word 'guarantees' in your question. Statistics deals with probabilities, not certainties, which means there are no guarantees. A more reasonable question, which might allow for a more helpful answer, would be something like:

'Given a tolerance ##\epsilon>0##, find a formula ##k## as a function of ##\epsilon## and the first ##k## values of ##X_i## (*), such that we can be at least 95% confident that for ##n\geq k## we will have ##\left|\frac{Y(n+1)}{Y(n)}-1\right|<\epsilon##?'

(*) To avoid circularity, we define ##k## as a function of ##\epsilon## and all values of ##X_i##, but require it to obey the additional condition that if the formula delivers result ##K## it makes use of no values of ##X_i## for ##i>K##. This specification is probably much more neatly expressed using conditional probabilities but I have to rush off now so don't have time to work out how to write it that way.​
 
Last edited:
andrewkirk said:
@estebanox

'Given a tolerance ##\epsilon>0##, find a formula ##k## as a function of ##\epsilon## and the first ##k## values of ##X_i## (*), such that we can be at least 95% confident that for ##n\geq k## we will have ##\left|\frac{Y(n+1)}{Y(n)}-1\right|<\epsilon##?'
Thanks @andrewkirk. Indeed, saying "guarantee" in my question was poor use of language – as a matter of fact I also say "In other words, what is the point after which we can be confident that increasing the sample size...". So your proposal to reformulate the problem is indeed helpful. It'd be great to hear any ideas as to what the answer to the reformulated problem might be (do you think it makes sense to post a new thread?)
 
  • #10
We can write a confidence interval for ##p## based on the first ##k## observations. Let ##B_{k,x}:[0,1]\to[0,1]## be the function such that ##B_{k,x}(y)## is the probability that a binomial random variable with ##k## trials and success probability ##y## is less than or equal to ##x##. Then a two-sided ##1-\alpha## confidence interval jas lower bound
$$L=B_{k,Y(k)}{}^{-1}\left(\frac\alpha2\right)$$
and upper bound
$$U=B_{k,Y(k)}{}^{-1}\left(1-\frac\alpha2\right)$$
 
  • #11
https://en.wikipedia.org/wiki/Law_of_the_iterated_logarithm

For any distribution which has zero mean and unit variance, if you take means ##S_n/n## and increase the sample size, then the bound you are looking for is
\frac{\sqrt{2n\log\log(n)}}{n}
in the following sense: it has been proven mathematically that only finitely many values for ##n## will have the property that ##S_n/n## will surpass this bound, while infinitely many values will still come close to this bound.

For the means of a Bernouilli variable, this means that the bound for ##|S_n/n - p|## is ##p(1-p)\frac{\sqrt{2n\log\log(n)}}{n}##. This is maximized for the worst case scenario of ##p = 1/2##, which gives us a bound of
\frac{\sqrt{2n\log\log(n)}}{4n}

Law_of_large_numbers.gif
 
  • Like
Likes   Reactions: estebanox and mfb
  • #12
micromass said:
For the means of a Bernouilli variable, this means that the bound for ##|S_n/n - p|## is ##p(1-p)\frac{\sqrt{2n\log\log(n)}}{n}##. This is maximized for the worst case scenario of ##p = 1/2##, which gives us a bound of
\frac{\sqrt{2n\log\log(n)}}{4n}

Thanks for this @micromass – very helpful!
 
  • #13
@micromass: I have continued working on a related problem – I have posted a new thread here. Would be great to hear your thoughts!
 

Similar threads

  • · Replies 3 ·
Replies
3
Views
1K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 2 ·
Replies
2
Views
1K
  • · Replies 6 ·
Replies
6
Views
2K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 5 ·
Replies
5
Views
11K
  • · Replies 27 ·
Replies
27
Views
3K
  • · Replies 13 ·
Replies
13
Views
2K