Solving Statistical Problems Without Population Data

  • Context: Graduate
  • Thread starter: JudasIscariot
  • Tags: Method, Statistical
SUMMARY

The discussion centers on inferring statistical moments of a population from a non-representative group of individuals. The participant recalls a method involving computer simulations to generate extensive data sets, allowing for the estimation of population characteristics despite the lack of complete population data. Key references include the work of Deming on sampling theory and Heckman's economic methods for analyzing self-selected data. These methodologies are essential for addressing the challenges posed by skewed data sets.

PREREQUISITES
  • Understanding of statistical moments and their significance
  • Familiarity with computer simulation techniques for data generation
  • Knowledge of sampling theory, particularly as discussed in Deming's work
  • Awareness of Heckman's methods for analyzing self-selected data
NEXT STEPS
  • Research "computer simulation for statistical inference" techniques
  • Study Deming's "Some Theory of Sampling" for foundational concepts
  • Explore "Heckman selection model" for insights on self-selection bias
  • Investigate "bootstrapping methods" for estimating population parameters
USEFUL FOR

Statisticians, data analysts, researchers dealing with non-representative samples, and anyone interested in advanced statistical inference techniques.

JudasIscariot
Need some help with statistics!

My problem is this: I am given a set of data for a group of individuals and from that data I am supposed to infer the moments of that data as compared to the population as a whole. The problem is, I do not have the relevant data pertaining to the entire population, only the data as it pertains to this group.

This group is not a random sample of the entire population, so any statistical moments computed from the group data have no direct relevance to the population at large. The data is not a valid sample of the population.

Now I remember having met this problem in college before, and I was able to solve it because I got hold of a mathematics journal that tackled this very problem! The method involves heavy use of the computer to generate billions of data points from the available data, and then uses the generated data plus some statistical tricks to infer the moments of the entire population!

Unfortunately, I forgot the name of the method! As I remember it, several methods have been devised to solve this class of problems, but I can't even remember what those methods are called!

I hope someone can help me...
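
The computer-intensive approach described above (generating many data sets from the data at hand) resembles the bootstrap, although the post does not name the method, so that identification is an assumption. A minimal sketch follows; `bootstrap_moment` and the demo data are hypothetical and only illustrate the resampling idea. Note that plain resampling estimates the sampling variability of a moment; by itself it does not remove the bias of a non-representative group.

```python
import random
from statistics import fmean

def bootstrap_moment(data, k=1, n_resamples=2000, seed=0):
    """Bootstrap the mean (k=1) or the k-th central moment (k>=2).

    Returns the average of the resampled estimates and a 95% percentile
    interval. Hypothetical helper, not taken from the thread.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the observed data.
        resample = rng.choices(data, k=n)
        m = fmean(resample)
        if k == 1:
            estimates.append(m)
        else:
            estimates.append(fmean((x - m) ** k for x in resample))
    estimates.sort()
    lo = estimates[int(0.025 * n_resamples)]
    hi = estimates[int(0.975 * n_resamples) - 1]
    return fmean(estimates), (lo, hi)

# Demo on made-up data (assumed normal purely for illustration).
demo_rng = random.Random(1)
data = [demo_rng.gauss(5.0, 2.0) for _ in range(200)]
estimate, (lo, hi) = bootstrap_moment(data, k=1)
print(f"bootstrap mean: {estimate:.3f}, 95% interval: ({lo:.3f}, {hi:.3f})")
```

Calling `bootstrap_moment(data, k=2)` gives the same kind of interval for the second central moment.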
 
If you're going to be running into problems like this in your work, it might be worth your while to invest in a small technical library.

I recommend Some Theory of Sampling by the great Deming (Dover), an oldie but goodie. It discusses some cases that may be relevant to your problem.

Also google Heckman and statistics. He got the Economic pseudo-Nobel prize for developing an "economic" method of inferring population properties from just such skewed and possibly self-selected data. In particular, he addressed self-selection (questionnaires returned, etc.) and developed an analysis based on considering the rational optimization of benefit in returning or not returning the questionnaire. There are some papers online where he or his students discuss this methodology in particular cases.
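
To make the self-selection idea concrete, here is a rough simulation of the kind of correction a Heckman-style selection model applies. It is a simplified sketch, not Heckman's full procedure: the selection threshold `c`, the correlation `rho` and the scale `sigma` are assumed known here, whereas in practice they would be estimated (typically a probit step for selection, then a regression that includes the inverse Mills ratio). All names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()

def inverse_mills(c):
    """Inverse Mills ratio: standard normal pdf(c) / cdf(c)."""
    return STD_NORMAL.pdf(c) / STD_NORMAL.cdf(c)

def simulate_selected(n=100_000, mu=10.0, sigma=2.0, rho=0.6, c=0.3, seed=0):
    """Simulate outcomes y = mu + sigma * e, observed only when c + u > 0.

    u and e are standard normal with correlation rho, so individuals who
    respond (are selected) differ systematically from the population.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # selection error
        e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        if c + u > 0:  # the individual self-selects into the data
            kept.append(mu + sigma * e)
    return kept

mu, sigma, rho, c = 10.0, 2.0, 0.6, 0.3
sample = simulate_selected(mu=mu, sigma=sigma, rho=rho, c=c)

# The naive mean of the selected group overstates the population mean by
# rho * sigma * inverse_mills(c); subtracting that term undoes the bias.
naive_mean = fmean(sample)
corrected_mean = naive_mean - rho * sigma * inverse_mills(c)
print(f"population mean: {mu}, naive: {naive_mean:.3f}, corrected: {corrected_mean:.3f}")
```

The point of the sketch is only that the bias has a known form once the selection mechanism is modeled; the quality of the correction depends entirely on how well that mechanism is specified.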
 
