Simulating a distribution in R?

  • Context: Undergrad
  • Thread starter: moonman239
  • Tags: Distribution

Discussion Overview

The discussion revolves around simulating a variable in R that maintains the same distribution as a given dataset. Participants explore various methods, including Monte Carlo simulations and bootstrapping, while addressing the use of random number generators and specific functions in R.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • The thread starter asks how to simulate, in R, a variable that has the same distribution as an existing dataset.
  • Others suggest using the Monte Carlo method for computer simulations.
  • One participant mentions the need for a random number generator that allows specification of distribution parameters for simulations, noting that the simulated parameters will only match the original distribution on average.
  • There is a discussion about the existence of functions in R for simulating variables from known distributions, with some participants expressing uncertainty about specific functions.
  • Another participant proposes bootstrapping directly from the observed data as an alternative to estimating the distribution, mentioning the use of the sample() function and more sophisticated boot() functions.

Areas of Agreement / Disagreement

Participants express differing views on the best approach to simulate the distribution, with some favoring Monte Carlo methods and others advocating for bootstrapping. The discussion remains unresolved regarding the most effective method.

Contextual Notes

Participants reference the need for specific functions in R and the limitations of simulations matching the original distribution only on average. There is also mention of various statistical distributions and methods without consensus on the best approach.

Who May Find This Useful

Readers interested in statistical simulations, R programming, and methods for data analysis may find this discussion relevant.

moonman239
I have a dataset in R. What I want to do is simulate a variable that holds the same distribution. How do I do this?
 
Are you interested in a computer simulation? Look up Monte Carlo method.
 
mathman said:
Are you interested in a computer simulation?

Yes

mathman said:
Look up Monte Carlo method.

I know about Monte Carlo simulations.
 
moonman239 said:
I have a dataset in R. What I want to do is simulate a variable that holds the same distribution. How do I do this?

You need a random number generator where you can specify the distribution parameters for N simulations. Of course, the simulated distribution parameters will only match your template on average. I worked with simulations in Minitab, where you could specify four moments for a normal distribution, and also for a few others such as the Poisson and binomial. You can write your own programs using the PDFs and MGFs with randomly generated parameters (i.e., simulated random sample means around a specified population mean), if you like doing that sort of thing.
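A minimal sketch of this parametric approach in R, assuming the data are reasonably described by a normal distribution. The `observed` vector here is a hypothetical stand-in for the real dataset, and the normal model is an assumption for illustration, not something stated in the thread. The sketch estimates the parameters, simulates from the fitted distribution, and shows that individual simulated means match the estimate only on average.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for the observed dataset
observed <- rnorm(200, mean = 10, sd = 2)

# Estimate the parameters of the assumed normal model from the data
mu_hat    <- mean(observed)
sigma_hat <- sd(observed)

# One simulated variable of the same length as the data
simulated <- rnorm(length(observed), mean = mu_hat, sd = sigma_hat)

# Repeat the simulation many times and record each simulated sample mean
n_sims    <- 2000
sim_means <- replicate(
  n_sims,
  mean(rnorm(length(observed), mean = mu_hat, sd = sigma_hat))
)

# Individual runs scatter around mu_hat; their average is close to it
print(range(sim_means))
print(mean(sim_means) - mu_hat)
```

If the data are clearly not normal, the same pattern applies with a different fitted family, or the resampling approach suggested later in the thread avoids choosing a family at all.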
 
SW VandeCarr said:
You need a random number generator where you can specify the distribution parameters for N simulations.

I know. Is there a function to do that in R? I know you can simulate variables from widely-known distributions (normal, Poisson, uniform, chi-square, etc.)
 
moonman239 said:
I know. Is there a function to do that in R? I know you can simulate variables from widely-known distributions (normal, Poisson, uniform, chi-square, etc.)

A good stats package should have this ability. I don't specifically know about R. Did you check commands that begin with RAND?
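For reference, base R does not use a RAND prefix. Its random generators are named with a leading `r` followed by the distribution's abbreviation. A short sketch of the ones mentioned in this thread follows; the parameter values are arbitrary choices for illustration.

```r
set.seed(1)  # fixed seed so the draws are reproducible

n <- 5  # number of values to draw from each distribution

x_norm  <- rnorm(n, mean = 0, sd = 1)          # normal
x_pois  <- rpois(n, lambda = 3)                # Poisson
x_unif  <- runif(n, min = 0, max = 1)          # uniform
x_chisq <- rchisq(n, df = 4)                   # chi-square
x_binom <- rbinom(n, size = 10, prob = 0.3)    # binomial

print(x_norm)
print(x_pois)
```

Each generator has matching `d`, `p`, and `q` functions (for example `dnorm`, `pnorm`, `qnorm`) for the density, the cumulative distribution function, and the quantile function.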
 
Anything else? I don't think that helped. Thanks, anyways.
 
Rather than trying to estimate the distribution from which the data were drawn and then using that (parameterized) distribution to simulate from (which involves random number generators), you can bootstrap straight from the observed data, i.e., keep resampling from the observed data, with or without replacement (with replacement to get iid sampling). Look at the function sample(); there are more sophisticated boot() functions too.
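A minimal sketch of the resampling idea using `sample()` from base R. The `observed` vector is a hypothetical stand-in for the real dataset, made skewed on purpose so that no normal model would fit it well.

```r
set.seed(123)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in for the observed dataset (skewed on purpose)
observed <- rexp(150, rate = 0.5)

# Resample with replacement: each draw comes from the empirical distribution
simulated <- sample(observed, size = 1000, replace = TRUE)

# Every simulated value is one of the observed values
print(all(simulated %in% observed))

# Compare a few quantiles of the original and the simulated values
probs <- c(0.1, 0.25, 0.5, 0.75, 0.9)
print(rbind(
  observed  = quantile(observed, probs),
  simulated = quantile(simulated, probs)
))
```

One consequence of this design is that the simulated variable can never contain a value that was not already in the data, so it reproduces the observed distribution but cannot extrapolate beyond its range.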
 
