# Relation between variables and distributions in statistics

I am a little confused about how variables are related to distributions as one moves from descriptive statistics to inferential statistics. I know that a variable in descriptive statistics is some measurable characteristic of some phenomenon, and its distribution is some description (a table or graph) of how the values of this variable vary. This seems fairly comprehensible. But then I was introduced to the concept of a random variable and its associated probability distribution. My main question is: what is the difference between descriptive statistical variables and random variables, and what is the difference between the distribution of a regular variable and the probability distribution of a random variable? They seem like analogues, but I am just not seeing the "big picture" of what I am doing in statistics with these random variables, distributions, and probability distributions. If anybody could give me a clear description of how I should be thinking about all of this, it would be greatly appreciated.

micromass
Staff Emeritus
Homework Helper
The big difference between descriptive and inferential statistics is time. By this I mean: descriptive statistics happens after all the measurements have been made, while inferential statistics happens before all the measurements are made. As such, descriptive statistics just describes the system, while inferential statistics tries to predict the system.

So a variable in descriptive statistics is pretty logical: it is some quantity that has been measured and for which we have certain measurements. Random variables are a lot harder, since the measurement has not yet been made. Again, a random variable is a certain quantity, but now we must prepare ourselves for all possible outcomes of the experiment! So a random variable represents all possible outcomes of a measurement, and its probability distribution gives the probabilities of these outcomes. The idea is that we then do an experiment and get certain outcomes. These outcomes can be described with descriptive statistics, and we hope that the distribution (in the descriptive sense) agrees with the probability distribution.
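To make this concrete, here is a minimal sketch in Python. It assumes the experiment is rolling a fair six-sided die; the die, the number of rolls, the seed, and all function names are illustrative assumptions and not something from the question. The probability distribution is written down before any roll happens, and the relative-frequency table is computed afterwards from the simulated rolls.

```python
import random
from collections import Counter
from fractions import Fraction

# Assumed experiment: one roll of a fair six-sided die.
FACES = range(1, 7)

# Probability distribution of the random variable X = "face shown".
# It is specified BEFORE any measurement is made.
probability_distribution = {face: Fraction(1, 6) for face in FACES}


def simulate_rolls(n, seed=0):
    """Perform the experiment n times (simulated) and return the outcomes."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n)]


def relative_frequencies(rolls):
    """Descriptive distribution: relative frequency of each face in the data."""
    counts = Counter(rolls)
    n = len(rolls)
    return {face: counts[face] / n for face in FACES}


if __name__ == "__main__":
    rolls = simulate_rolls(10_000)
    observed = relative_frequencies(rolls)
    print("face  P(X = face)  observed relative frequency")
    for face in FACES:
        p = float(probability_distribution[face])
        print(f"{face:>4}  {p:>11.4f}  {observed[face]:>27.4f}")
```

With many rolls, the observed relative frequencies should land close to 1/6 each, which is the sense in which we hope the descriptive distribution agrees with the probability distribution.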


Okay, I see. So would it be correct to say something along the lines of: inferential statistics uses random variables and their associated probability distributions to theoretically idealize a certain experiment in terms of its outcomes and the distribution of those outcomes? Also, another question: why do we only describe the distribution of a random variable with a probability distribution? Why are there not other ways, analogous to those in descriptive statistics, such as a frequency table?