Relation between variables and distributions in statistics

In summary, the main difference between descriptive and inferential statistics discussed here is one of timing: descriptive statistics summarizes measurements that have already been made, while inferential statistics reasons about measurements before they are made. A random variable stands for a quantity whose value has not yet been observed, so it has to cover every possible outcome of the experiment, and its probability distribution gives the probability of each of those outcomes. Once the experiment is run, the observed outcomes can be summarized with descriptive tools such as frequency tables, and one hopes that this observed distribution agrees with the probability distribution.
  • #1
Mr Davis 97
I am a little confused about how variables are related to distributions as one moves from descriptive statistics to inferential statistics. I know that a variable in descriptive statistics is some measurable characteristic of some phenomenon, and its distribution is some description (a table or graph) of how the values of this variable vary. This seems fairly comprehensible. But then I was introduced to the concept of a random variable and its associated probability distribution. My main question is: what is the difference between descriptive statistical variables and random variables, and what is the difference between the distribution of a regular variable and the probability distribution of a random variable? They seem like analogues, but I am just not seeing the "big picture" in terms of what I am doing in statistics with these random variables, distributions, and probability distributions. If anybody could give me a clear description of how I should be thinking about all of this, it would be greatly appreciated.
 
  • #2
The big difference between descriptive and inferential statistics is time. I mean this: descriptive statistics happens after all the measurements are made, inferential statistics happens before all the measurements are made. As such, descriptive statistics just describe the system, while inferential statistics tries to predict the system.

So a variable in descriptive statistics is pretty logical: it is some quantity that has been measured and that we have certain measurements for. Random variables are a lot harder since the measurement has not yet been made. Again, random variables are certain quantities. But now we must prepare ourselves for all possible outcomes of the experiment! So a random variable measures all possible outcomes of a measurement and the probability distribution gives the probabilities for these outcomes. The idea is that we then do an experiment and get certain outcomes. These outcomes can be described with descriptive statistics and we hope that the distribution (in the descriptive sense) agrees with the probability distribution.
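
A minimal sketch of that last step, assuming the experiment is rolling a fair six-sided die (the die, the seed, and the function names are illustrative and not from this thread). The probability distribution is written down before any roll; the relative frequency table is computed afterwards from the simulated rolls, and the two are printed side by side.

```python
import random
from collections import Counter
from fractions import Fraction

# Probability distribution of the random variable "face shown by the die".
# This is an assumed model (a fair die), fixed before any roll is made.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def roll_die(n_rolls, seed=0):
    """Run the experiment: simulate n_rolls rolls and return the outcomes."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n_rolls)]


def relative_frequencies(outcomes):
    """Descriptive summary: the share of the sample taken by each face."""
    counts = Counter(outcomes)
    return {face: counts.get(face, 0) / len(outcomes) for face in range(1, 7)}


outcomes = roll_die(10_000)
observed = relative_frequencies(outcomes)

for face in range(1, 7):
    print(f"face {face}: model {float(pmf[face]):.3f}, observed {observed[face]:.3f}")
```

With many rolls the observed column should sit close to 1/6 for every face; with only a handful of rolls it can differ noticeably, which is exactly the gap between the descriptive distribution and the probability distribution.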
 
  • #3
micromass said:
The big difference between descriptive and inferential statistics is time. I mean this: descriptive statistics happens after all the measurements are made, inferential statistics happens before all the measurements are made. As such, descriptive statistics just describe the system, while inferential statistics tries to predict the system.

So a variable in descriptive statistics is pretty logical: it is some quantity that has been measured and that we have certain measurements for. Random variables are a lot harder since the measurement has not yet been made. Again, random variables are certain quantities. But now we must prepare ourselves for all possible outcomes of the experiment! So a random variable measures all possible outcomes of a measurement and the probability distribution gives the probabilities for these outcomes. The idea is that we then do an experiment and get certain outcomes. These outcomes can be described with descriptive statistics and we hope that the distribution (in the descriptive sense) agrees with the probability distribution.

Okay, I see. So would it be correct to say something along the lines of: inferential statistics uses random variables and their associated probability distributions to build a theoretical idealization of an experiment, in terms of its possible outcomes and how those outcomes are distributed? Also, another question: why do we only describe the distribution of a random variable with a probability distribution? Why are there no other ways, analogous to those in descriptive statistics, such as a frequency table?
 

What is the relationship between variables and distributions in statistics?

Variables describe measurable characteristics or attributes of a population or sample, while a distribution describes how the values of a variable are spread over the data. A distribution can be presented as a table (for example a frequency table), a graph (for example a histogram), or a formula (for example a probability density function).

What is the difference between a dependent and independent variable?

A dependent variable is the variable that is measured or observed as the outcome, and whose value is expected to depend on other variables. An independent variable, on the other hand, is a variable that is changed or controlled by the researcher and is believed to have an effect on the dependent variable.

How are variables and distributions related in hypothesis testing?

In hypothesis testing, the data are treated as observed values of random variables, and the null hypothesis specifies, fully or in part, the probability distribution of those variables. A test statistic is calculated from the sample and compared with its distribution under the null hypothesis: if the observed value falls beyond a critical value from that distribution (equivalently, if the p-value is below the chosen significance level), the null hypothesis is rejected.
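
As an illustration of comparing a test statistic with a known distribution, here is a small sketch of an exact two-sided binomial test. Everything specific in it is an assumption made for the example: the data (60 heads in 100 coin flips), the null hypothesis (a fair coin, p = 0.5), the 5% significance level, and the function names. "Two-sided" is taken here to mean outcomes at least as far from the expected count as the observed one.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(observed, n, p_null):
    """Probability, under the null, of a count at least as far from the
    expected count n * p_null as the observed count is."""
    expected = n * p_null
    distance = abs(observed - expected)
    return sum(
        binomial_pmf(k, n, p_null)
        for k in range(n + 1)
        if abs(k - expected) >= distance
    )


# Made-up data for illustration: 60 heads in 100 flips.
n_flips, n_heads = 100, 60
alpha = 0.05  # assumed significance level

p_value = two_sided_p_value(n_heads, n_flips, p_null=0.5)
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}, reject null at {alpha:.0%}: {reject_null}")
```

With these made-up numbers the p-value comes out a little above 0.05, so the fair-coin hypothesis is not rejected at the 5% level.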

What are the different types of distributions commonly seen in statistics?

Some of the most commonly seen distributions in statistics include the normal distribution, which is bell-shaped and symmetric about its mean; the binomial distribution, which gives the number of successes in a fixed number of independent trials that each have only two possible outcomes; and the exponential distribution, which is used to model the time between events that occur independently at a constant average rate.
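
A short sketch that draws samples from these three distributions using only Python's standard library and compares each sample mean with the theoretical mean. The parameter values, the seed, and the sample size are arbitrary choices for the example; the binomial draws are built by counting successes in repeated two-outcome trials.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable
N = 20_000               # number of draws per distribution

# Normal distribution with mean 10 and standard deviation 2.
normal_sample = [rng.gauss(10.0, 2.0) for _ in range(N)]

# Binomial distribution: successes in 20 independent trials,
# each succeeding with probability 0.3.
binomial_sample = [sum(rng.random() < 0.3 for _ in range(20)) for _ in range(N)]

# Exponential distribution with rate 0.5 events per unit time.
exponential_sample = [rng.expovariate(0.5) for _ in range(N)]

# Theoretical means: mu, n * p, and 1 / rate.
theoretical_means = {"normal": 10.0, "binomial": 20 * 0.3, "exponential": 1 / 0.5}
sample_means = {
    "normal": mean(normal_sample),
    "binomial": mean(binomial_sample),
    "exponential": mean(exponential_sample),
}

for name, theory in theoretical_means.items():
    print(f"{name:12s} theoretical mean {theory:.2f}, sample mean {sample_means[name]:.2f}")
```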

Why is it important to understand the relationship between variables and distributions in statistics?

Understanding the relationship between variables and distributions in statistics is important because it allows us to accurately interpret and analyze data. By understanding how variables are related to each other and how they are distributed, we can make informed decisions and draw meaningful conclusions from the data. It also helps us to identify patterns and trends, and make predictions about future outcomes.
