Understanding Percentiles and Their Applications in Statistical Analysis

  • Thread starter: torquerotates
  • Tags: Concepts
AI Thread Summary
The discussion clarifies the concept of percentiles for a sample of independent and identically distributed (iid) random variables. For a sample of three values, the first reply takes the 80th percentile to be the largest value and the 40th percentile to be the smallest. The original question also asks whether, for ten random variables, the 40th percentile is the fourth value in sorted order (the 4th smallest, not the 4th largest). The conversation then distinguishes between estimating percentiles from a modeled distribution and calculating them directly from the data. Each approach has its own advantages and disadvantages depending on the analytical goals.
torquerotates
Sorry about asking such a basic question but I'm having a brain fart. So if I have a sample of 3 iid random variables X1, X2, X3, I know the median is just the middle value. So does that mean that the 80th percentile is the third largest one and the 40th percentile is the smallest one?

If I have 10 random variables, would the 40th percentile be the 4th largest one?
 
Your sample contains only three values? Then, yes: because 80% is larger than 50%, the "80th percentile" is just the largest value and, because 40% is less than 50%, the "40th percentile" is the smallest value.
 

To answer this a bit more thoroughly, you need to ask which of two things you want: (1) to assume that the data come from a model and get percentile information for a distribution whose parameters are estimated from the data, or (2) to treat your data in a distribution-free context and compute the actual percentiles from the data.

In the first case, you estimate the parameters of the distribution, typically with a valid point estimator, and then use the fitted distribution's CDF (the integral of the PDF) to get your percentiles: the p-th percentile is the value x at which the CDF equals p/100. You may have to solve this numerically, as with the normal or chi-square distributions, for example.
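As a minimal sketch of this model-based route, assuming a normal model and made-up data (the sample values and variable names are illustrative, not from the thread), Python's standard-library `statistics.NormalDist` can fit the parameters and invert the CDF numerically:

```python
from statistics import NormalDist

# Hypothetical sample; the values are made up for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.9, 5.2, 4.4]

# Point estimates of the mean and standard deviation, assuming a normal model.
fitted = NormalDist.from_samples(sample)

# The p-th percentile is the x with CDF(x) = p/100, i.e. the inverse CDF.
p40 = fitted.inv_cdf(0.40)
p80 = fitted.inv_cdf(0.80)

print(f"40th percentile: {p40:.3f}")
print(f"80th percentile: {p80:.3f}")
```

These percentiles need not coincide with any observed value, because they come from the fitted curve rather than from the data points themselves.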

In the second case, you basically sort all of your values, build a histogram (an empirical distribution) from them, and do the same thing as above, except with your histogram rather than an assumed model.
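A minimal sketch of the distribution-free route, assuming the nearest-rank convention (one of several definitions in use; the function name and the sample values are hypothetical):

```python
import math

def nearest_rank_percentile(values, p):
    """Return the p-th percentile (0 < p <= 100) by the nearest-rank rule."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    # Smallest rank k such that at least p% of the data is <= the k-th value.
    k = math.ceil(p * len(ordered) / 100)
    return ordered[k - 1]

three = [2.0, 7.0, 5.0]     # hypothetical sample of three values
ten = list(range(1, 11))    # hypothetical sample of ten values: 1..10

print(nearest_rank_percentile(three, 80))  # largest of the three
print(nearest_rank_percentile(ten, 40))    # 4th smallest of the ten
```

Other conventions, such as interpolating between neighbouring order statistics, can give different answers, especially for very small samples, so it is worth stating which rule is being used.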

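Both have advantages and disadvantages depending on what you are trying to do: the model-based approach gives smooth estimates and can reach percentiles beyond the observed data, but it is only as good as the assumed model; the distribution-free approach makes no such assumption, but with few data points its percentiles are coarse.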
 