Probability and Statistics Questions

SUMMARY

This discussion focuses on probability and statistics questions based on real-world scenarios: airline overbooking, cricket chirping frequency, father-son height correlation, hypertension classification by a blood-pressure machine, and influenza among parents. Key calculations include z-scores for passenger arrivals, a least-squares regression of chirp rate on temperature, and conditional probabilities for the hypertensive classification. The original poster gives their own answers to each question and asks for them to be checked, covering the normal distribution, regression analysis, and conditional probability.

PREREQUISITES
  • Understanding of normal distribution and z-scores
  • Familiarity with regression analysis and R-squared values
  • Knowledge of correlation coefficients and their implications
  • Basic principles of conditional probability and marginal distributions
NEXT STEPS
  • Explore advanced techniques in regression analysis, including residual plots
  • Study the Central Limit Theorem and its applications in statistics
  • Learn about Bayesian probability and its relevance in medical testing
  • Investigate statistical software tools such as R or Python for data analysis
USEFUL FOR

Students, educators, and professionals in statistics, data analysis, and research fields who are looking to deepen their understanding of probability concepts and statistical methods.

His_Dudeness3

Hey everyone, I've got a Statistics project due on Wednesday and I've got it all pretty much done except for a couple of questions. The ones I've got a problem with are in bold.

Question 1.

Airlines usually over-book the seats on an aircraft by a certain margin because they know from experience that some people change or do not show for their scheduled flight. Data collected for a particular Melbourne–Darwin flight showed that, on average, 230 people (with a standard deviation of 30) did arrive for their scheduled flight. The data followed a normal distribution. The aircraft has seats for 305 passengers.
a) What are the z-scores for 150, 200 and 250 arrivals?
b) What is the z-score that represents a completely filled flight? What is the probability that a randomly selected flight has enough seats available for all the people who turn up?
c) This particular flight goes every day. During one year of operation, how many times would you expect there to be more passengers than available seats? Justify your answer.
d) What proportion of flights have less than half the seats occupied?

Question 2.

Crickets make a characteristic chirping sound by rapidly rubbing their wing covers together. Researchers decided to investigate the relationship between the temperature and the frequency of the chirps. They obtained the following data from observing a particular type of cricket:
Chirps per second    Temperature (degrees Celsius)
14.5                 22.1
15.5                 23.2
20.0                 32.1
18.5                 28.0
16.4                 25.1
19.7                 33.0
17.1                 26.3
15.8                 24.8
16.7                 25.1
15.9                 23.9
17.0                 27.2
17.8                 27.6
18.9                 31.0
18.1                 29.7

a) Decide which one is the explanatory (predictor) variable and which is the response variable and produce the scatterplot with a line of best fit and the r-squared value (include this in your answer).
b) If the temperature is 33°C, how often would you estimate the cricket is chirping?
c) Suppose a cricket is chirping 1050 times a minute. Use the line you have found to estimate the temperature at that time. Is this line the best for estimating temperature?

Question 3.

The heights of a group of fathers have mean 175 cm and SD 5 cm. The sons' heights have mean 180 cm and SD 6 cm. The correlation between father and son heights is 0.5.
a) If a father has a height of 184 cm, estimate the height of the son.
b) How tall should a father be for the estimated height of his son to be the same?
c) Why doesn’t regression to the mean imply that we all end up with the same height?

Question 4.

Suppose 95% of hypertensives (high blood pressure) and 20% of normotensives are classified as hypertensive by a blood-pressure machine. Given that 20% of the population are hypertensive, what is the probability that someone classified as hypertensive by the machine really is hypertensive?

Question 5.

Suppose an influenza epidemic strikes a city. 1000 2-parent families are surveyed. In 10% of families the mother gets the disease (includes the possibility the father does too), in 10% the father gets it (includes the possibility the mother does too) and in 2% both do.
a) What is the marginal distribution for disease among the parents?
b) What is the relative frequency the father gets the disease given the mother does?
c) What is the probability neither father nor mother gets the disease?


The Answers I've gotten for the following are as follows (includes ones I don't think I need help with):

1. (a) z(150) = -2.667, z(200) = -1, z(250) = 0.667
(b) z(completely filled flight) = 2.50. The proportion of flights with enough seats for everyone who shows up is 0.9938 (I got this from the normal distribution table as the proportion with z < 2.50).
(c) Proportion of flights not having enough seats = 1 - 0.9938 = 0.0062. I then multiplied this by 365 to get the number of flights per year that don't have enough seats, which equals 2.263 flights/year (~3 flights/year).
(d) I got the z-score for this as -2.60 (I used 152 as the raw score, as the question says 'less than half the seats occupied'), and the corresponding proportion is 0.0047 (~2 flights/year).
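As a cross-check on the Question 1 figures, here is a minimal Python sketch using the standard library's `statistics.NormalDist`. It assumes arrivals are exactly Normal(230, 30), as the question states; the variable names are illustrative only. It also assumes a cutoff of 152.5 (exactly half of 305 seats) for part (d) rather than the rounded 152, so the last figure differs slightly from a table lookup at z = -2.60.

```python
from statistics import NormalDist

# Assumption: arrivals follow Normal(mean=230, sd=30), per the question.
arrivals = NormalDist(mu=230, sigma=30)
seats = 305

def z_score(x):
    """Standardise a raw arrival count against the arrivals distribution."""
    return (x - arrivals.mean) / arrivals.stdev

# (a) z-scores for 150, 200 and 250 arrivals
z_values = {x: round(z_score(x), 3) for x in (150, 200, 250)}

# (b) z-score of a full flight and P(arrivals <= seats)
z_full = z_score(seats)
p_enough_seats = arrivals.cdf(seats)

# (c) expected number of over-full flights in a 365-day year
expected_overfull = 365 * (1 - p_enough_seats)

# (d) proportion of flights with fewer than half the seats occupied
p_under_half = arrivals.cdf(seats / 2)

print(z_values, z_full)
print(round(p_enough_seats, 4), round(expected_overfull, 2))
print(round(p_under_half, 4))
```

Working from the unrounded tail probability, the expected count in (c) comes out at about 2.27 flights per year, which points to roughly two over-full flights a year.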

2. (a) Chirps/second = response variable, temperature = explanatory variable.
Regression line: y = 0.4713x + 4.517 (y = chirps/second, x = temperature in degrees Celsius)
(b) Using the regression line, I got y = 20.1 chirps/second.
(c) I converted 1050 chirps/minute to 17.5 chirps/second by dividing 1050 by 60. I substituted this into the regression line and got a temperature of 27.5 degrees Celsius. I said that because the R^2 value wasn't greater than 0.99, the line can't give an accurate estimate. I was wondering whether I need to go on and do a residual plot for the linear relationship to see if this is the best line to estimate from.
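The regression in Question 2 can be reproduced with a short pure-Python sketch. The helper `least_squares` is a hypothetical name introduced here, and it implements ordinary least squares from the usual sums of squares. For part (c) the sketch compares two options: inverting the chirps-on-temperature line, and fitting a separate line with temperature as the response. The second is the usual choice when the goal is to predict temperature, and it is offered here as one possible reading of "is this line the best", not as the assignment's expected answer.

```python
chirps = [14.5, 15.5, 20.0, 18.5, 16.4, 19.7, 17.1,
          15.8, 16.7, 15.9, 17.0, 17.8, 18.9, 18.1]
temps = [22.1, 23.2, 32.1, 28.0, 25.1, 33.0, 26.3,
         24.8, 25.1, 23.9, 27.2, 27.6, 31.0, 29.7]

def least_squares(x, y):
    """Ordinary least squares of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy ** 2 / (sxx * syy)

# (a) chirps (response) regressed on temperature (explanatory)
slope, intercept, r_squared = least_squares(temps, chirps)

# (b) predicted chirp rate at 33 degrees Celsius
chirps_at_33 = intercept + slope * 33

# (c) 1050 chirps/minute = 17.5 chirps/second
rate = 1050 / 60
temp_inverted = (rate - intercept) / slope        # invert the line from (a)
slope_t, intercept_t, _ = least_squares(chirps, temps)
temp_direct = intercept_t + slope_t * rate        # temperature regressed on chirps

print(round(slope, 4), round(intercept, 3), round(r_squared, 3))
print(round(chirps_at_33, 1), round(temp_inverted, 1), round(temp_direct, 1))
```

On this data both approaches give a temperature close to 27.5 degrees Celsius, because R^2 is roughly 0.95 and the two fitted lines nearly coincide; with a weaker correlation they would diverge more.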

3. (a) Using y(estimate) = y(mean) + r * (Sx/Sy) * (x - x(mean)), I got y(estimate) = 183.8 cm.
(b) To get the same height as the estimate, I let y(estimate) = x. I then solved for x and got x = 183.6 cm as the height at which the son's estimated height and the father's actual height are the same.
(c) I said that regression to the mean (i.e. regression or correlation) doesn't imply or explain causality, only how one variable varies with respect to another. Thus, the sons' heights won't be solely affected by their fathers' heights.
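For Question 3, the usual regression prediction is son_estimate = mean_son + r * (SD_son / SD_father) * (father - mean_father), with the response SD over the predictor SD. The sketch below assumes that standard form; `predict_son` and the other names are hypothetical. With this ratio the results are 185.4 cm for part (a) and 187.5 cm for part (b), which differ from the 183.8 cm and 183.6 cm that result from using the ratio the other way up.

```python
import math

# Summary statistics from the question
MEAN_FATHER, SD_FATHER = 175.0, 5.0
MEAN_SON, SD_SON = 180.0, 6.0
R = 0.5

# Slope of the regression of son's height on father's height
SLOPE = R * SD_SON / SD_FATHER

def predict_son(father_height):
    """Regression estimate of a son's height from his father's height."""
    return MEAN_SON + SLOPE * (father_height - MEAN_FATHER)

# (a) father is 184 cm tall
estimate_184 = predict_son(184)

# (b) solve x = MEAN_SON + SLOPE * (x - MEAN_FATHER) for x
same_height = (MEAN_SON - SLOPE * MEAN_FATHER) / (1 - SLOPE)

# (c) spread of actual sons' heights around the regression line
residual_sd = SD_SON * math.sqrt(1 - R ** 2)

print(round(estimate_184, 1), round(same_height, 1), round(residual_sd, 2))
```

The residual SD of about 5.2 cm is relevant to part (c): the regression line only describes the average son for a given father height, and individual sons scatter around it, so the overall spread of heights (SD 6 cm) is maintained from one generation to the next.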

4. I used this convention: C+= people classified as hypertensive, C-= people classified as normotensive, H+= people who are actually hypertensive, H-= people who are actually normotensive. I used a table to display the data:

            H+     H-    Total
C+          19     16      35
C-           1     64      65
Total       20     80     100

Thus, the proportion of people who are actually hypertensive, given that they're classified as hypertensive, is 19/35 (0.543).
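The same figure follows directly from Bayes' theorem. A minimal sketch, with variable names chosen here for illustration and the three input percentages taken from the question:

```python
# Inputs from the question
p_hypertensive = 0.20               # P(H+): share of the population that is hypertensive
p_flag_given_hypertensive = 0.95    # P(C+ | H+)
p_flag_given_normotensive = 0.20    # P(C+ | H-)

# Total probability of being classified as hypertensive, P(C+)
p_flagged = (p_flag_given_hypertensive * p_hypertensive
             + p_flag_given_normotensive * (1 - p_hypertensive))

# Bayes' theorem: P(H+ | C+) = P(C+ | H+) * P(H+) / P(C+)
p_hypertensive_given_flag = (p_flag_given_hypertensive * p_hypertensive) / p_flagged

print(round(p_flagged, 2), round(p_hypertensive_given_flag, 3))
```

The numerator 0.19 and denominator 0.35 correspond to the 19 and 35 in the C+ row of the table.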

5. This one was a real doozy, especially when it came to interpreting the data. I interpreted the data as Pr(father gets the disease) = 10% and Pr(mother gets the disease) = 10%.
(a) I set up the distribution as follows:
A= Mother Gets Disease, A*= Mother doesn't get disease, B= Father gets disease, B*= Father doesn't get disease

          A       A*     Total
B        0.02    0.08    0.10
B*       0.08    0.82    0.90
Total    0.10    0.90    1.00
(b) I got the probability for this using Pr(AB)/Pr(A), since the condition is that the mother gets the disease, which gives 0.02/0.10 = 0.20.
(c) From the table, Pr(A*B*) = 0.82.
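The Question 5 table can be rebuilt from the three given percentages. This is a minimal sketch with illustrative names; it assumes, as in the post, that the two 10% figures are the marginal probabilities for mother and father and that 2% is the joint probability.

```python
# Given in the question
p_mother = 0.10   # P(A): mother gets the disease
p_father = 0.10   # P(B): father gets the disease
p_both = 0.02     # P(A and B)

# Fill in the remaining cells of the 2x2 joint table
p_mother_only = p_mother - p_both
p_father_only = p_father - p_both
p_neither = 1 - (p_both + p_mother_only + p_father_only)

# (b) conditional relative frequency: father sick given mother sick
p_father_given_mother = p_both / p_mother

print(round(p_mother_only, 2), round(p_father_only, 2), round(p_neither, 2))
print(round(p_father_given_mother, 2))
```

Because both marginals are 10%, dividing by Pr(A) or Pr(B) gives the same 0.20 here, but only Pr(A) matches the condition "given the mother does".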

Any help is greatly appreciated! Oh yeah, sorry about the dodgy KV maps for questions (4) and (5)(a); no matter how I manipulate them, the characters just go straight to the margin. And for future reference, where should I post homework questions on statistics? Thanks, guys.
 
No one?? I knew this was a bullsh*t assignment!
 
