- #1

fog37


I have a question about linear regression models and correlation. My understanding is that our finite set of data ##(x,y)## represents a random sample from a much larger population. Each pair is an observation in the sample.

We find, using OLS, the best fit line and its coefficients and run some statistical tests (t-test and F-test) to check the coefficients' statistical significance. The ultimate goal is to estimate with precision the population slope and intercept.
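To make that step concrete, here is a minimal sketch of a simple OLS fit with a t-statistic for the slope. The data are simulated, and the true intercept (1.0), slope (2.0), noise level, and the helper name `ols_fit` are all assumptions chosen for illustration, not anything taken from a real dataset.

```python
import numpy as np

def ols_fit(x, y):
    """Return intercept, slope, and the slope's t-statistic for simple OLS."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_mean, y_mean = x.mean(), y.mean()
    sxx = np.sum((x - x_mean) ** 2)
    # Closed-form least-squares estimates
    slope = np.sum((x - x_mean) * (y - y_mean)) / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters)
    residuals = y - (intercept + slope * x)
    sigma2_hat = np.sum(residuals ** 2) / (n - 2)
    se_slope = np.sqrt(sigma2_hat / sxx)
    # t-statistic for the null hypothesis "slope = 0"
    t_slope = slope / se_slope
    return intercept, slope, t_slope

# Simulated sample (assumed population line: y = 1.0 + 2.0 x + noise)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, size=200)

b0, b1, t1 = ols_fit(x, y)
print(f"intercept={b0:.3f}, slope={b1:.3f}, t(slope)={t1:.1f}")
```

Note that in this sketch the ##x## values are simply treated as given numbers when computing the fit; only the noise in ##y## enters the standard error.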

Does each pair ##(x,y)## represent a realization of a bivariate random variable ##Z=(X,Y)## with a Gaussian joint distribution? In regression analysis, are both ##X## and ##Y## random variables, or is only ##Y## random? A random variable has a set of possible values and associated probabilities. Two random variables ##X## and ##Y## are said to be jointly normal if every linear combination ##aX+bY##, for all constants ##a## and ##b##, has a normal distribution.

That said, how do we get to the linear model ##y = \beta_1 x + \beta_0## if both ##X## and ##Y## are treated as random variables?
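A small simulation may help frame this. Assuming ##(X,Y)## really is bivariate normal, a standard result is that the conditional mean is linear in ##x##: ##E[Y \mid X=x] = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(x-\mu_X)##. The sketch below draws such pairs and checks that an OLS fit lands near that line. All parameter values (means, standard deviations, ##\rho##) are made up for illustration.

```python
import numpy as np

# Assumed population parameters (illustrative only)
mu_x, mu_y = 2.0, 5.0
sigma_x, sigma_y, rho = 1.5, 3.0, 0.6

# Covariance matrix of the bivariate normal (X, Y)
cov = [
    [sigma_x ** 2, rho * sigma_x * sigma_y],
    [rho * sigma_x * sigma_y, sigma_y ** 2],
]

# Draw a random sample in which both X and Y are random
rng = np.random.default_rng(1)
sample = rng.multivariate_normal([mu_x, mu_y], cov, size=5000)
xs, ys = sample[:, 0], sample[:, 1]

# OLS fit of y on x (polyfit returns the highest-degree coefficient first)
slope_hat, intercept_hat = np.polyfit(xs, ys, 1)

# Slope and intercept implied by the conditional mean E[Y | X = x]
slope_theory = rho * sigma_y / sigma_x
intercept_theory = mu_y - slope_theory * mu_x

print(f"fitted:  slope={slope_hat:.3f}, intercept={intercept_hat:.3f}")
print(f"implied: slope={slope_theory:.3f}, intercept={intercept_theory:.3f}")
```

If this picture is right, the line would describe the mean of ##Y## at each fixed ##x## rather than a deterministic relation between the two variables.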

Thank you!