What is Regression: Definition and 359 Discussions

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome variable') and one or more independent variables (often called 'predictors', 'covariates', or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion. For example, the method of ordinary least squares computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane). For specific mathematical reasons (see linear regression), this allows the researcher to estimate the conditional expectation (or population average value) of the dependent variable when the independent variables take on a given set of values. Less common forms of regression use slightly different procedures to estimate alternative location parameters (e.g., quantile regression or Necessary Condition Analysis) or estimate the conditional expectation across a broader collection of non-linear models (e.g., nonparametric regression).
Regression analysis is primarily used for two conceptually distinct purposes. First, regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Second, in some situations regression analysis can be used to infer causal relationships between the independent and dependent variables. Importantly, regressions by themselves only reveal relationships between a dependent variable and a collection of independent variables in a fixed dataset. To use regressions for prediction or to infer causal relationships, respectively, a researcher must carefully justify why existing relationships have predictive power for a new context or why a relationship between two variables has a causal interpretation. The latter is especially important when researchers hope to estimate causal relationships using observational data.
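The ordinary least squares criterion described above can be made concrete with a small sketch. The function names `ols_line` and `sum_squared_residuals` and the sample data are illustrative assumptions, not part of the original text; the code handles only the one-predictor case.

```python
def ols_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared differences between the data and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, chosen only for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
intercept, slope = ols_line(xs, ys)
best = sum_squared_residuals(xs, ys, intercept, slope)
```

Any other line, for example one with a slightly different slope, gives a sum of squared residuals at least as large as `best`, which is what "most closely fits the data" means under this criterion.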

View More On Wikipedia.org
  1. M

    Minimization and least squares/ridge regression

    Homework Statement f(x;a) = a_0 + (a_1, a_2, a_3, ..., a_d) * x; min over a of (Xa - Y)^t σ^(-1) (Xa - Y), where a = (a_0 a_1 a_2 a_3 a_4 ... a_d)^t. Homework Equations Y = (y_1 y_2 y_3 ... y_k), X = design matrix. The Attempt at a Solution To minimize, write (X(a + δa) - Y)^t σ^(-1) (X(a + δa) - Y)...
  2. Z

    Ridge Regression Minimization Proof

    Homework Statement Linear family: [tex]f(x;a) = a_{o} + (a_{1},a_{2},a_{3},...,a_{k}) \cdot x[/tex] [tex] (Xa - Y)^t \sigma^{-1} (Xa-Y) + \lambda (a^t a-a^2_{o}) [/tex] [tex] a = (X^t \sigma^{-1} X + \lambda I_{o})^{-1} X^t \sigma^{-1} Y [/tex] Homework Equations[tex] Y_{i} = f(x_{i}) +...
  3. B

    Regression analysis sample size problem

    Hi there, Could anybody offer any advice on a linear regression sample size problem? I am using regression to predict the energy consumption (watt/mile) of an electric car based on a number of parameters such as average velocity, max velocity, average acceleration, the number of stops...
  4. S

    Statistics question: error of slope in linear regression from r

    A text says that if you calculate the linear regression of data points and you get the equation y=mx+b with an r^2 value, the error in the slope is given by: δm/m=2(1-r). No explanation was given. Could someone please explain this formula? Thanks!
  5. J

    Logistic Regression Cost Function

    Hi, I am studying logistic regression and gradient ascent and have seen it used with a cost function and without one. Could anyone tell me why you would use a cost function? It seems just as effective without one. alpha = .05 h = data * weights error = labels - sigmoid(h)...
  6. L

    One question regarding simple regression

    There are 3 variables: X, Y, Z. Assume Y = a + b*X + errorTerm, Z = c + d*X + errorTerm, and Y = g + h*Z + errorTerm. What can we say about the relation between h and (b, d)? Under what condition do we have "h = b/d"? Thanks
  7. T

    Estimating Parameters in Multivariate Regression

    Homework Statement The Attempt at a Solution So I was wondering whether or not, in an instance of n observations and k explanatory variables, the following is an accurate statement: That is, the estimate of beta_1 found by only regressing y on x_1 is equal to the true multiple...
  8. A

    Regression Analysis: Does it Make Sense?

    Does this even make sense? Am told to do a multiple regression analysis. The response variable and the explanatory variables add up and should give up ~100 percent of the total product. Example: Milk = water + fat + protein ~= 100% (all are in terms of percentages) The regression I was...
  9. K

    How can I improve regression results by adjusting the goodness of fit metric?

    I'm doing some line fitting on experimental data. Basically I have some array of pixels, and a value measured at each pixel, and I am fitting it with several constrained Gaussians. I'm using a Levenberg-Marquardt nonlinear least squares algorithm called mpfit to fit the parameters, but the...
  10. C

    Effortlessly Linearize y(x) = a(1 - e^(-bx)) with Expert Help

    Hello! How do I linearize this function? y(x) = a(1 - e^(-bx)), where a and b are constants.
  11. S

    Linear Regression β: Estimating η with MLEs

    Homework Statement Data y1,y2...yn are modeled as observations of random variables Y1,..Yn given by Yi = α + β(xi-xbar) + σεi where α, β and σ are unknown parameters, x1,x2...xn are known constants, xbar is (1/n)Ʃxi, and the εi's are independent random variables each with the...
  12. GreenGoblin

    MHB Regression least squares (boring question? or more to it)

    yi = a + bxi + ei is the simple linear regression model, as usual. "State the assumptions on the errors ei to justify a least squares fit"? So is this just that E(ei)=0? I can't see what else is a 'must' for this. What about that they are normally distributed? I know the properties of the...
  13. T

    Physics Experiment Linear Regression Issue

    Homework Statement Hello, I have done a laboratory experiment (electron diffraction) and I've been analyzing the data obtained. I have plotted the experimental data, and the slope of the resulting linear regression should give a certain value. What's happening...
  14. D

    MHB Testing a Linear Stepwise Regression Model - Need Advice!

    Hi folks. Just looking for some input please. I have a dataset containing interval data (one dependent and 6 independent variables) and have taken a random 90% sample (approx 300 observations). I've performed a linear stepwise regression on the 90%, in order to obtain a model to predict the...
  15. ElijahRockers

    Least Squares Regression

    Homework Statement http://www.math.tamu.edu/~vargo/courses/251/HW7.pdf Given a set of points (xi,yi) and assuming f(xi) is linear, the deviation measured is F(m,b)=\sum_{i}^{n}(y_i - f(x_i))^2. There are a few different questions about this from the link above. The Attempt at a...
  16. B

    Urgent Response Needed: Can Partial Means Predict a Regression Line?

    Urgent reply needed. Can we use partial means as a predictor for the response while fitting a regression line or curve?
  17. H

    Regression of linear combination better than just regression

    Hi, I have a problem that is giving me a headache. I have measured two angles that I believe to be related to one another, and they are (this is a data set where I have measured the angle from a datum to two features on a bone. There are 14 bones in the data set): Angles to feature 1 (F1)...
  18. J

    Question about propagation error and linear regression?

    I have couple questions about this and I was hoping someone with some stats knowledge could clarify. First, when people report numbers such as 10 plus or minus 5, what does the 5 mean? Is it the standard deviation or the confidence interval or the variance? What is the relationship between...
  19. H

    Regression (I think) of Newton's Law of Cooling

    Homework Statement Using a data logger, I have collected data for two cooling cups: the temperature (c) at 1 second intervals. My task was to model this data using two methods. "METHOD 1: Use EXCEL or the regression analysis capability of your graphic calculator" "METHOD 2: Find the...
  20. D

    Intermediate Physics Lab Analysis, Uncertainty and Linear Regression

    Homework Statement "You are asked to do an experiment where you will need to use a rotating blade to measure the wind speed. You measure the number of rotations of the blade at 10 different wind speeds, 10 times each and will make a linear fit to determine the wind speed as a function of...
  21. A

    Parameter Regression: Modeling Explained

    What is a parameter regression in modeling?
  22. M

    What Is the Help Variable in Swedish Regression Analysis?

    Homework Statement I've two questions not directly relating to a problem but just to statistics in general. i) In Swedish what is called the "help variable" for a regression equation: y=u'*B where u' is the vector of explaining variables (x1,...,xn) can apparently be written in many ways. One...
  23. N

    Verifying Multiple Regression Calculation

    It's been a long time since I've done (least squares?) multiple regression, is the calculation below correct? y data points 6,054 7,200 6,243 5,536 4,879 y hat = 5,984 x data points 414 351 425 372 328 x hat = 378 Sum of xy's residuals = 39,643 Sum of (x-x hat) squared =...
  24. S

    Linear regression and maximum likelihood estimates

    Homework Statement Suppose that data (x1,y1),(x2,y2),...,(xn,yn) is modeled with xi being non random and yi being observed values of random variables Y1,Y2,...Yn which are given by Yi = a + b(xi-xbar) + σεi where a, b, σ are unknown parameters and εi are independent random variables each...
  25. F

    Medical cuffless blood pressure measurement regression equation

    Has anyone tried to apply the following regression equation, which appeared in the paper "Continuous measurement of systolic blood pressure using the PTT and other parameters" (Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference Shanghai, China, September 1-4, 2005)...
  26. T

    Regression - AIC/SBC Comparison

    I'm not sure if this is the right place for this question, but it is about the comparison between different models' AIC/SBC values. I ran a linear regression and got an AIC/SBC of .743/.768. When I ran the same regression in log-linear form I ended up with an AIC/SBC of -7.559/-7.534. My...
  27. B

    Linear Regression: Pros and cons of Normal vs. simplified methods?

    I'm currently looking at a linear regression handout from Uni and there are two methods to calculate the equation. The Normal one is to find a and b for y=a+bx, the equations for a and b are given in the handout but I'll assume you're familiar with them. The simplified one is using y = Bx +...
  28. M

    Question about repeated measures Anova and multiple regression

    This is for my coursework. I have two problems. The first one. Homework Statement I have two time measures: before and after a scheme was introduced. I have to answer the question of whether it made a statistically significant difference or not. And also, as it was introduced in 3 places, I need to...
  29. C

    Question about creating a regression model

    Hey, so I've started doing a plasma physics research project and one of the things that I have to do is design a function which approximates a curve based on data points that it's fed. So far I found the formula for creating a linear regression, but I'm having trouble finding the formulas for...
  30. M

    Appropriateness of Constrained Segmented Univariate Polynomial Regression Model

    Hi all, I've learned that in unconstrained polynomial regression, the optimal order can be determined using two F tests : one to test for the significance of the overall regression, the other to test for the significance of the higher coefficients (assuming the first test passed of course)...
  31. C

    Regression help (basic statistics)

    Homework Statement Hello, I have some data from a statewide standardized exam, and I am trying to do a regression model, but am having a bit of trouble. (I'm a mathematician and not a statistician). Basically, I am trying to show some type of correlation between race and test scores...
  32. W

    Linear Regression of estimated measures / outliers

    Hi all, I would like to understand the theory for determining outliers in the following scenario. Let's say I am to fit a linear model to the data of house size v. sale price for a particular location. And let's say I have a fairly good linear relationship, as house size increases, so does...
  33. D

    Regression Analysis Homework: Best Model & Appropriateness

    Homework Statement based on this data http://www.stat.ufl.edu/~rrandles/sta4210/4210lectures/secondexreview/exam2rev.pdf 1) Consider the full (three predictor) model. Is this model useful? (are any of the predictors worthwhile?) 2) Use the All-Subsets and conduct a search for the best...
  34. D

    Regression Analysis Homework: X1, X3 & Locadv

    Homework Statement based on this data http://www.stat.ufl.edu/~rrandles/sta4210/4210lectures/secondexreview/exam2rev.pdf 1) Consider the full (three predictor) model. Is this model useful? (are any of the predictors worthwhile?) 2) Use the All-Subsets and conduct a search for the...
  35. A

    Becoming Familiar with Regression Notation and Terminology

    Hi there. I am having some trouble understanding the full context of this question. Suppose we have a categorical variable T ∈ {1, ..., n} and we observe k observations for Y when T = n. If a regression model holds: i) Write down Y in terms of dummy variables X1...Xi ii) What is the...
  36. R

    Bivariate regression, dummy variables and r^2

    Hi all. I am trying to find out if it's possible to calculate an r^2 value (% of variation explained) when performing a linear bivariate regression using dummy variables. Let me provide an example of what I'm working on. I'm trying to find if a correlation exists between the type of house...
  37. J

    How to Find the Variance of O hat1 in a Multiple Regression Model?

    Consider the multiple regression model containing three independent variables y = B0 + B1x1 + B 2x2 + B 3x3 + u You are interested in estimating the sum of the parameters on x1 and x2; call this O1 = B1 + B 2 a) Show that O hat1 = B hat 1 + B hat 2 is an unbiased estimator of O1. b)...
  38. J

    Multiple regression analysis, econometrics, and statistics

    I am sooo lost in this class, please help. 1. Let the true (population) model be y = B0+B1x1+B2x2+u where u is an unobserved error term with u (conditional) x1, x2 and N(0, sigma^2). Hence, u is normally distributed with mean 0 and variance sigma^2 (i.e., E[u (conditional) x1, x2] = 0 and V...
  39. C

    Linear regression and bivariate normal, is there a relationship?

    Hi everyone, This is not a homework question. I just want to understand an aspect of linear regression better. The book "Applied Linear Models" by Kutner et al. states that a linear regression model is of the form Y_i = B_0 + B_1 X_i + \epsilon_i where Y_i is the value of the...
  40. P

    Cubic Regression: Exponential Growth & Leveling Off

    Hi, I have the following population figures for a five year interval: 554.8, 609, 657.5, 729.2, 830.7, 927.8, 998.9, 1070, 1155.3, 1220.5 The graph has an exponential growth from the first value to the fourth value and then the population starts to decay. I found that a Cubic...
  41. S

    Analytical linear regression: is it possible?

    I've been told that there exists no perfect mathematical method of obtaining a line of best fit from a population of data. This doesn't make a whole lot of sense to me, so I have made an attempt at doing so (see google docs link)...
  42. A

    Find k in y=kx: linear regression or just average of (y/x)?

    Dear all, let's say I want to know the elasticity constant of a spring (k), so I measure several times different values for the force applied to the spring, F, and the displacement of the spring, x. So, for N measures, I have xi and Fi and their uncertainties. Now, I'm really not an expert of...
  43. C

    Statistics, multiple choice, regression equation - 2nd question

    Homework Statement A regression equation was developed to predict gasoline mileage (mpg) for various car weights (pounds). The resultant equation was: Y = 35.2 - .0034 X. Which two answers can be concluded? A. The...
  44. C

    Statistics help - Scatter plot, regression

    Homework Statement Consider the following data set for ten first grade students; the variables are the number of minutes spent learning a list of spelling words and the number wrong on the spelling test. [15 points total] (supposed to be a...
  45. A

    Multiple Polynomial Regression

    Hi friends, I need to get a surface equation, a function f(x,y), that passes through 66 points. I attached an xls file with the points and a pdf file with a plot of the points. Can somebody tell me what I have to do to solve it?
  46. R

    Financial Model Backtesting and Regression Analysis?

    Hello everyone, I have a homework assignment in my financial mathematics class and I don't fully understand it, so here is my problem: I am supposed to backtest a given data set to see if a financial model works, in particular, for 30 maturity dates of the treasuries (bonds) I had to see how...
  47. J

    Regression of x on y is 4x-3y=8

    Homework Statement Given that the regression of y on x is 2y-3x=10 and the regression of x on y is 4x-3y=8, find the correlation r between x and y; also find the mean of x and the mean of y. Homework Equations The Attempt at a Solution I am absolutely clueless on this problem, there is...
  48. Femme_physics

    Trying to translate a mathematical term from Hebrew to English ( regression rule )

    If I translate it word by word from Hebrew, it's the "regression rule". For example, I am told that "a series is defined for every natural n by the regression rule"...
  49. H

    Multiple Linear Regression Analysis

    Hi, I've asked this question on another forum, but have had no response so far. Maybe I will have a little bit of luck here. So, I have a problem. I have a set of 8 parameters and I use these parameters to compute a measure (I vary each parameter with a step of 50%). I would like to know...
  50. F

    Linear regression, error in both variables

    Hi y'all, wondering if you could help me with this. I have a data set with a linear relationship between the independent and dependent variables. Both the dependent and independent variables have error due to measurement and this error is not constant. For example, {x1, x2, x3, x4, x5}...
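
The ridge regression threads at the top of the list (items 1 and 2) penalize every coefficient except the intercept. As a rough sketch of that idea, under the simplifying assumptions of a single predictor and an identity weighting matrix (neither of which comes from the threads), the penalized slope reduces to S_xy / (S_xx + λ). The function name `ridge_line` and the sample data are hypothetical.

```python
def ridge_line(xs, ys, lam):
    """Fit y = intercept + slope * x, penalizing only the slope by lam * slope**2.

    Simplified one-predictor case with unit weights; lam = 0 gives ordinary
    least squares.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    if lam < 0:
        raise ValueError("lam must be non-negative")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx + lam == 0:
        raise ValueError("degenerate problem: identical x values and lam = 0")
    # The penalty adds lam to the denominator, shrinking the slope toward zero.
    slope = sxy / (sxx + lam)
    # The intercept is unpenalized, so the line still passes through the means.
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical sample data, chosen only for illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
ols_intercept, ols_slope = ridge_line(xs, ys, 0.0)
ridge_intercept, ridge_slope = ridge_line(xs, ys, 10.0)
```

Increasing `lam` shrinks the slope toward zero while the fitted line keeps passing through the point of means, which is why the intercept term is left out of the penalty.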