What is Regression: Definition and 359 Discussions

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome variable') and one or more independent variables (often called 'predictors', 'covariates', or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion. For example, the method of ordinary least squares computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane). For specific mathematical reasons (see linear regression), this allows the researcher to estimate the conditional expectation (or population average value) of the dependent variable when the independent variables take on a given set of values. Less common forms of regression use slightly different procedures to estimate alternative location parameters (e.g., quantile regression or Necessary Condition Analysis) or estimate the conditional expectation across a broader collection of non-linear models (e.g., nonparametric regression).
Regression analysis is primarily used for two conceptually distinct purposes. First, regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Second, in some situations regression analysis can be used to infer causal relationships between the independent and dependent variables. Importantly, regressions by themselves only reveal relationships between a dependent variable and a collection of independent variables in a fixed dataset. To use regressions for prediction or to infer causal relationships, respectively, a researcher must carefully justify why existing relationships have predictive power for a new context or why a relationship between two variables has a causal interpretation. The latter is especially important when researchers hope to estimate causal relationships using observational data.
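
To make the ordinary least squares criterion from the definition above concrete, here is a minimal sketch in plain Python for the one-predictor case. The function names `fit_line` and `sum_squared_residuals` and the sample data are illustrative assumptions, not taken from any thread listed below; the closed-form slope and intercept are the standard textbook formulas for simple linear regression.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line.

    Hypothetical helper for illustration: uses the closed-form solution
    slope = S_xy / S_xx, intercept = y_bar - slope * x_bar.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared vertical distances between the data and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data lying exactly on y = 1 + 2x, so the fit should recover it.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
b0, b1 = fit_line(xs, ys)
print(b0, b1)  # 1.0 2.0
```

For noisy data, any other choice of intercept or slope yields a sum of squared residuals at least as large as the one returned for the fitted line; that is the sense in which the line "most closely fits the data."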

View More On Wikipedia.org
  1. K

    Examples of Multiple Linear Regression Models

    1) "Simple linear regression model: Y = β0 + β1X + ε, with E(Y) = β0 + β1X. A linear model means that it is linear in the β's, and not necessarily a linear function of X. The independent variable X could be W^2 or ln(W), and so on, for some other independent variable W." I have some trouble...
  2. K

    What is the difference between random error and residual?

    1) "Simple linear regression model: Yi = β0 + β1Xi + εi, i = 1, ..., n, where n is the number of data points and εi is the random error. We want to estimate β0 and β1 based on our observed data. The estimates of β0 and β1 are denoted by b0 and b1, respectively." I don't understand the difference...
  3. K

    What is the difference between E(Y|X) and E(Y|X=x) in linear regression models?

    1) "In regression models, there are two types of variables: X = independent variable, Y = dependent variable. Y is modeled as random. X is sometimes modeled as random and sometimes it has a fixed value for each observation." I don't understand the meaning of the last line. When is X random...
  4. M

    Assumptions behind the OLS regression model?

    Hi, In many statistics textbooks I read the following text: “A model based on the ordinary linear regression equation models Y, the dependent variable, as a normal random variable, whose mean is a linear function of the predictors, b0 + b1*X1 + ... , and whose variance is constant. While...
  5. S

    Multiple regression and Time Series

    Hi, I'm in a college statistics course where I'm doing an assignment with Minitab. I have a one-month time series of electricity use (hourly intervals). I've attempted to remove the seasonal effect from weekdays by index multiplication so all I'm left with (hopefully) is the effect from...
  6. B

    [Data regression] Levenberg-Marquardt BUT force to intersect 2 KNOWN points

    Hi, I have a large data set (2D coordinates with errors) and I am using the Levenberg-Marquardt method to estimate the best polynomial function. That part is working fine. Now in my data set are exactly two KNOWN data points that are 100% correct. Therefore I want my function to go...
  7. S

    Least Squares Regression Analysis - No Idea

    Hello, I am a first year undergraduate university student majoring in Engineering and Computing Sc. One of my courses is Linear Algebra. We have been given an assignment in which question no. 2 is out of syllabus. It is on Least Squares Regression Analysis. This has not been taught to us. We...
  8. A

    Determining the best fit regression for a set of data

    Is there a test one can perform to quickly determine what type of regression (linear vs. non-linear) will best fit the relationship between two variables? i.e. How can one quickly determine the most probable relationship between two...
  9. O

    How Do You Optimize a and b in Linear Regression for Minimal Deviation?

    Homework Statement: I've been given a set of data: x = 0, 0.5, 0.7, 1.5, 1.75; y = 0.5, 0.72, 0.51, 1.5, 1.63. Given the linear model y = ax + b for these data points, I have to 1. minimize the sum of the absolute values of deviations between the experimental value of Y and the value predicted by the...
  10. C

    Finding equation for regression curve

    Homework Statement: I want to know how to find the equation for the interpolated function of any degree. On Mathematica, for example. The Attempt at a Solution: Unfortunately, on Excel, the plots are deceiving because the points on the plot are actually out of range. For example, this...
  11. F

    Regression analysis - case of multicollinearity

    What are some of the elementary remedial procedures for multicollinearity (VIF >= 10) in linear regression? We were told to simply drop that particular independent variable, but someone else suggested we could center the predictor variables (i.e., xi = Xi - Xbar). Can somebody explain why...
  12. E

    Two data sets and want to do a regression excel y = C(x^n)

    I have two data sets and want to do a regression so that the equation that relates them is of the form y = C(x^n), where C and n are constants. How do I do this in Excel?
  13. M

    Regression Project: Analyzing Real Life Data with Bell Curve Challenges

    I have a regression project where I have to pick data from a real-life situation and answer questions. Here is my data: 18 data points were entered: 4.00 4.20 4.50 4.60 4.60 4.70 4.90 5.10 5.40 5.50 5.60 5.60 5.80 6.00 6.10 6.80 6.90 7.50 Mean = 5.43 95% confidence interval for actual...
  14. R

    Deriving Regression Coefficients

    Hi Sorry this may be an obvious one...can anyone help me with getting from the first to the second equation below? I'm particularly stuck with manipulating the terms inside the summations formulas. I can derive to here: \sum_{i=1}^{N}x_iy_i - N\bar{x}\bar{y} - \left(...
  15. S

    How to use Regression In real life

    Hi, I made an inquiry about regression last week and am still working on understanding the whole concept. This week my question is different. How would I use the a and b values of the regression to indicate how something is growing? Like I'm having problems interpreting data because I do not...
  16. T

    Linear Regression: Expected Value & Variance of Predicted Values

    Homework Statement: Consider the linear regression model Y_i = \beta_0 + x_i \beta_1 + \epsilon_i, i = 1, ..., 5, where \epsilon_i \sim \mathcal{N}(0, \sigma^2) are independent. Find the expected value and variance of the predicted values \widehat{Y}_i considering that observations are...
  17. S

    Actually, is there a relationship in regression when changing values?

    Hello, I've just started learning about regression. While working on some problems I noticed that one problem set had exactly the x,y values swapped. And as you know when you switch the points that's normally an inverse, but the regression doesn't show that pattern as the regression values are...
  18. D

    Regression Analysis: Most Sophisticated Methods & Least Squares

    What are the most sophisticated methods of performing regression analysis and how does least squares rank among them? Additionally which category would the least squares method fit into below (if any): Simple, Multiple, Non-linear, Robust, Ridge, Logistic Thanks, -Diffy
  19. M

    Finding Intercept in R with lm() for Specified Slope

    I'm interested in fitting a line to some data. There is a built-in function in R lm() that gives me both the best-fit slope and intercept, however, I would like to determine the best fit intercept GIVEN a specified value of the slope. Is there an easy way to do this? I apologize if this is in...
  20. D

    Least-squares estimation of linear regression coefficients

    AFAIK, there are two basic types of linear regression: y = ax + b and y = ax^2 + bx + c. But I have to do the same with the function y = a sin(x) + b cos(x). Here is what I have done: We have: \begin{array}{l} \frac{\partial L}{\partial a} = 0 \\ \frac{\partial L}{\partial b} = 0 ...
  21. K

    Nonlinear Regression: Getting Started with X to Predict Y

    Ok, so I am trying to find an equation to match a 2D data-set (x,y) positions. I have X and I want to use an equation to predict Y to a rather accurate degree. As far as I can understand, I need to use some form of regression (non-linear, since the data is awfully curved). Now, I have no...
  22. J

    Q on Satellite Orbits + Regression of Nodes

    Currently I am studying the mechanics of satellite orbits, but I cannot seem to understand why, in the regression of nodes, the line of nodes rotates in a direction contrary to that of the actual satellite orbit (irrespective of whether it is a prograde or retrograde orbit). I've tried...
  23. C

    Exponential regression math 30 pure

    Homework Statement: Determine the inverse of the exponential regression equation that you found in the first bullet. Homework Equations: y = ab^x. The Attempt at a Solution: In the first bullet I got the equation y = (8.7166)(0.93240)^x, then I found using logs that x = -5...
  24. A

    Help Me Find Df & k Values Using Non-Linear Regression

    Dear everybody, I have an equation with data and want to do a non-linear regression with a plot. Would you help me solve the equation below with any software, please? The data, the equation and the parameters are given below. I want to fit the data to the non-linear equation to find Df and k...
  25. A

    Optimizing Regression Degree with Weighted Cost Function

    Hello, all. I know what I want, but I just don't know what it's called. This has to do with regression (polynomial fits). Given a set of N (x,y) points, we can compute a regression of degree K. For example, we could have a hundred (x,y) points and compute a linear regression (degree 1). Of...
  26. L

    Why Do We Square Errors in Least Squares Regression?

    You must have used it a couple of times while solving an engineering problem. For example, in line fitting, why do we have to square? Can't we just pass the line through the maximum number of points? Can someone explain? Thanks in advance.
  27. S

    Quadratic Regression on HP 50g

    Just went against the tide of the school and bought an HP 50g calculator (love RPN!); however, to my dismay, the calculator does not appear to do quadratic regression. Now my teacher is pressuring me into using a TI and I can't stand them now that I've gotten used to RPN. Does anyone know how to...
  28. D

    Multivariate Linear Regression With Coefficient Constraint

    [SOLVED] I'm attempting a multivariate linear regression (mvlr) by method of least squares. Basically, I'm solving a matrix of the following form for \beta_p, $ \begin{bmatrix} \sum y \\ \sum x_1 y \\ \sum x_2 y \\ \sum x_3 y...
  29. C

    Regression Analysis for a Gamma function

    [SOLVED] My regression analysis program, which I developed in BASIC back in the 1980s, handles half a dozen linear equations, some of which are transformed into log forms. I would like to modify my program to include this Gamma function...
  30. X

    Excluding data for a linear regression in OpenOffice

    For my chemistry lab, in order to compute the change in temperature for a calorimetry experiment, we're advised to take the line of best fit from the peak temperature onwards (excluding the initial data) and extrapolate to y = 0. For example, here's some of the data I gathered: A, Trial 2...
  31. G

    Normal assumption with least squares regression

    My Google search just turns up results telling me that one of the assumptions I have to make is that each Y is normal. My question is why do I have to assume it's normal? Why does it follow that it has to be normal as opposed to some other distribution? Hope that makes sense. Edit: I thought...
  32. S

    Estimating error in slope of a regression line

    OK, I have a question I have no idea how to answer (and all my awful undergrad stats books are useless on the matter). Say I make a number of pairs of measurements (x,y). I plot the data, and it looks strongly positively correlated. I do a linear regression and get an equation for a line of best...
  33. J

    Free fall, linear regression

    1) To test the quality of a tennis ball you drop it onto the floor from a height of 4 m. It rebounds to a height of 2 m. If the ball is in contact with the floor for 12 ms, what is the magnitude of its average acceleration during contact, and is the average acceleration up or down? What I did...
  34. K

    Sports Illustrated Jinx: Regression to the Mean

    GENERAL BACKGROUND: A few weeks ago, my uncles and others were discussing the so-called "Sports Illustrated Jinx", "Sophomore Jinx", and "Heisman Jinx". Statisticians have said that the Sports Illustrated Jinx, in particular, is not a jinx at...
  35. A

    Cox Regression Help: Finding Data for Analysis

    First of all, I'm not sure if this is the best place to ask this question, so if it isn't... sorry. I'm doing a piece of coursework on Cox regression, but I'm having trouble finding any data to use. Most of the data I've seen just gives percentages of people in certain groups, which isn't...
  36. S

    Can a Function Accurately Model a Random Array of Points on the x-y Plane?

    Hi, I want to model a set of a few dozen points on the x-y plane where y can be anywhere from 0 to 100 and x increases by 1 for each point, e.g., (1, 26), (2, 84), (3, 2), etc. Is it possible to accurately model such a random array of points with an equation? Someone...
  37. D

    Linear regression, including Uncertainties

    My problem in short: I have a set of data, and I want to calculate the linear regression, and the uncertainty of the slope of the linear regression line, based on the uncertainties of the variables. My problem in detail: My data is from an experiment and the uncertainties (errors) are...
  38. A

    Simple Linear Regression Problem

    Hi, I've got what should be a very easy simple linear regression problem, but I can't seem to be able to get my head around it. Here it is: So far I've been trying to sub these values into a regression equation like this one: Y = 5B + (-0.003)B^2 Where "B" is my Beta1 value. I...
  39. J

    Looking for Book on Regression

    Hi, I hope this is the correct section. I am looking for a good starting book about Parametric Regression and Regression Diagnostics. I am currently slogging through "Semi-Parametric regression" by Ruppert et al. (2003) and I'd like a slightly easier book that goes slightly more slowly and...
  40. J

    Non-linear regression of Curie's Law

    Hi, I've collected some data on the relative permeability of ferrite at various temperatures, subject to a constant external magnetic field and I'd like to fit a curve to the data. I believe that stat-mech theory predicts that \mu_r = 1 + a\tanh(b/T) where T is thermodynamic temperature...
  41. O

    Has Information Technology Really Regressed Over the Years?

    Information Technology is actually a history of regression. I started programming in Basic on Commodores and PCs in the early 80s. It was fast to learn and fast to put ideas into practice. In fact I would argue that 99 % of all IT programming problems were already well solved just using some...
  42. D

    How to test forecasting accuracy of regression model?

    Hey, I have just finished running all the regressions for my thesis and I now have a nice 8-variable regression with an r² of 0.53. Almost all my hypothesized variables are significant. I am now wondering if there is a possibility to somehow test the forecasting accuracy of this model? I am using...
  43. F

    What is Causing the Exponential Curve in My Velocity Squared vs. Radius Graph?

    Hey-- I'm writing up a physics lab report on centripetal force; at the moment I've hit a problem with the velocity squared vs. radius graph. The graph *should* show a root curve (v^2 = Fr/m) but all of the regression utilities I've used churn out an exponential curve. Here are the four points...
  44. P

    Exponential regression help and more

    Hi, I'm doing some homework and I have the main question answered, but this added one is hurting me. Let me give you the entirety of it... - For each company ABC and XYZ write an exponential regression equation in the form of a X b^x (that's A times B to the x), where x = number of years and y =...
  45. S

    Linear regression where am i going wrong?

    Linear regression using the least-squares fit method for the determination of a cocaine sample. Cocaine (mg/ml): X = 2.75; Peak height: Y = 27377; X squared = 0.9625; X x Y = 3241.272. M = 10 x 3241.272 – 2.75 x 27377 / 10 x 0.9625 – 7.5625 =...
  46. matthyaouw

    Linear regression and variance.

    I'm having some trouble with this, and I was hoping someone could help me. I have a data set from which I've determined the \widehat{a} and \widehat{b} values and determined where the line of best fit should go using linear regression. The next thing I have to do is work out the variance using...
  47. H

    Linear Regression, Linear Least Squares, Least Squares, Non-linear Least Squares

    It seems to me that Linear Regression and Linear Least Squares are often used interchangeably, but I believe there to be subtle differences between the two. From what I can tell (for simplicity let's assume the uncertainty is in y only), Linear Regression refers to the general case of fitting...
  48. L

    Principal Component Analysis vs Factor Analysis vs regression

    I just gave a statistical Excel add-in a try and found PCA and FA quite interesting. However, I don't see where the differences between these two analyses are, except for the layout of the results. Additionally, I see the link with multiple regression, but I don't see the link precisely...
  49. O

    But how accurate is my regression function?

    Hey. I tried to make a function that could calculate the function of an "average" line. See my picture: http://home1.stofanet.dk/orhan/regr.jpg The language is Danish, but if you know about regression you will understand the variables. What have I done wrong? or.han
  50. A

    Least squares regression problem

    Hi, I am having some difficulty with this problem: what would \hat{Y} be if s_{y/x} = 439, n = 24 and the 95% confidence interval estimate for the average Y given a particular value of X is 1125 to 1695? I know \hat{Y} = b_0 + b_1 x but I am not sure how I can use the...