What is Regression: Definition and 359 Discussions

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome variable') and one or more independent variables (often called 'predictors', 'covariates', or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion. For example, the method of ordinary least squares computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane). For specific mathematical reasons (see linear regression), this allows the researcher to estimate the conditional expectation (or population average value) of the dependent variable when the independent variables take on a given set of values. Less common forms of regression use slightly different procedures to estimate alternative location parameters (e.g., quantile regression or Necessary Condition Analysis) or estimate the conditional expectation across a broader collection of non-linear models (e.g., nonparametric regression).
Regression analysis is primarily used for two conceptually distinct purposes. First, regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Second, in some situations regression analysis can be used to infer causal relationships between the independent and dependent variables. Importantly, regressions by themselves only reveal relationships between a dependent variable and a collection of independent variables in a fixed dataset. To use regressions for prediction or to infer causal relationships, respectively, a researcher must carefully justify why existing relationships have predictive power for a new context or why a relationship between two variables has a causal interpretation. The latter is especially important when researchers hope to estimate causal relationships using observational data.
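
The ordinary least squares criterion described above can be sketched in a few lines of plain Python. This is a minimal illustration for a single predictor, not a reference implementation: the function names `ols_line` and `rss` and the sample data are assumptions made up for this example.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        # Every x is identical, so no unique line minimizes the criterion.
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rss(xs, ys, intercept, slope):
    """Residual sum of squares: the quantity that least squares minimizes."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = ols_line(xs, ys)
print("intercept:", a, "slope:", b)
print("RSS at the fit:", rss(xs, ys, a, b))
print("RSS with a shifted intercept:", rss(xs, ys, a + 0.1, b))
```

Moving the intercept or slope away from the fitted values can only increase the residual sum of squares, which is what "most closely fits the data" means under this criterion. The explicit check for identical x values reflects the fact that the line is unique only when the predictor actually varies.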

  1. T

    Multiple Regression in Excel for Mac

    Hi, I was wondering if anyone familiar with Excel on a Mac can help me? Am I able to do polynomial regression with different x's that contain different numbers of elements? I have three columns: one contains 252 elements, another contains 53, and the last contains 12. They are all closing...
  2. H

    Exponential regression of data close to one

    Hi, I've been working on trying to model sales for my work and I'm really just modeling it using exponential regression, so I get the linear regression of the logarithm of the data and obtain the desired formula. What I'm confused about is that if I integrate this to try to predict how many...
  3. C

    Linear Regression Error on Excel

    This is for an experimental physics homework. I am using the latest version of MS Excel. I have a set of data; I perform linear regression on it and it gives me a line y = ax + b. Given that both a and b have physical significance, I would like to know how I could find the uncertainty...
  4. R

    Comparing curves using gaussian process regression

    Hi guys, I have run multiple simulations on networks that are all slightly perturbed from each other. They produce slightly different curve outputs onto an x-y graph which I need to now analyse (it has been about 5 years since I did statistical analysis hence why I am here). A couple of the...
  5. ╔(σ_σ)╝

    Using correlation coefficients as x in a regression?

    I was reading an article in the Wall Street Journal and the author was using a rolling correlation coefficient, on a set of variables, as his predictor variable in a linear regression. Basically it was a univariate linear regression, y=...
  6. S

    Regression Analysis on Theoretical Model

    Hi everyone. I'm a graduate student and am struggling with something that may possibly be trivial. So, my research is creating a mathematical model to represent a real system. I have data points from my real system that I want to compare my model to. How do I do a regression analysis and get...
  7. V

    Uncertainty in a non-linear regression with least squares method

    Homework Statement: Ok, so I'm trying to fit a set of data (21000 points to be exact) to a sine function. Homework Equations: Y = A*sin(ωt). The Attempt at a Solution: I used NumPy to get the parameters A and ω with the least squares method. So far, so good. However, I appear to...
  8. R

    Finding the Uncertainty of the Slope Parameter of a Linear Regression

    Suppose I have measurements x_i \pm \sigma_{xi} and y_i \pm \sigma_{yi}, where \sigma is the uncertainty in the measurement. If I use a linear regression to estimate the value of b in y=a+bx, I'm struggling to find a...
  9. F

    Regression Analysis - Constructing a Model

    Hello, I am trying to construct a general function/method based on two sets of minimum/maximum data point constraints, which can take on new values in different situations. The only known data for this general function is the starting point (y-axis intercept) and the x-range. The rate of...
  10. X

    When y is negative in linear regression?

    I am using linear regression to predict 'y' based on 8 variables. In my example, most of the betas that I got are negative, so y, the value to predict, is negative. In my data, y is a time in seconds, so I think it shouldn't be negative. My example is in Python, and I want to know if y...
  11. P

    Testing regression model with F-test

    Hey guys. I have some trouble understanding how the F-test is used for testing the viability of a regression model. Before I delve into the background/question, just wanted to post a link that discusses the topic briefly: http://www.stat.yale.edu/Courses/1997-98/101/anovareg.htm So, coming...
  12. P

    Logarithmic Regression By Hand

    I'm trying to write some code to do a regression on data weight (x) and time (y). As best as I can tell, the model should be y = b1 + b2ln(x), but I don't know how you can do this by hand (I know how to in R...). I also know how to do a simple linear regression by hand. Can it be done using...
  13. B

    Linear vs. Cubic Regression: Training and Test RSS Comparison

    Homework Statement: I collect a set of data (n = 100 observations) containing a single predictor and a quantitative response. I then fit a linear regression model to the data, as well as a separate cubic regression. 1) Suppose that the true relationship between X and Y is linear. Consider...
  14. P

    Multiple Linear Regression - Hypothesis Testing

    Homework Statement: I'm looking through some example problems that my professor posted and this bit doesn't make sense. How do you come up with the values underlined? The Attempt at a Solution: Upon researching it, I find that you should use α/2 for both...
  15. M

    Estimating measurement error using error from linear regression

    Sorry if I'm in the wrong subforum. This is a rather simple and straightforward question, I hope. I'm doing a measurement that requires me to do a linear regression on data points to get a value of the slope. The slope is the value of the actual property that I am measuring. Assuming...
  16. X

    Linear regression with the same X value

    In a linear regression with one independent variable, if X is always the same (let's say I am unlucky) but Y takes different values for the same X, can I still find the coefficients of the straight-line equation?
  17. X

    Find the error in a linear regression

    Hi, I am trying to understand how I find the error in linear regression, and what to do with it. I am using linear regression to predict the time of execution based on the size of the input and the number of tasks used in the computer to get the result. 1 - In a linear regression, I calculate...
  18. X

    Linear regression vs r-squared

    I am trying to understand how linear regression and R-squared differ. 1 - Can anyone give me an example of the use of linear regression and R-squared? 2 - Is there some relation between them? E.g., are they useful for each other? 3 - What are the dangers when analysing the linear regression...
  19. J

    Understanding R Squared Regression: A Guide for Josh

    http://imgur.com/LAOgGyY I know the components of R^2, as in ESS, TSS, RSS. I know cov(x,y) = [\sum(y_i - \bar{y})(x_i - \bar{x})]/(n-1). But that's as far as I can go; I haven't come across lowercase r or the sample correlation yet, and proving how they all fit is a bit beyond...
  20. C

    Ideal Gas Heat Capacity Regression Coefficients

    I'm trying to find a comprehensive list of the empirical coefficients to be used in the following equation for calculating ideal gas constant pressure heat capacities: \frac{c^{IG}_P}{R}=A+BT+CT^{2}+DT^{-2}+ET^{3} (Eqn. 1), where c^{IG}_P is the ideal gas constant pressure specific heat capacity; R is...
  21. M

    Regression with uncertain data

    So I have this set of statistical data, which is not completely relevant to what I want to model, and I would like to compensate for that somehow since I do not have the more precise data. I have about 500 observations of average wages in certain areas which are modeled as dependent on several...
  22. J

    Least squares estimator and distribution (simple linear regression)

    Homework Statement: Under the simple linear regression model Y = A + Bx + e, where A is the intercept (a known concept), B is the slope parameter (unknown) and e is a random error term satisfying the normality assumption, if (X1,Y1)...(Xn,Yn) are the n data points observed, find the least squares...
  23. D

    Linear regression. How to calculate this problem

    Homework Statement: Why does Excel give me this: http://postimg.org/image/68b9z1lqt/ while various online calculators (for example http://www.alcula.com/calculators/statistics/linear-regression/) and my own calculations give me this: http://postimg.org/image/kpljm4awx/ Homework...
  24. Z

    Coding Variables for linear regression

    Homework Statement: I have to design an experiment with 3 factors. One factor has to be quantitative with at least 3 levels, one qualitative with at least 3 levels, and the last one can be either quantitative or qualitative with at least 2 levels. My question is in regards to coding the variables. For example...
  25. Z

    Multiple Linear Regression Experiment

    Homework Statement: Hi, I need to create an experiment for my regression class and I would appreciate some ideas which would allow me to generate the data with minimal resources (preferably something on the computer where I can get data instantly). The main criteria for the experiment...
  26. Z

    Multiple least squares regression

    Homework Statement: Design a regression model that will use the dataset
    y, trial, x1, x2, x3
    0.08536, 1, -1, -1, -1.00000
    0.09026, 2, -1, -1, -1.00000
    0.10188, 1, -1, -1, -0.33333
    0.09301, 2, -1, -1, -0.33333
    0.10362, 1, -1, -1, 0.33333
    0.09920, 2, -1, -1...
  27. K

    Simple Pendulum - Help with Quadratic Regression on Excel?

    1. I am trying to determine the value of g based on a simple pendulum. I have graphed a scatter plot with the x-axis as L and the y-axis as P. I have 3 different series for different lengths of pendulum. When I try to find a line of best fit I get something like cx^2 + dx + e and this trend...
  28. R

    MHB Demand and Regression Analysis

    A bit confused with this question. My answers are below each question. Please help. Branded Products, Inc., based in Halfway Tree, is a leading producer and marketer of household laundry detergent and bleach products. About a year ago, Branded Products rolled out its new Super Detergent in four...
  29. Sudharaka

    MHB Jordan's Question from Facebook (About Regression)

    Jordan from Facebook writes: Help please,
  30. M

    Regression Analysis for Manuscript Correction Costs | Homework Help Needed

    Homework Statement: Okay, so the problem I have goes like this: Shown below are the number of galleys for a manuscript (X) and the total dollar cost of correcting typographical errors (Y) in a random sample of recent orders handled by a firm specializing in technical manuscripts. Since Y...
  31. M

    Regression Model Estimator

    Homework Statement: Assume the regression model y_i = \alpha + \beta x_i + \epsilon_i with E[\epsilon_i] = 0, E[\epsilon_i^2] = \sigma^2, E[\epsilon_i \epsilon_j] = 0 where i \ne j. Suppose that we are given data in deviations from sample means. If we regress (y_i-\bar{y}) on (x_i-\bar{x}) without a...
  32. D

    Sigma Plot, Non-linear regression, fitting a line to a set of points

    I model arterial baroreflex data that I have collected in humans using the Kent equation which is: y=p1/(1+exp((x-p3)*p2))+p4; where Y=heart rate, X= estimated carotid sinus pressure, p1=range of Y, p2=slope coeff, p3=centerpoint on X, p4 = minimum Y. I use Sigma Plot to do a best fit line...
  33. N

    Linear Regression in Polar Space

    I have posted this question before but I don't think I was clear on what I was trying to do exactly. I am trying to simulate a set of muon-detecting drift tubes in 2D space. I have 2 sets of detector tubes (shown as black circles in the image); a particle trajectory goes through all tubes...
  34. E

    Regression Analysis - Needed for research project

    Hello, I'm a second year mathematics and economics student, and I've been hired by an economic development organisation to conduct a research project on the probability of loan default in micro-credit borrowers in rural Kenya (I'll be heading there in person this summer). Basically, I'll...
  35. J

    How do I properly set up the matrix for polynomial regression?

    I'm trying to understand the derivation of polynomial regression. Given data points: [(-1,-1),(2,-1),(6,-2)]. So a 2nd degree curve will be a concave downward parabola. My calculator produces the equation: -0.0357x^2+0.0357x-0.9285, which fits the data well. But if I try to do it manually...
  36. N

    Linear regression to radii of multiple circles

    Hi, I am trying to simulate muon paths through drift tubes and I have run into a problem performing a linear regression. I have generated simulated muon trajectories in 2 dimensions and they pass through my simulated drift tubes, represented as black circles with a '+' in the center. As the...
  37. O

    Linear Regression: LineFit Method Explained

    I wasn't sure where to put this question. Can anyone tell me what method LineFit uses to perform linear regression with error in both coordinates? Thank you.
  38. S

    Question about multiple regression analysis

    Homework Statement: My question is q.3 in the attachment. I don't really understand the scenario of the question. The Attempt at a Solution: For (a), if X = 1, will the model become: y = (b_1)(E_1) + epsilon? So (b_i)'s are the slopes of the models? But what is the assumption of the...
  39. iVenky

    Are Equations for Linear Regression Right?

    I read about "Linear regression" and I want to make sure that what I read is right. Just tell me if these equations are right. The slope of the line of regression for y on x is given by m=\frac{E(XY)-E(X)E(Y)}{E(X^{2})-[E(X)]^{2}}, m=\frac{Cov(X,Y)}{Var(X)}, m=\frac{\rho\sigma_{x}\sigma_{y}}{\sigma_{x}^{2}}...
  40. W

    Realistic of regression model

    There are two regression models from EViews output. Which one is more realistic? Model 1: wagehat = 116.9916 + 8.303*IQ. Model 2: logwagehat = 5.88 + 0.0088*IQ. Both of them use the same sample. Do I need to know further statistical information to judge which one is more realistic? Such as...
  41. W

    Which Regression Model is More Realistic for Predicting Wages Based on IQ?

    There are two regression models from EViews output. Which one is more realistic? Model 1: wagehat = 116.9916 + 8.303*IQ. Model 2: logwagehat = 5.88 + 0.0088*IQ. Both of them use the same sample. Do I need to know further statistical information to judge which one is more realistic? Such as...
  42. C

    What's the difference between trends and regression lines?

    What's the difference between trends and regression lines (linear and non-linear) which minimize the residual sum of squares?
  43. S

    Is the estimator for regression through the origin consistent?

    Homework Statement: Any help on this would be immensely appreciated! I am having trouble interpreting what my instructor is trying to say. Consider a simple linear regression model: y_i = \beta_0 + \beta_1x_i + u. (a) In regression through the origin, the intercept is assumed to be equal to...
  44. L

    Statistics, regression model, unbiased estimator help

    Homework Statement: Professor E. Z. Stuff has decided that the least squares estimator is too much trouble. Noting that two points determine a line, Dr. Stuff chooses two points from a sample of size N and draws a line between them, calling the slope of this line the E.Z. estimator of beta1 in...
  45. S

    Quadratic Regression Word Problems

    Homework Statement: A rectangular parking lot is to be placed with one side along a building where no fence is required. If 1000 yards of wire are available for fencing the other 3 sides of the lot, what is the maximum area the lot may have, and what are the dimensions of the lot of a maximum...
  46. P

    Quadratic Regression calculation

    Hi, I'm learning statistics. Do you guys know how to calculate quadratic regression by hand? That is: given a data set (x,y), find a parabola f(x)=ax^2+bx+c that minimizes the total squared error. I already know how to calculate linear regression. Thanks in advance.
  47. B

    Effect size in multiple regression

    Hello, I'm using this calculator to determine a rough sample size for a multiple regression (20 predictors). http://www.stattools.net/SSizmreg_Pgm.php I don't really understand the effect size. Could somebody tell me if you are using multiple regression (with 20 predictors)...
  48. J

    Regression Analysis: Finding Optimal Parameters for Non-Linear Functions

    Sorry if this is somewhat elementary, but the regression form of the sine function with data provided is y = a*sin(b(x-c)) + d. As far as I know, all of the variables except c can be determined mathematically. My question is this: using calculus or any other method, is there a way to determine c...
  49. B

    Multiple regression, why use categories

    Hello, I have a question regarding multiple regression. I am reading a paper in which the author performed a multiple regression to predict the energy consumption of an electric car based on 27 variables measured during journeys, such as speed and acceleration etc. The author...
  50. B

    Interpreting regression output

    Hi there, I have performed a linear regression of the energy consumption of an electric car on 7 measured parameters. They are 1. Max acceleration 2. Average Acceleration 3. Average Velocity 4. Distance travelled 5. Standard Deviation Acceleration 6. Standard Deviation Velocity...
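
Several of the threads listed above ask how to get the uncertainty of a fitted slope and how R-squared relates to the fit. The sketch below uses the standard textbook formulas for simple linear regression, under assumptions that may not hold for any particular thread: errors only in y, independent and with constant variance. Measurement errors in x, or per-point error bars, need a different treatment. The function name `slope_uncertainty` and the data are hypothetical.

```python
import math


def slope_uncertainty(xs, ys):
    """Fit y = a + b*x by least squares; return estimates, standard errors, R^2.

    Assumes errors only in y, independent, with constant variance.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each take at least two distinct values")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and the residual variance estimate (n - 2 d.o.f.).
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)
    return {
        "intercept": intercept,
        "slope": slope,
        "se_slope": math.sqrt(s2 / sxx),
        "se_intercept": math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx)),
        "r_squared": 1.0 - rss / syy,
    }


# Hypothetical data, chosen only for illustration.
result = slope_uncertainty([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1])
for name, value in result.items():
    print(name, value)
```

R-squared here is the fraction of the variance in y that the fitted line accounts for, while the standard errors describe how precisely the slope and intercept are pinned down. A fit can have a high R-squared and still leave the slope poorly determined when there are few points.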