What is Regression analysis: Definition and 39 Discussions

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome variable') and one or more independent variables (often called 'predictors', 'covariates', or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion. For example, the method of ordinary least squares computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane). For specific mathematical reasons (see linear regression), this allows the researcher to estimate the conditional expectation (or population average value) of the dependent variable when the independent variables take on a given set of values. Less common forms of regression use slightly different procedures to estimate alternative location parameters (e.g., quantile regression or Necessary Condition Analysis) or estimate the conditional expectation across a broader collection of non-linear models (e.g., nonparametric regression).
Regression analysis is primarily used for two conceptually distinct purposes. First, regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Second, in some situations regression analysis can be used to infer causal relationships between the independent and dependent variables. Importantly, regressions by themselves only reveal relationships between a dependent variable and a collection of independent variables in a fixed dataset. To use regressions for prediction or to infer causal relationships, respectively, a researcher must carefully justify why existing relationships have predictive power for a new context or why a relationship between two variables has a causal interpretation. The latter is especially important when researchers hope to estimate causal relationships using observational data.
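
As a rough illustration of the ordinary least squares idea described above, here is a minimal sketch in Python with NumPy. The synthetic data, the "true" coefficients (2 and 3), and the helper name `predict` are assumptions made up for this example; they do not come from any of the discussions listed below.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2 + 3x + noise
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, size=200)

# Design matrix: a column of ones for the intercept, then the predictor
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: choose beta to minimize the sum of squared
# differences between y and the fitted line X @ beta
beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = beta


def predict(x_new):
    """Hypothetical helper: estimated conditional mean of y at x_new."""
    return intercept + slope * np.asarray(x_new, dtype=float)


print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

The fitted line estimates the conditional expectation of y given x on this dataset only. Whether it predicts well in a new context, or says anything causal, still has to be argued separately, as the paragraph above notes.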

View More On Wikipedia.org
  1. S

    Experimenting whether the diameter of a pot affects boiling time for water

    Hey guys, I'll try to be as direct as possible. For school I'm doing an experiment at home to find out whether the diameter of a pot affects the time it takes to boil the water inside it, as it says in the title. I had three different pots with three different diameters. I got half a liter...
  2. T

    I Vector visualization of multicollinearity

    The general linear model is $$y=a_0+\sum_{i=1}^{k} a_i x_i$$ In regression analysis one always collects n observations of y at different inputs of the ##x_i##. We need ##n \gg k## or there will be many problems. For each regressor and the response y, we tabulate all observations in a vector ##\textbf{x}_i## and...
  3. F

    I Exploring Nonlinear Least Squares for Regression Analysis

    Hello, Regression analysis is about finding/estimating the coefficients for a particular function ##f## that would best fit the data. The function ##f## could be a straight line, an exponential, a power law, etc. The goal remains the same: finding the coefficients. If the data does not show a...
  4. W

    A Error in (Multi)linear Regression

    Hi, I keep reading varying accounts of the conditions needed to "justify" the use of (multi)linear regression to model data. Specifically, I have seen several authors require errors to be normal and i.i.d., while others only require the errors to be i.i.d. with mean 0. Just where is the assumption of...
  5. D

    MHB Require assistance with possible multiple regression analysis

    I am interested in determining more efficient ways of determining individuals' body fat percentage. To do this, I measure the circumference of a number of segments (10 of them) of the body and determine the person's percentage body fat through underwater weighing. I have done this for 252 total...
  6. M

    A Regression analysis: logarithm or relative change?

    Hi. I am currently studying the market for equity options and the use of these to predict stock returns around company earnings announcements. The dependent variable in my regression analyses has been the relative change in stock price or log-return from the day before the announcement to...
  7. O

    B Megastat's Regression Analysis keeps asking confidence level

    Hi, Anyone out there using Megastat? The course I am in requires using it/knowing how to use it for processes. Whenever I try to get a regression analysis it insists that I need to set a confidence level. I've tried different versions of typing 95%/0.95 to no avail, and I don't know what...
  8. R

    Conceptual Question regarding hypothesis testing regression

    Homework Statement Hi, I had a question regarding testing a regression model's coefficients. Say there is a regression model that has the form: y = b0 + b1x1 + b2x2 + b3x3 + b4x4 + e. For the sake of simplicity let: e be the random error, x1 is age, x2 is severity, and x3 is anxiety. y is...
  9. R

    A F-test regression test, when and how?

    I am aware that F-tests can be used to check the null hypothesis when comparing regression models if the models are nested. What I am confused about is whether I can apply an F-test to compare the following (and if so, what is the best way). I have two regression laws Y = a1*X1 + a2*X2 + b Y =...
  10. W

    A Composing Likert "Subvariables" into a Single Variable

    Hi All, I have many Likert variables regarding a single item issue. Specifically, I am dealing with several measures of IT Dept Quality, like % of budget devoted to IT department, Number of External Audits, etc ; each is measured on a Likert scale. I ultimately want to regress EDIT against...
  11. F

    How can I use the expression for a in this problem

    Homework Statement A random sample of size ##n## from a bivariate distribution is denoted by ##(x_r,y_r), r=1,2,3,...,n##. Show that if the regression line of ##y## on ##x## passes through the origin of its scatter diagram then $$\bar y \sum^n_{r=1} x_r^2=\bar x\sum^n_{r=1} x_r y_r$$ where...
  12. W

    A Follow-Up on F-Test in Multi-Linear Regression

    Hi All, Say we want to linearly regress Y (dependent) against ##X_1, X_2, \dots, X_n## (independent), all numerical variables, to get a model ##Y = a_1 X_1 + \dots + a_n X_n##. Then we test ##H_0: a_1 = a_2 = \dots = a_n = 0## against ##H_1: a_i \neq 0## for some ##i = 1, 2, \dots, n##...
  13. M

    I How do I find the standardized coefficients?

    Hi. I am using Stata to do a regression on longitudinal data. However, it does not produce numbers for standardized coefficients. How do I make Stata produce the standardized coefficients as part of the regression operation? Mons
  14. W

    I Is Adjusting for Weight in TEE Calculation Reliable in Regression Analysis?

    The article "Energy expenditure in adults living in developing compared with industrialized countries: a meta-analysis of doubly labeled water studies" has the following shocking conclusion: The authors argued that the lack of physical activity in industrialized countries had little effect on...
  15. W

    A "Many-to-One" Mapping of Variables in Logistic Regression

    Hi all, I have logistically regressed 3 different numerical variables, v1, v2, v3, separately against the same variable w. All variables have the same type of S-curve (meaning, in this case, that probabilities increase as vi, i=1,2,3, increases). Is there a way of somehow joining the three...
  16. iCloud

    A Regression analysis and Time Series decomposition

    If we can use Regression analysis to forecast, why do we use “Time Series Decomposition”? What's the difference between the 2? Thanks
  17. W

    I Lack of Fit in Ordinal Regression -- Analysis/Alternatives?

    Hi All, I ran a binary logistic of Y on three different numerical variables A,B,C respectively. I am having an issue of separation of variables with all of them, meaning that there are values Ao,Bo, Co for each of A,B,C (different values for each, of course) so that for ## A>Ao, B>Bo...
  18. W

    I Are there Issues with Separation of Values in Ordinal Logistic Regression

    Hi all, just curious if someone knows of any issues with Separation of Points in Ordinal 3-valued Logistic Regression. I think I have an idea of why there are issues with separation in binary Logistic: the need for the S-curve to go to 0 quickly makes the Bo term go to infinity. Are there...
  19. D

    I Regression: which parameters to use and how to plot the data

    Hello! I am still very weak in statistics, but I am learning some basic finance, and this requires creating a regression. Please take a look at the attached files: one Excel file that contains the results of the regression and one screenshot of the StatPlus window that I have to fill in. Before using my...
  20. C

    Underdetermined vs Overdetermined Systems

    I'm trying to create a model which is of the form $$y = (a_0 + a_1 l)\left[b_0 + \sum_{m=1}^{M} b_m \cos(mx - \alpha_m)\right]\left[c_0 + \sum_{n=1}^{N} c_n \cos(nz - \beta_n)\right]$$ In the above system, ##l##, ##x## and ##z## are independent variables and ##y## is the dependent variable. The a, b and c terms are the unknowns. To solve for these unknowns, I have two separate...
  21. A

    Excel trend line vs regression analysis

    I did some data analysis with Excel, fitting some linear, zero-intercept data with a trend line and with the regression analysis tool. The slopes generated by the two methods differed by about 10%. The regression line seemed to be weighted differently. Are these two methods different for some...
  22. E

    Regression Analysis of Tidal Phases

    I have some 3-D model output for a river system that is tidally forced at the entrance. Right now, I'm trying to perform some linear regression on the harmonic constants of various tidal constituents at several locations along the river compared to the observed tidal data. A linear...
  23. S

    Regression Analysis on Theoretical Model

    Hi everyone. I'm a graduate student and am struggling with something that may possibly be trivial. So, my research is creating a mathematical model to represent a real system. I have data points from my real system that I want to compare my model to. How do I do a regression analysis and get...
  24. F

    Regression Analysis - Constructing a Model

    Hello, I am trying to construct a general function/method based on two sets of minimum/maximum data point constraints, which can take on new values in different situations. The only known data for this general function is the starting point (y-axis intercept) and the x-range. The rate of...
  25. R

    MHB Demand and Regression Analysis

    A bit confused with this question. My answers are below each question. Please help. Branded Products, Inc., based in Halfway Tree, is a leading producer and marketer of household laundry detergent and bleach products. About a year ago, Branded Products rolled out its new Super Detergent in four...
  26. E

    Regression Analysis - Needed for research project

    Hello, I'm a second year mathematics and economics student, and I've been hired by an economic development organisation to conduct a research project on the probability of loan default in micro-credit borrowers in rural Kenya (I'll be heading there in person this summer). Basically, I'll...
  27. S

    Question about multiple regression analysis

    Homework Statement My question is q.3 in the attachment. I don't really understand the scenario of the question. The Attempt at a Solution For (a), if X = 1, will the model become: y = (b_1)(E_1) + epsilon? So (b_i)'s are the slopes of the models? But what is the assumption of the...
  28. B

    Regression analysis sample size problem

    Hi there, Could anybody offer any advice on a linear regression sample size problem? I am using regression to predict the energy consumption (watt/mile) of an electric car based on a number of parameters such as average velocity, max velocity, average acceleration, the number of stops...
  29. A

    Regression Analysis: Does it Make Sense?

    Does this even make sense? Am told to do a multiple regression analysis. The response variable and the explanatory variables add up and should give up ~100 percent of the total product. Example: Milk = water + fat + protein ~= 100% (all are in terms of percentages) The regression I was...
  30. D

    Regression Analysis Homework: Best Model & Appropriateness

    Homework Statement based on this data http://www.stat.ufl.edu/~rrandles/sta4210/4210lectures/secondexreview/exam2rev.pdf 1) Consider the full (three predictor) model. Is this model useful? (are any of the predictors worthwhile?) 2) Use the All-Subsets and conduct a search for the best...
  31. D

    Regression Analysis Homework: X1, X3 & Locadv

    Homework Statement based on this data http://www.stat.ufl.edu/~rrandles/sta4210/4210lectures/secondexreview/exam2rev.pdf 1) Consider the full (three predictor) model. Is this model useful? (are any of the predictors worthwhile?) 2) Use the All-Subsets and conduct a search for the...
  32. J

    Multiple regression analysis, econometrics, and statistics

    I am sooo lost in this class, please help. 1. Let the true (population) model be ##y = B_0 + B_1 x_1 + B_2 x_2 + u##, where ##u## is an unobserved error term with ##u \mid x_1, x_2 \sim N(0, \sigma^2)##. Hence, ##u## is normally distributed with mean 0 and variance ##\sigma^2## (i.e., ##E[u \mid x_1, x_2] = 0## and V...
  33. R

    Financial Model Backtesting and Regression Analysis?

    Hello everyone, I have a homework assignment in my financial mathematics class and I don't fully understand it, so here is my problem: I am supposed to backtest a given data set to see if a financial model works, in particular, for 30 maturity dates of the treasuries (bonds) I had to see how...
  34. H

    Multiple Linear Regression Analysis

    Hi, I've asked this question on another forum, but no response until now. Maybe I will have a little bit of luck here. So... I have a problem. I have a set of 8 parameters and I use these parameters to compute a measure (I vary each parameter with a step of 50%). I would like to know...
  35. J

    Performing Regression Analysis on Excel: Two Methods

    Does anyone know how to do a regression analysis in Excel? I need to find the correlation coefficient and the coefficient of determination, and I only know how to do that with a TI-83+ graphing calculator.
  36. S

    Least Squares Regression Analysis - No Idea

    Hello, I am a first year undergraduate university student majoring in Engineering and Computing Sc. One of my courses is Linear Algebra. We have been given an assignment in which question no. 2 is out of syllabus. It is on Least Squares Regression Analysis. This has not been taught to us. We...
  37. F

    Regression analysis - case of multicollinearity

    What are some of the elementary remedial procedures for multicollinearity (VIF >= 10) in linear regression? We were told to simply drop that particular independent variable, but someone else suggested we could center the predictor variables (i.e., xi = Xi - Xbar). Can somebody explain why...
  38. D

    Regression Analysis: Most Sophisticated Methods & Least Squares

    What are the most sophisticated methods of performing regression analysis and how does least squares rank among them? Additionally which category would the least squares method fit into below (if any): Simple, Multiple, Non-linear, Robust, Ridge, Logistic Thanks, -Diffy
  39. C

    Regression Analysis for a Gamma function

    [SOLVED] My regression analysis program, which I developed in BASIC back in the 1980s, handles half a dozen linear equations, some of which are transformed into log forms. I would like to modify my program to include this Gamma function...