What is Regression: Definition and 359 Discussions

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome variable') and one or more independent variables (often called 'predictors', 'covariates', or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion. For example, the method of ordinary least squares computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane). For specific mathematical reasons (see linear regression), this allows the researcher to estimate the conditional expectation (or population average value) of the dependent variable when the independent variables take on a given set of values. Less common forms of regression use slightly different procedures to estimate alternative location parameters (e.g., quantile regression or Necessary Condition Analysis) or estimate the conditional expectation across a broader collection of non-linear models (e.g., nonparametric regression).
Regression analysis is primarily used for two conceptually distinct purposes. First, regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Second, in some situations regression analysis can be used to infer causal relationships between the independent and dependent variables. Importantly, regressions by themselves only reveal relationships between a dependent variable and a collection of independent variables in a fixed dataset. To use regressions for prediction or to infer causal relationships, respectively, a researcher must carefully justify why existing relationships have predictive power for a new context or why a relationship between two variables has a causal interpretation. The latter is especially important when researchers hope to estimate causal relationships using observational data.
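
As a minimal sketch of the ordinary least squares idea described above, the following Python snippet fits a line to a single predictor using the closed-form solution, in which the slope equals Cov(X, Y)/Var(X) and the intercept makes the line pass through the point of means. The function names `fit_line` and `sum_squared_residuals` and the sample data are illustrative assumptions, not part of the original text.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for one predictor."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations; their ratio is
    # Cov(X, Y) / Var(X) because the normalizing factor cancels.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared differences between the data and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = fit_line(xs, ys)
print(f"fitted line: y = {intercept:.3f} + {slope:.3f} x")
print("sum of squared residuals:", sum_squared_residuals(xs, ys, intercept, slope))
```

Moving the intercept or slope away from the fitted values can only increase the sum of squared residuals, which is the criterion the definition refers to. With more than one predictor, the same criterion is usually solved with matrix methods rather than this single-variable formula.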

View More On Wikipedia.org
  1. W

    I Error while doing multi-linear regression with Minitab.

    Hi, I am having trouble doing a multiple linear regression with Minitab. Here is a screenshot of the error message I keep getting: https://www.physicsforums.com/attachments/105043
  2. pluviosilla

    I Slope of LS line = Cov(X,Y)/Var(X). Intuitive explanation?

    The slope of a fitted line = Cov(X,Y)/Var(X). I've seen the derivation of this, and it is pretty straightforward, but I am still having trouble getting an intuitive grasp. The formula is extremely suggestive and it is bothering me that I can't quite see its significance. Perhaps, my mental...
  3. E

    I If you were to perform a linear regression of log10(B) vs log10(x)....

    If you were to perform a linear regression of log10(B) vs log10(x), what would you expect the slope to be? The expected relationship between B and x is ##B(x) = \mu_0 I (2\pi x)^{-1}##
  4. W

    I Multilinear Regression. Interpreting "Insignificant" p-values

    Hi all, hope this is not too simple; please feel free to give me a reference if this is so. I want to know how to address having "insignificant" coefficients in my regression: I just did a multilinear regression returning ## y=a_1x_1+...+a_kx_k ## In the resulting analysis, two of...
  5. J

    A Linear Regression with Non Linear Basis Functions

    So I am currently learning some regression techniques for my research and have been reading a text that describes linear regression in terms of basis functions. I got linear basis functions down and know exactly how to get there because I saw this a lot in my undergrad basically, in matrix...
  6. L

    I Can I get thresholds from logistic regression coefficients?

    Hello, I remember an example of an application of logistic regression to medicine / epidemiology, which said (more or less) that the probability of a person having a myocardial infarction was related to some variables such as age, cholesterol level, etc., and the equation included the various...
  7. S

    Regression Model Homework: E(Y)=Xθ, Cov(Y)=σ²W

    Homework Statement Suppose that we observe ##i=1,2,\ldots,n## independent observations that can be modeled as follows: $$Y_i = i(\theta+\epsilon_i) \quad \text{where} \; \; \epsilon_i \sim N(0,\sigma^2).$$ 1. Write the above as a regression model, ##E(Y) = X\theta##, ##\text{Cov}(Y) =...
  8. D

    I Regression: which parameters to use and how to plot the data

    Hello! I am yet very weak in statistics, but I am learning some basic finance, and this requires to create regression. Please, take a look at attached files - one excel that contains the results of regression and one screen shot of the window of StatPlus that I have to fill in. Before using my...
  9. Josh Terrill

    B Linear regression with two data sets?

    I want to try to predict the USA summer highs using a linear regression. I know I can probably take data from the last 10 summers and plug that in, and use that to predict, but I'd like to use two data sources. 1 data source from the historical highs from past summers in the USA, and the 2nd...
  10. E

    I Isolate variables in nonlinear equation for regression

    Hi all, I have a nonlinear equation of the form: $$\frac{TP_x}{TP_R} = c_0 + c_1 U_R^n + c_2 \frac{T_R^2}{\sqrt{U_R}}$$ This equation describes the relationship between tidal parameters and river discharge (velocity) in tidal rivers derived from the 1-D St. Venant equations. TPx is some tidal...
  11. L

    I Is it better to use a regression or a classification?

    Hi everyone, I am looking at an interesting problem at work, and I would like to hear your opinion about it, please. I'll try to explain it as concisely as I can. We took about 2500 chemical compounds and we tested them in some 'phenotypic assays'. The assays give as output an activity value...
  12. A

    MHB Correlation and Regression

    I just need a little assistance on the last two questions. I place per capita and death in L1 and L2 but I am lost. Country Per Capita Cigarette Consumption 3900...
  13. Iqbal94

    I Approximating and regression method

    Hi guys, I did a few sets of tests to determine the natural frequency of a crane lifting loads. From that, I tried to find two constants from its initial function. a is the stiffness of the pole that was holding the crane, b is the weight of the crane, and x is the weight of the load. The tests...
  14. phosgene

    Comparing SD of data with RMSE of regression line

    Homework Statement I'm being asked to compare the standard deviation of a data set with the root mean square error of the regression line used to model the data, in order to determine the reliability of the regression line. Homework Equations Mean squared error = variance + bias squared The...
  15. FritoTaco

    Mathematical Modeling: Regression

    Homework Statement: Restaurant reservations for a given week were: Monday - 34; Tuesday - 27 (minimum value); Wednesday - 33; Thursday - 47; Friday - 58; Saturday - 61 (maximum value); Sunday - 42. Homework Equations: I don't think you need these equations to help me solve what I'm stuck on. Amplitude...
  16. W

    Multilinear Regression, test for Dependence?

    Hi All, Say we conduct a standard linear regression test of Y (dependent) versus X (independent). Then there are tests to be made on whether there is a linear relationship between Y and X (with ##H_0## being that m=0, where m is the regression line slope, versus ##H_A: m \neq 0##). Is there a similar test...
  17. C

    Non-parametric regression

    Hello all, I am forced to get familiar with this topic quickly and I am struggling with the following after reading it in a paper. Can someone help with the underlined ones? I also have a follow-up question which I will introduce after this. From the paper...
  18. D

    MHB Find Regression Equation for y on x

    How do I determine the regression equation when not much information is given? For example, given the following equations: 2x + y = 13 and 2x + 5y = 20, which one is the regression equation of y on x?
  19. D

    MHB Finding the line of regression

    Two normal equations are given: 5a + 10b = 40 and 10a + 25b = 95. What is the regression line of y on x? I can easily find the common points from both equations, but how do I find the regression coefficient?
  20. S

    Least squares regression

    Homework Statement note a linear regression model with the response variable Y=(Y1..Yn) on a predictor variable X=(X1..Xn). the least squares estimates of the intercept and slope a(hat) and B(hat) are the values that minimize the function: (see attached image) and the problem reads on further...
  21. barryj

    Solve TI84 Regression Issue: 2 Calculators, 1 Result

    I have two TI84 calculators. I entered the following data into L1 and L2 and tried to do a linear regression. My friend also had a TI84 and entered the same data into his calculator. We checked, double checked, and triple checked to see that the data was the same in all three calculators. Here...
  22. Y

    Nonlinear Regression Curve Fitting

    This isn't a precise homework question, but this seemed like the most reasonable place to post. If not, please feel free to move it. I have a large set of data points that should fit to a known equation (the Drude-Smith model for conductivity) The equation the data should fit to is: σ(ω) =...
  23. R

    Obtaining standard deviation of a linear regression intercep

    Hello, I have an experiment that I'm trying to conduct where I measure quantity A and normalize by quantity B. I then want to report normalized quantity A with error bars showing standard deviation. Quantity B is obtained via a standard curve that I generated (8 data points measured once each...
  24. W

    Deciding Diminishing Returns based on Data (Regression)

    Hi All, I am thinking of the issue of diminishing returns re linear regression. Can it be determined/decided from the data itself, or is it decided just from the context? I was thinking of examples like that of grade vs daily study hours or (height )jump length vs year ( winner heights have...
  25. W

    Linear Regression with Many y for each x

    Hi, Say we collect data points ##(x_i,y_j)## to do a linear regression, but so that for each ##x_i ## we collect values ##y_{i1}, y_{i2},...,y_{ij} ## . Is there a standard way of doing linear regression with this type of dataset? Would we, e.g., average the ##y_{ij}## and define it to be ##...
  26. J

    The linear in linear least squares regression

    It is my understanding that you can use linear least squares to fit a plethora of different functions (quadratic, cubic, quartic, etc.). The requirement of linearity applies to the coefficients (i.e., B in (y-Bx)^2). It seems to me that I can find a solution such that a coefficient b_i^2=c_i, in...
  27. NATURE.M

    Logistic regression: Stochastic Gradient Ascent (in Python)

    So I've been following an online course in machine learning offered by Stanford University. I have recently been reading up on logistic regression and stochastic gradient ascent. Here is a link to the original notes: http://cs229.stanford.edu/notes/cs229-notes1.pdf (pages 16-19). Here...
  28. X

    Understanding multivariate linear regression

    I am trying to understand multivariate linear regression. I have a list of times that running processes took, based on several parameters such as % of CPU usage and data read. E.g., I have a process that took 50 seconds to run, with a CPU usage of 70%, and the process read 10 bytes of data. I have...
  29. J

    Software for finding a best-fit ellipse?

    Hello all, I'm calibrating a magnetometer sensor. An uncalibrated sensor will output values that will graph an ellipse. A calibrated sensor will output values that will graph a circle. Since there's data points all over the place, I'd like to find a piece of software that will take my data...
  30. J

    MHB Interpreting regression coefficients

    Hi Guys! I'm new here so I apologise if I'm posting in the wrong area but this looks right to me. So with my (very) limited knowledge of statistics I am trying to interpret my fixed effect regressions. My question is really simple to ensure that I correctly state what is going on with my...
  31. C

    Multiple linear regression

    I am doing a multiple linear regression on a dataset. It is test scores. It has three highly correlated variables being income, reading score, and math score. Obviously since the test score is the sum of the math score and reading score would it be appropriate to exclude them simply based off...
  32. A

    Excel trend line vs regression analysis

    I did some data analysis with Excel, fitting some linear, zero-intercept data with a trend line and the regression analysis tool. The slopes generated by the two methods differed by about 10%. The regression line seemed to be weighted differently. Are these two methods different for some...
  33. M

    Linear regression and measured values

    So I'm trying to identify a system that happens to be a synchronous generator via linear regression. I've got a model with the unknown coefficients A, B and C, and the measured variables I, w and T according to I(w, T) = A*T + B*w + C. 1. What I fear is that I could get multiple solutions that...
  34. H

    Errors of the slope and intercept of a regression line

    I have a set of data with error bars of different lengths on my y values. I want to know what the error is on the slope and intercept of my line of best fit through this data. Is there a numerical way to calculate this that takes into account the fit of the regression line and the y error bars?
  35. quantumdude

    Meta Analysis with Several Regression Studies

    I have come across a problem that I need to solve, and it isn't your garden variety regression problem. It isn't even covered in any of my books, of which I have many. I need either a book title or an online PDF that covers this material. Suppose we have a response variable z_1 that depends on...
  36. H

    Error on regression line slope

    I'm currently trying to determine the error on the slope of a regression line and the y-intercept. My data, listed as (y value, y error, x value), are: (27.44535013, 0.03928063, 136), (29.78207524, 0.07836946, 44), (27.4482858, 0.0385213, 143), (27.27481069...
  37. Q

    How to - regression of noisy titration curve

    I'd appreciate advice on the correct statistical method to analyse a dataset - Dataset is basically a titration curve consisting of [0.5, 1, 2, 3, 4, 5, 6] pg of starting material and 8 replicates in each 'pg bin'. In 'stage 1' of the process each bin is labeled separately, in 'stage 2' all...
  38. F

    Is the correlation coefficient significant in this data set?

    I also made a graph which is not pictured. 1.) Calculate the least squares line. Put the equation in the form of: y-hat = a + bx. I got: y hat = 11.304 + 106.218x a.) Find correlation coefficient. Is it significant? (use the p-value to decide) I got: r = 0.913... no it...
  39. E

    Regression Analysis of Tidal Phases

    I have some 3-D model output for a river system that is tidally forced at the entrance. Right now, I'm trying to perform some linear regression on the harmonic constants of various tidal constituents at several locations along the river compared to the observed tidal data. A linear...
  40. I

    MHB Statistics Mathematics Problem: Linear Regression

    Find the equation of the regression line for the given data. Then construct a scatter plot of the data and draw the regression line. (Each pair of variables has a significant correlation.) Then use the regression equation to predict the value of y for each of the given x-values, if meaningful...
  41. M

    Logistic Regression Research: 97% Concordance, No Sig Variables

    I am doing an independent research project and I have written a logistic regression program in SAS. The percent concordance is 97%, but hardly any variables are significant. Can anyone help me understand why this would happen?
  42. M

    Logistic Regression Research: 99% Concordance, Few Significance

    I am doing some research and running a SAS program using logistic regression. The concordance is 99%, but hardly any variables are significant. Can anyone help me understand what this means?
  43. 9

    Solve simple regression problem

    Homework Statement: Salaries of NBA players related to average points per game: Salary = 1 500 000 + 0.9log(points), n = 150, r^2 = 0.14. 1) What is a bad player's salary? 2) Interpret the coefficient in front of the aggressor 3) What would be the advantages of running it in log-log form...
  44. 9

    Simple regression: not including the intercept term

    Homework Statement The simple regression model is y = α + βx + u, where u is the error term. If you don't include α, when is β unbiased? Homework Equations y = α + βx + u The Attempt at a Solution Not including α doesn't affect whether β is unbiased because α is a constant.
  45. gfd43tg

    Least squares regression outputting function handle

    Homework Statement The number of twists ##y## required to break a certain rod is a function of the percentages ##u## and ##v## of each of two chemical components present in the rod. The following function is proposed ##y(u, v) = a_{1} + a_{2} exp(u^{2}) + a_{3}\sqrt{v} + a_{4}uv## (1) where the...
  46. K

    MATLAB Geographically Weighted Regression in MATLAB

    Does anyone know of sample code, with explanation, for computing Geographically Weighted Regression using MATLAB? I am new to MATLAB.
  47. R

    Linear regression, sources for this wikipedia link

    I really like the derivations here: http://en.wikipedia.org/wiki/Proofs_involving_ordinary_least_squares Could someone recommend a good book for them? I'm tired of googling these equations every time I want to use them. Thanks!
  48. H

    Individual Measurement Uncertainty vs. Standard Error of Regression

    Let's say a student does a simple experiment where she conducts 10 trials at each x value (at each value of the independent variable). She collects data over 30 x values, giving her 300 total trials. For each of the 30 x values, she averages the 10 y values and she calculates the standard...
  49. gfd43tg

    Linear regression with least squares for quadratic function

    Homework Statement: We want to determine the coefficients of a polynomial of the form: ##p(x)=c_{1}x^2 +c_{2}x+c_{3}## The polynomial ##p(x)## must satisfy the constraint ##p(1)=1##. We would also like ##p(x)## to satisfy the following 4 constraints: ##p(−1)=5## ##p(0)=−1## ##p(2)=6##...
  50. B

    Regression formula (calculating trends)

    Hi guys, I have been tasked with the following question; The company has set a fixed budget for Period 1 of the financial year using the following figures as being 100% of the budget: Material costs £200 000 Labour costs £100 000 Variable overheads £50 000 Fixed...