Differential of Multiple Linear Regression

In summary, when interpreting a log-level regression model, the change in Y due to a change in Xk follows from the partial derivative with respect to Xk, which gives the formula (dY/Y) = βk * dXk. To get the total derivative wrt Xk, however, we must use the general formula (dY/dXk) = Y * ∑βj * (dXj/dXk). This reduces to the simplified formula only if there are no dependences between Xk and any other Xj. Therefore, we can say that Y increases by Yβk * δXk if Xk increases by δXk and all other variables remain constant.
  • #1
fny
Say you have a log-level regression as follows:

$$\log Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_n X_n$$

We're trying to come up with a meaningful interpretation for changes in Y due to a change in some Xk.

If we take the partial derivative with respect to Xk, we end up with

$$\frac{dY}{Y} = \beta_k \cdot dX_k$$

which implies that if Xk increases by 1, you expect Y to increase by approximately 100βk percent.
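For intuition, here is a quick numeric sanity check of that rule of thumb with made-up coefficients (the model and numbers are purely illustrative):

```python
import math

# Hypothetical log-level model: log Y = 2.0 + 0.05 * X1 (coefficients are made up)
beta_0, beta_1 = 2.0, 0.05

def y(x1):
    return math.exp(beta_0 + beta_1 * x1)

# Exact percent change in Y when X1 rises by one unit:
exact_pct = (y(11) - y(10)) / y(10) * 100   # = 100 * (e^0.05 - 1), about 5.13%

# The "100 * beta" rule of thumb from the differential:
approx_pct = 100 * beta_1                    # = 5.0%

print(exact_pct, approx_pct)
```

The approximation is good for small coefficients and degrades as βk grows, since the exact change is 100·(e^βk − 1).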

Can someone walk through the calculus to get from this

$$\frac{\partial}{\partial X_k} \log{Y}= \frac{\partial}{\partial X_k} (\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_n X_n)$$

to this

$$\frac{dY}{Y} = \beta_k dX_k$$?

I'm particularly confused about how one transitions from a partial derivative to a total derivative.
 
  • #2
In general one cannot make that transition. The second-to-last formula is correct, but the last is not, unless there is no dependence between ##X_k## and any of the other ##X_j##s. A corrected version of the last formula is:
$$
\frac{dY}{Y} = \sum_{j=1}^n \beta_j dX_j
$$

To get the total derivative wrt ##X_k## we use the total derivative formula, for the case where ##Y## is a function of ##X_1,...,X_n##:

$$\frac{dY}{dX_k}=\sum_{j=1}^n \frac{\partial Y}{\partial X_j} \frac{dX_j}{dX_k}$$

In this case we have ##Y = \exp\left(\beta_0 + \sum_{j=1}^n \beta_j X_j\right)## so that ##\frac{\partial Y}{\partial X_j} = \beta_jY##, and we also have ##\frac{d X_k}{d X_k}=1##, so that the total derivative becomes:
$$\frac{dY}{dX_k}=Y\left(\beta_k + \sum_{\substack{j=1\\j\neq k}}^n \beta_j \frac{dX_j}{dX_k}\right)$$

This reduces to the formula you wrote above for the total derivative if all the ##\frac{dX_j}{dX_k}## are zero, ie if there are no dependences between ##X_k## and any of the other ##X_j##.
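A quick finite-difference check of the total-derivative formula, with made-up coefficients and an assumed dependence ##X_2 = 2X_1## (so ##\frac{dX_2}{dX_1} = 2##):

```python
import math

# Toy model (all coefficients illustrative): log Y = b0 + b1*X1 + b2*X2,
# where X2 depends on X1 via X2 = 2*X1.
b0, b1, b2 = 0.5, 0.3, 0.2

def Y_of_x1(x1):
    x2 = 2.0 * x1          # dX2/dX1 = 2
    return math.exp(b0 + b1 * x1 + b2 * x2)

x1 = 1.0
h = 1e-6
# Numerical total derivative via central difference:
numeric = (Y_of_x1(x1 + h) - Y_of_x1(x1 - h)) / (2 * h)
# The formula above: dY/dX1 = Y * (b1 + b2 * dX2/dX1)
formula = Y_of_x1(x1) * (b1 + b2 * 2.0)

print(numeric, formula)
```

The two agree, and the extra ##\beta_2 \frac{dX_2}{dX_1}## term is exactly what the naive partial-derivative reading would miss.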

What we can say is that ##Y## increases by ##Y\beta_k\delta X_k## if ##X_k## increases by ##\delta X_k## and all other variables do not change.
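Numerically, with all other variables held fixed, the approximation ##\delta Y \approx Y\beta_k\delta X_k## checks out for small ##\delta X_k## (the model below is a toy example with invented coefficients):

```python
import math

# Toy log-level model with two regressors (all numbers are illustrative)
b0, b1, b2 = 1.0, 0.3, -0.1

def Y(x1, x2):
    return math.exp(b0 + b1 * x1 + b2 * x2)

x1, x2 = 2.0, 5.0
dx1 = 0.01                                # small change in X1; X2 held constant

actual_change = Y(x1 + dx1, x2) - Y(x1, x2)
predicted_change = Y(x1, x2) * b1 * dx1   # Y * beta_k * delta X_k

print(actual_change, predicted_change)
```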
 

1. What is the purpose of using multiple linear regression?

Multiple linear regression is a statistical method used to analyze the relationship between two or more independent variables and a dependent variable. It is often used to predict or explain the values of the dependent variable based on the values of the independent variables.
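As a minimal sketch of this, one can fit a multiple linear regression on synthetic data with NumPy's least-squares solver (all data and coefficients here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y depends on two independent variables plus noise
n = 200
X = rng.normal(size=(n, 2))
y = 1.5 + 2.0 * X[:, 0] - 0.7 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Ordinary least squares via the design matrix [1, X1, X2]
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef)   # estimates of [intercept, b1, b2], close to [1.5, 2.0, -0.7]
```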

2. How is the differential of multiple linear regression calculated?

The coefficients of a multiple linear regression are estimated using the method of ordinary least squares, which finds the line of best fit that minimizes the sum of the squared differences between the actual values and the predicted values. Given those coefficients, the differential is the change in the predicted value for a one-unit change in an independent variable, holding the others constant.
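Concretely, for a fitted linear (level-level) model, the unit-change differential is exactly the coefficient. The coefficients and the `predict` helper below are hypothetical:

```python
import numpy as np

# Fitted linear model y_hat = b0 + b1*x1 + b2*x2 (coefficients illustrative)
b = np.array([1.5, 2.0, -0.7])   # [intercept, b1, b2]

def predict(x1, x2):
    return b[0] + b[1] * x1 + b[2] * x2

# Raise x1 by one unit while holding x2 fixed:
delta = predict(x1=4.0, x2=10.0) - predict(x1=3.0, x2=10.0)
print(delta)   # equals b1 = 2.0
```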

3. What is the difference between simple linear regression and multiple linear regression?

The main difference between simple linear regression and multiple linear regression is the number of independent variables used in the analysis. Simple linear regression uses only one independent variable, while multiple linear regression uses two or more independent variables. This allows for a more complex and nuanced analysis of the relationship between the variables.

4. How do you interpret the coefficients in a multiple linear regression model?

The coefficients in a multiple linear regression model represent the expected change in the dependent variable for a one-unit change in the corresponding independent variable, holding all other variables constant. A positive coefficient indicates a positive relationship, while a negative coefficient indicates a negative relationship. The size of the coefficient also reflects the strength of the relationship.

5. What are some common assumptions of multiple linear regression?

Some common assumptions of multiple linear regression include linearity (the relationship between the variables is linear), normality (the residuals are normally distributed), homoscedasticity (the variance of the residuals is constant), and independence (the residuals are not correlated with each other). Violations of these assumptions can affect the accuracy and reliability of the regression model.
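A rough residual check illustrates two of these properties on simulated data (this is an illustrative sketch, not a formal diagnostic test):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data satisfying the linear-model assumptions
x = rng.uniform(0, 10, size=300)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=300)

# Fit by ordinary least squares and compute residuals
A = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ coef

print(resid.mean())                  # near 0: residuals are centered
print(np.corrcoef(resid, x)[0, 1])   # near 0: residuals uncorrelated with the regressor
```

In practice one would also plot residuals against fitted values to eyeball linearity and homoscedasticity.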
