Linear Regression with Non-Linear Basis Functions

Discussion Overview

The discussion revolves around the application of linear regression using non-linear basis functions, specifically Gaussian basis functions. Participants explore the formulation of the model, the construction of the loss function, and the implications of using non-linear transformations in regression analysis.

Discussion Character

  • Technical explanation
  • Mathematical reasoning

Main Points Raised

  • One participant describes their understanding of linear regression with linear basis functions and seeks clarification on extending this to non-linear basis functions, particularly Gaussian functions.
  • Another participant confirms that the parameters to estimate are the coefficients ##w_i## and that the feature values ##(x_1,...,x_L)## are known for each data point.
  • A later reply reiterates the confirmation of the parameters and known values, indicating agreement on this point.
  • One participant explains that the term "linear" in linear regression refers to the coefficients rather than the functions used, providing a detailed formulation of the loss function and its minimization process.
  • The same participant presents the matrix notation for the regression model, clarifying the structure of the design matrix and the calculation of coefficients.
  • A subsequent response expresses appreciation for the clarity of the explanation, noting that it helped resolve confusion regarding the distinction between single and vector random variables in the context of the text being used.

Areas of Agreement / Disagreement

Participants generally agree on the formulation of linear regression with non-linear basis functions and the interpretation of the term "linear." However, the discussion does not resolve all potential uncertainties regarding the implications of using non-linear transformations.

Contextual Notes

The discussion does not address potential limitations or assumptions related to the choice of basis functions or the implications of using non-linear transformations in regression analysis.

joshthekid
So I am currently learning some regression techniques for my research and have been reading a text that describes linear regression in terms of basis functions. I've got linear basis functions down and know exactly how to get there, because I saw this a lot in my undergrad. Basically, in matrix notation
$$y = w^T x$$
you then define your loss function as
$$\frac{1}{n}\sum_{i=1}^n \left(w^T x_i - y_i\right)^2$$
then you take the partial derivatives with respect to ##w##, set them equal to zero, and solve.
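This linear case can be sketched in NumPy. The data below is synthetic and purely illustrative; setting the gradient of the loss to zero gives the normal equations ##X^T X w = X^T y##, which are solved directly here:

```python
import numpy as np

# Toy data: n data points, L features (values are illustrative only)
rng = np.random.default_rng(0)
n, L = 100, 3
X = rng.normal(size=(n, L))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

# Loss: (1/n) * sum_i (w^T x_i - y_i)^2.
# Its gradient w.r.t. w vanishes at the solution of the
# normal equations X^T X w = X^T y, solved here directly.
w_hat = np.linalg.solve(X.T @ X, X.T @ y)

print(w_hat)  # close to w_true
```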

So now I want to use non-linear basis functions. Let's say I want to use ##m## Gaussian basis functions, ##\phi_i##. The procedure is the same, but I am not sure exactly about the construction of the model. Let's say I have ##L## features; is the model equation of the form

$$y_n = w_0 + \sum_{i=1}^{m} w_i \sum_{j=1}^{L} \phi_i(x_j)$$

In other words, have I created a linear combination of ##m## new features, ##\phi(x)##, which are constructed with all ##L## of the previous features for each data point ##n##:

$$y_n = w_0 + w_1\left(\phi_1(x_1) + \phi_1(x_2) + \dots + \phi_1(x_L)\right) + \dots + w_m\left(\phi_m(x_1) + \phi_m(x_2) + \dots + \phi_m(x_L)\right)$$

where the ##x_i## are features / variables for my model and not data values? I hope this makes sense. Thanks in advance.
 
The parameters you wish to estimate are the ##w_i## and the values ##(x_1,...,x_L)## are known for each data point?
 
micromass said:
The parameters you wish to estimate are the ##w_i## and the values ##(x_1,...,x_L)## are known for each data point?

That is correct.
 
Then you have a standard linear regression. Linear refers to the coefficients and not the functions used. Thus your loss function is again

$$L = \sum_{i=1}^n \left(y_i - w_0 - w_1\sum_k \phi_1(x_k) - w_2\sum_k \phi_2(x_k) - \dots - w_N \sum_k \phi_N(x_k)\right)^2$$

and you minimize this by taking partial derivatives and setting them equal to ##0##. In matrix notation, you let ##Y## be the column matrix with entries the ##y_i## and you let ##X## be the design matrix whose ##i##th row is
$$\left(1~~\sum_k \phi_1(x_k)~~ \dots~~\sum_k \phi_N(x_k)\right)$$
The coefficients are then ##W = (X^TX)^{-1} X^T Y##.
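This recipe can be sketched in NumPy. The Gaussian centers, the common width, and the targets below are arbitrary choices for illustration, not something fixed by the thread:

```python
import numpy as np

rng = np.random.default_rng(1)
n, L, m = 200, 4, 5          # data points, features, Gaussian basis functions
X_raw = rng.uniform(-1, 1, size=(n, L))

# m Gaussian basis functions with (arbitrarily chosen) centers and a common width
centers = np.linspace(-1, 1, m)
width = 0.5

def phi(i, x):
    """i-th Gaussian basis function, applied elementwise."""
    return np.exp(-((x - centers[i]) ** 2) / (2 * width ** 2))

# Design matrix: i-th row is (1, sum_k phi_1(x_k), ..., sum_k phi_m(x_k))
Phi = np.column_stack(
    [np.ones(n)] + [phi(i, X_raw).sum(axis=1) for i in range(m)]
)

y = rng.normal(size=n)                       # placeholder targets
W = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)  # normal equations: (X^T X) W = X^T Y
print(W.shape)                               # m + 1 coefficients: w_0, w_1, ..., w_m
```

Solving the normal equations with `np.linalg.solve` is the direct translation of ##W = (X^TX)^{-1} X^T Y##; in practice `np.linalg.lstsq(Phi, y)` is numerically preferable, since it avoids forming ##X^T X## explicitly.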
 
micromass said:
Then you have a standard linear regression. Linear refers to the coefficients and not the functions used. Thus your loss function is again

$$L = \sum_{i=1}^n \left(y_i - w_0 - w_1\sum_k \phi_1(x_k) - w_2\sum_k \phi_2(x_k) - \dots - w_N \sum_k \phi_N(x_k)\right)^2$$

and you minimize this by taking partial derivatives and setting them equal to ##0##. In matrix notation, you let ##Y## be the column matrix with entries the ##y_i## and you let ##X## be the design matrix whose ##i##th row is
$$\left(1~~\sum_k \phi_1(x_k)~~ \dots~~\sum_k \phi_N(x_k)\right)$$
The coefficients are then ##W = (X^TX)^{-1} X^T Y##.
Great, thanks! This is what I thought it meant, but the way you wrote it makes it a lot clearer than the text I am using, which has all its formulas in matrix notation, so it's hard to tell whether they are talking about a single random variable or a vector of random variables.
 
