Optimizing Plane Fitting Using Stochastic Gradient Descent

SUMMARY

This discussion covers fitting a plane z = w_1 + w_2x + w_3y to a dataset of points (x_i, y_i, z_i) with Stochastic Gradient Descent (SGD) by minimizing the sum-of-squares objective Q(w). The poster writes out the per-sample update for the parameters w_1, w_2, and w_3 with step size α and reports that a C++ implementation works well. A reply confirms that the approach looks right.

PREREQUISITES
  • Understanding of Stochastic Gradient Descent (SGD)
  • Familiarity with objective functions in optimization
  • Basic knowledge of C++ programming
  • Concept of parameter updates in iterative algorithms
NEXT STEPS
  • Study the mathematical foundations of Stochastic Gradient Descent
  • Implement plane fitting using Python with libraries like NumPy
  • Explore advanced optimization techniques such as Adam or RMSprop
  • Learn about regularization methods to improve model robustness
USEFUL FOR

Data scientists, machine learning practitioners, and software developers interested in implementing optimization algorithms for 3D data fitting.

zzmanzz

Homework Statement



Suppose I wish to fit a plane
z = w_1 + w_2 x + w_3 y
to a data set (x_1,y_1,z_1), ... ,(x_n,y_n,z_n) using gradient descent.

Homework Equations



http://en.wikipedia.org/wiki/Stochastic_gradient_descent

The Attempt at a Solution



I'm basically trying to figure out the 3-dimensional version of the example on Wikipedia.
The objective function to be minimized is:

Q(w) = \sum_{i = 1}^n Q_i(w) = \sum_{i = 1}^n (w_1 + w_2x_i + w_3y_i - z_i)^2
I want to find the parameters w_1, w_2, w_3.
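
For reference, the gradient of a single term Q_i follows from the chain rule. The sketch below writes r_i for the residual of the i-th point; that symbol is introduced here for brevity and is not part of the original post.

```latex
r_i(w) = w_1 + w_2 x_i + w_3 y_i - z_i, \qquad Q_i(w) = r_i(w)^2

\frac{\partial Q_i}{\partial w_1} = 2\, r_i(w), \qquad
\frac{\partial Q_i}{\partial w_2} = 2\, x_i\, r_i(w), \qquad
\frac{\partial Q_i}{\partial w_3} = 2\, y_i\, r_i(w)
```

The three partial derivatives differ only in the factor multiplying the residual (1, x_i, y_i), which mirrors the three terms of the plane equation.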

The iterative method starts from initial parameters w^{(0)}_1, w^{(0)}_2, w^{(0)}_3.
One step of the iteration, using a single sample (x_i, y_i, z_i), is
\begin{pmatrix} w^{(1)}_1 \\ w^{(1)}_2 \\ w^{(1)}_3 \end{pmatrix}
= \begin{pmatrix} w^{(0)}_1 \\ w^{(0)}_2 \\ w^{(0)}_3 \end{pmatrix}
- \alpha \begin{pmatrix}
2(w^{(0)}_1 + w^{(0)}_2 x_i + w^{(0)}_3 y_i - z_i) \\
2x_i(w^{(0)}_1 + w^{(0)}_2 x_i + w^{(0)}_3 y_i - z_i) \\
2y_i(w^{(0)}_1 + w^{(0)}_2 x_i + w^{(0)}_3 y_i - z_i)
\end{pmatrix}

where \alpha > 0 is the step size. The gradient term is subtracted because the step has to move downhill on Q_i.
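
The poster's C++ code is not shown in the thread, so the following is only a minimal sketch of how such an update loop might look, not a reconstruction of it. The names `Point`, `Plane`, `fit_plane_sgd`, `rms_residual`, and `demo_points` are hypothetical, as are the zero initial guess, the fixed seed, and the choice to draw one random sample per step. Each step subtracts the step size times the per-sample gradient of the squared residual.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// One data point (x, y, z). Hypothetical helper type.
struct Point { double x, y, z; };

// Plane parameters for z = w1 + w2*x + w3*y. Hypothetical helper type.
struct Plane { double w1, w2, w3; };

// Stochastic gradient descent on Q(w) = sum_i (w1 + w2*x_i + w3*y_i - z_i)^2.
// Each step picks one sample at random and moves against the gradient of Q_i.
// Assumptions: zero initial guess, constant step size alpha, fixed RNG seed.
Plane fit_plane_sgd(const std::vector<Point>& pts, double alpha, int epochs,
                    unsigned seed = 42) {
    assert(alpha > 0.0);
    Plane w{0.0, 0.0, 0.0};
    if (pts.empty()) return w;

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, pts.size() - 1);

    for (int e = 0; e < epochs; ++e) {
        for (std::size_t k = 0; k < pts.size(); ++k) {
            const Point& p = pts[pick(rng)];
            // Residual of the current plane at the sampled point.
            const double r = w.w1 + w.w2 * p.x + w.w3 * p.y - p.z;
            // w <- w - alpha * grad Q_i(w), with grad Q_i = 2*r*(1, x, y).
            w.w1 -= alpha * 2.0 * r;
            w.w2 -= alpha * 2.0 * r * p.x;
            w.w3 -= alpha * 2.0 * r * p.y;
        }
    }
    return w;
}

// Root-mean-square residual of a plane over the data set.
double rms_residual(const std::vector<Point>& pts, const Plane& w) {
    if (pts.empty()) return 0.0;
    double sum = 0.0;
    for (const Point& p : pts) {
        const double r = w.w1 + w.w2 * p.x + w.w3 * p.y - p.z;
        sum += r * r;
    }
    return std::sqrt(sum / static_cast<double>(pts.size()));
}

// Noise-free samples of z = 1 + 2x - 3y on a 5x5 grid over [0,1]^2 (made-up test data).
std::vector<Point> demo_points() {
    std::vector<Point> pts;
    for (int i = 0; i < 5; ++i) {
        for (int j = 0; j < 5; ++j) {
            const double x = i / 4.0;
            const double y = j / 4.0;
            pts.push_back({x, y, 1.0 + 2.0 * x - 3.0 * y});
        }
    }
    return pts;
}
```

Calling `fit_plane_sgd(demo_points(), 0.05, 2000)` should recover parameters close to (1, 2, -3) on this noise-free data. With noisy data a constant step size leaves the iterates jittering around the least-squares solution, so a decreasing step size is a common alternative. The step size also has to be small relative to the scale of x and y, otherwise the iteration can diverge.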
 
Looks good. What's your question?
 
Just wanted to make sure that I didn't cheat or something in my solution. When I run it in C++ it works very well.
 
