Dario56

Gradient descent is a numerical optimization method for finding a local or global minimum of a function. It is given by the following formula: $$ x_{n+1} = x_n - \alpha \nabla f(x_n) $$ There is countless content on the internet about this method's use in machine learning. However, there is one thing I don't understand and couldn't find an answer to, even though it is basic.

What exactly is the step size ## \alpha ##?

Wikipedia states that it is a tuning parameter in the optimization algorithm, which I understand, but not enough is said about it for that to count as a definition. Dimensional analysis says its dimensions should be ## \frac{(\Delta x)^2}{\Delta y} ## (since ## \nabla f ## has dimensions ## \frac{\Delta y}{\Delta x} ## and the update term must have the dimensions of ## x ##), which I am not sure how to interpret.
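To make the role of ## \alpha ## concrete, here is a minimal sketch (in Python, with a hypothetical `gradient_descent` helper and the example function ## f(x) = x^2 ##, neither of which is from the original post) showing that the step size simply scales the gradient to produce each update, and that choosing it too large makes the iterates diverge:

```python
def gradient_descent(grad, x0, alpha, steps):
    """Iterate x_{n+1} = x_n - alpha * grad(x_n) for a fixed number of steps."""
    x = x0
    for _ in range(steps):
        # alpha (the step size) scales the gradient to give the update length.
        x = x - alpha * grad(x)
    return x

# Example: f(x) = x^2, whose gradient is 2x; the minimum is at x = 0.
grad = lambda x: 2 * x

# A moderate step size converges toward the minimum ...
x_small = gradient_descent(grad, x0=10.0, alpha=0.1, steps=100)

# ... while too large a step size overshoots the minimum on every
# iteration, so the iterates grow in magnitude instead of shrinking.
x_large = gradient_descent(grad, x0=10.0, alpha=1.1, steps=100)
```

Here the update rule with ## \alpha = 0.1 ## gives ## x_{n+1} = 0.8\, x_n ##, which contracts toward zero, while ## \alpha = 1.1 ## gives ## x_{n+1} = -1.2\, x_n ##, which blows up.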
