AI Computation (self-study analysis, pointers welcome)

In summary, the author is trying to derive a closed-form expression for the gradient used in gradient descent for a single sigmoid neuron. The calculation is involved, and the author is unsure whether the gradient has been computed correctly.
  • #1
Chenkel
TL;DR Summary
AI Computation problem statement: mathematically represent a single sigmoid neuron with an arbitrary number of weights and inputs, and derive the backpropagation formula used in stochastic gradient descent.

I'm looking to improve my understanding of the algorithm, and hopefully create a thread that could be useful to someone facing a similar AI problem.
In this thread I attempt to derive, using basic calculus, a closed-form gradient descent update for a single sigmoid neuron.

If you would like to give pointers, feel free; and if you see me make a mistake, please let me know!

Thank you!
 
  • #2
Activation function:
$$\sigma(z) = \frac 1 {1+e^{-z}}$$
Inputs to the sigmoid neuron:
$$I = (I_1, I_2, ..., I_m)$$
Weights:
$$w = (w_1, w_2, ..., w_m)$$
Output of the sigmoid neuron:
$$\theta(w, b) = \sigma(z(w, b)) \quad \text{where} \quad z(w, b) = I \cdot w + b$$
Cost function ##f(\theta)## for expected value ##E##:
$$f(\theta) = (\theta - E)^2$$
Gradient of ##f(\theta)##:
$$\nabla f = \left(\left(\frac {\partial f} {\partial w_1}, ..., \frac {\partial f} {\partial w_m}\right), \frac {\partial f} {\partial b}\right)$$
By the chain rule,
$$\frac {\partial f} {\partial w_i} = 2(\theta - E)\frac {\partial \theta} {\partial w_i}, \qquad \frac {\partial \theta} {\partial w_i} = \frac {d\sigma} {dz}\frac {\partial z} {\partial w_i}$$
Derivative of the sigmoid:
$$\frac {d\sigma} {dz} = \frac {e^{-z}} {(1 + e^{-z})^2} = \left(\frac 1 \sigma - 1\right)\sigma^2 = \sigma(1 - \sigma)$$
Since ##z## is linear in the weights,
$$\frac {\partial z} {\partial w_i} = I_i$$
so
$$\frac {\partial \theta} {\partial w_i} = I_i\,\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial w_i} = 2(\theta - E)\,I_i\,\sigma(1 - \sigma)$$
For the bias, ##\frac {\partial z} {\partial b} = 1##, so
$$\frac {\partial f} {\partial b} = 2(\theta - E)\frac {\partial \theta}{\partial b}, \qquad \frac {\partial \theta}{\partial b} = \frac {d\sigma}{dz}\frac {\partial z}{\partial b} = \sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial b} = 2(\theta - E)\,\sigma(1 - \sigma)$$
Comparing the two results,
$$\frac {\partial f} {\partial w_i} = I_i\,\frac {\partial f} {\partial b}$$
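As a numerical sanity check of the formulas above (my own addition, not part of the original derivation; the helper names `sigmoid` and `neuron_gradient` and the example numbers are made up for illustration), here is a minimal Python/NumPy sketch that evaluates the neuron and its gradient:

```python
import numpy as np

def sigmoid(z):
    """Activation function: sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_gradient(inputs, weights, bias, expected):
    """Gradient of f(theta) = (theta - E)^2 for a single sigmoid neuron.

    Returns (df_dw, df_db), where
      df_db    = 2 (theta - E) sigma (1 - sigma)
      df_dw[i] = I_i * df_db
    """
    z = np.dot(inputs, weights) + bias      # z = I . w + b
    theta = sigmoid(z)                      # neuron output
    df_db = 2.0 * (theta - expected) * theta * (1.0 - theta)
    df_dw = inputs * df_db                  # df/dw_i = I_i * df/db
    return df_dw, df_db

# Tiny example with m = 3 inputs (arbitrary illustrative numbers).
I = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.2, -0.3])
b = 0.05
E = 1.0
print(neuron_gradient(I, w, b, E))
```

Comparing these values against a finite-difference approximation of ##f## is an easy way to confirm the derivation.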
 
  • #4
I'm still working on the problem; I want to add the learning rate parameter and show the equations for gradient descent. I'll be posting throughout the day. Feel free to post here to add to the discussion of the problem at hand.
 
  • #5
I am a little unsure whether I calculated the gradient properly. I'm confused about how to go about calculating the gradient of a function that depends on a vector of weights w and a scalar bias b. It seems a little messy, and I'm wondering if there is a conceptual flaw in the way I am doing it.

Any feedback is welcome, thank you!
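One way to make the bookkeeping less messy (a suggestion of mine, not something from the original post) is to fold the bias into the weight vector: append a constant 1 to the inputs and treat ##b## as one more weight, so the gradient becomes a single flat ##(m+1)##-vector. A minimal sketch, with made-up names:

```python
import numpy as np

# Fold the bias into the weight vector: append a constant 1 to the inputs
# so z = I . w + b becomes a single dot product x . p with p = (w_1, ..., w_m, b).
def gradient_packed(inputs, params, expected):
    x = np.append(inputs, 1.0)               # augmented input (I_1, ..., I_m, 1)
    theta = 1.0 / (1.0 + np.exp(-np.dot(x, params)))
    # The gradient is one flat (m + 1)-vector:
    # the first m entries are df/dw_i, the last entry is df/db.
    return 2.0 * (theta - expected) * theta * (1.0 - theta) * x

p = np.append(np.array([0.1, 0.2, -0.3]), 0.05)   # (w, b) packed together
grad = gradient_packed(np.array([0.5, -1.0, 2.0]), p, 1.0)
print(grad)
```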
 
  • #6
The following is the update rule for computing the new weights and bias recursively, using learning rate ##\eta##:
$$(w, b) := (w, b) - \eta\nabla f = (w, b) - \eta\left(\left(I_1\frac {\partial f} {\partial b}, I_2\frac {\partial f} {\partial b}, ..., I_m\frac {\partial f} {\partial b}\right), \frac {\partial f} {\partial b}\right)$$
Does this look correct?

Does anyone know how the learning rate is ideally chosen?
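To illustrate the update rule above concretely, here is a minimal self-contained Python sketch (not from the thread; the numbers and ##\eta = 0.5## are arbitrary illustrative choices, not recommended values) that applies the rule repeatedly to a single training example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary illustrative values; eta = 0.5 is just an example setting.
eta = 0.5
I = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.2, -0.3])
b = 0.05
E = 1.0

for step in range(1000):
    theta = sigmoid(np.dot(I, w) + b)
    df_db = 2.0 * (theta - E) * theta * (1.0 - theta)   # df/db
    df_dw = I * df_db                                   # df/dw_i = I_i * df/db
    w = w - eta * df_dw                                 # (w, b) := (w, b) - eta * grad f
    b = b - eta * df_db

print(sigmoid(np.dot(I, w) + b))   # the output drifts toward E = 1.0
```

In practice the learning rate is usually treated as a hyperparameter chosen by experiment rather than solved for in closed form.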
 

1. What exactly is AI Computation?

AI Computation (Artificial Intelligence Computation) refers to the use of computer algorithms and models to mimic human intelligence and perform tasks that typically require human cognition. It involves techniques such as machine learning, deep learning, and natural language processing to analyze and interpret data, make decisions, and solve complex problems.

2. How can I learn about AI Computation through self-study?

There are several resources available for self-study of AI Computation, including online courses, tutorials, books, and research papers. Some popular options include Coursera's "AI for Everyone" course, MIT's "Intro to Deep Learning" course, and the book "Artificial Intelligence: A Modern Approach" by Stuart Russell and Peter Norvig. It is also helpful to practice coding and implementing algorithms to gain a deeper understanding of the concepts.

3. What are the main applications of AI Computation?

AI Computation has a wide range of applications, including natural language processing, image and speech recognition, autonomous vehicles, virtual assistants, and predictive analytics. It is also used in various industries such as healthcare, finance, transportation, and education to improve efficiency and decision-making processes.

4. What are some important pointers to keep in mind while studying AI Computation?

Some important pointers for studying AI Computation include having a strong foundation in mathematics and statistics, understanding different machine learning algorithms and their applications, staying updated with the latest developments and research in the field, and continuously practicing and experimenting with coding and implementing AI models.

5. What are the ethical considerations surrounding AI Computation?

As AI Computation becomes more advanced and integrated into our daily lives, there are ethical concerns that need to be addressed, such as bias in algorithms, data privacy, and potential job displacement. It is important for individuals studying AI Computation to also be aware of these issues and consider the ethical implications of their work.
