AI Computation (self-study analysis, pointers welcome)

AI Thread Summary
The discussion focuses on finding a closed-form solution for the gradient descent problem related to a single sigmoid neuron using basic calculus. Key equations are presented, including the activation function, cost function, and gradients for weights and bias. The author expresses uncertainty about the correctness of their gradient calculations and seeks feedback on their approach, particularly regarding vector-valued functions. They plan to incorporate learning rate parameters into their equations and are looking for guidance on determining an ideal learning rate. The thread invites contributions and resources from other participants to enhance the analysis.
Chenkel
TL;DR Summary
AI Computation problem statement: mathematically represent a single sigmoid neuron with an arbitrary number of weights and inputs, then derive the backpropagation update formula using stochastic gradient descent.

I'm looking to improve my understanding of the algorithm, and hopefully create a thread that could be useful to someone facing a similar AI problem.
In this thread I attempt to find a closed-form solution to the gradient descent problem for a single sigmoid neuron using basic calculus.

If you would like to give pointers, feel free; if you see me make a mistake, please let me know!

Thank you!
 
Activation function:
$$\sigma(z) = \frac 1 {1+e^{-z}}$$
Inputs to the sigmoid neuron:
$$I=(I_1, I_2, ..., I_m)$$
Weights:
$$w = (w_1, w_2, ..., w_m)$$
Output of the sigmoid neuron:
$$\theta(w, b) = \sigma(z(w, b))$$
where
$$z(w, b) = I \cdot w + b$$
Cost function ##f(\theta)## for expected value ##E##:
$$f(\theta) = (\theta - E)^2$$
Gradient of ##f(\theta)##:
$$\nabla f = \left(\left(\frac {\partial f} {\partial w_1}, ..., \frac {\partial f} {\partial w_m}\right), \frac {\partial f} {\partial b}\right)$$
$$\frac {\partial f} {\partial w_i} = 2(\theta - E)\frac {\partial \theta} {\partial w_i}$$
$$\frac {\partial \theta} {\partial w_i} = \frac {d\sigma} {dz}\frac {\partial z} {\partial w_i}$$
Since ##\frac {d} {dz}(1 + e^{-z}) = -e^{-z}##, the two minus signs from the chain rule cancel, and the derivative is positive:
$$\frac {d\sigma} {dz} = \frac {e^{-z}} {(1 + e^{-z})^2}=\left(\frac 1 \sigma - 1\right)\sigma^2=\sigma(1 - \sigma)$$
$$\frac {\partial z} {\partial w_i} = I_i$$
$$\frac {\partial \theta} {\partial w_i} = I_i\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial w_i} = 2(\theta - E)I_i\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial b} = 2(\theta - E)\frac {\partial \theta}{\partial b}$$
$$\frac {\partial \theta}{\partial b}=\frac {d\sigma}{dz}\frac {\partial z}{\partial b}=\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial b} = 2(\theta - E)\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial w_i} = I_i\frac {\partial f} {\partial b}$$
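To sanity-check these formulas numerically, here is a minimal sketch in Python with NumPy. The function names `sigmoid` and `gradients` and the sample values are my own choices, not from any library; the script computes ##\frac {\partial f} {\partial w_i}## and ##\frac {\partial f} {\partial b}## for one training example and compares the bias gradient against a central finite difference.

```python
import numpy as np

def sigmoid(z):
    # Activation: sigma(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def gradients(I, w, b, E):
    """Gradient of f(theta) = (theta - E)^2 for one example (I, E)."""
    z = np.dot(I, w) + b                 # z = I . w + b
    theta = sigmoid(z)                   # neuron output
    dsigma_dz = theta * (1.0 - theta)    # d(sigma)/dz = sigma * (1 - sigma)
    df_db = 2.0 * (theta - E) * dsigma_dz
    df_dw = I * df_db                    # df/dw_i = I_i * df/db
    return df_dw, df_db

# Sanity check: compare df/db against a central finite difference.
I = np.array([0.5, -1.2, 0.3])
w = np.array([0.1, 0.4, -0.7])
b, E, eps = 0.2, 1.0, 1e-6

df_dw, df_db = gradients(I, w, b, E)
cost = lambda bb: (sigmoid(np.dot(I, w) + bb) - E) ** 2
numeric = (cost(b + eps) - cost(b - eps)) / (2.0 * eps)
print(df_db, numeric)   # the two values should agree to ~1e-9
```

If the analytic and numeric values disagree in sign or magnitude, that usually means a dropped minus sign somewhere in the chain rule, which is exactly the kind of mistake this check catches.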
 
I'm still working on the problem; I want to add the learning-rate parameter and show the equations for gradient descent. I'll be posting throughout the day. Feel free to post here to add to the discussion on the problem at hand.
 
I am a little unsure whether I calculated the gradient properly. I'm confused about how to go about calculating the gradient of a vector-valued function with weights ##w## and a bias (scalar value) ##b##. It seems a little messy, and I'm wondering if there is a conceptual flaw in the way I am doing it.

Any feedback is welcome, thank you!
 
The following is the formula to compute the new set of weights and bias iteratively using learning rate ##\eta##:
$$(w, b) := (w, b) - \eta\,\nabla f = (w, b) - \eta\left(\left(I_1\frac {\partial f} {\partial b}, I_2\frac {\partial f} {\partial b}, ..., I_m\frac {\partial f} {\partial b}\right), \frac {\partial f} {\partial b}\right)$$
Does this look correct?
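As a quick illustration, here is that update rule written as a loop, reusing `gradients`, `sigmoid`, and the sample values from the sketch above; the value of ##\eta## and the step count are arbitrary choices of mine, not recommendations.

```python
eta = 0.5   # learning rate (arbitrary illustrative value)

for step in range(100):
    df_dw, df_db = gradients(I, w, b, E)   # from the snippet above
    w = w - eta * df_dw    # w_i := w_i - eta * I_i * (df/db)
    b = b - eta * df_db    # b   := b   - eta * (df/db)

print(sigmoid(np.dot(I, w) + b))   # output should approach E = 1.0
```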

Does anyone know how to determine an ideal learning rate?
 