AI Computation (self study analysis, pointers welcome)

Summary
The discussion focuses on finding a closed form solution for the gradient descent problem related to a single sigmoid neuron using basic calculus. Key equations are presented, including the activation function, cost function, and gradients for weights and bias. The author expresses uncertainty about the correctness of their gradient calculations and seeks feedback on their approach, particularly regarding vector-valued functions. They plan to incorporate learning rate parameters into their equations and are looking for guidance on determining an ideal learning rate. The thread invites contributions and resources from other participants to enhance the analysis.
TL;DR
AI Computation problem statement: mathematically represent a single sigmoid neuron with an arbitrary number of weights and inputs, and derive the backpropagation update formula using stochastic gradient descent.

I'm looking to improve my understanding of the algorithm and, hopefully, create a thread that could be useful to someone facing a similar AI problem.
In this thread I attempt to derive, using basic calculus, a closed-form expression for the gradient used in gradient descent for a single sigmoid neuron.

If you would like to give pointers, feel free; if you see me make a mistake, please let me know!

Thank you!
 
Activation function:
$$\sigma(z) = \frac 1 {1+e^{-z}}$$
Inputs to the sigmoid neuron:
$$I=(I_1, I_2, ..., I_m)$$
Weights:
$$w = (w_1, w_2, ..., w_m)$$
Output of the sigmoid neuron:
$$\theta(w, b) = \sigma(z(w, b))$$
where
$$z(w, b) = I \cdot w + b$$
Cost function ##f(\theta)## for expected value ##E##:
$$f(\theta) = (\theta - E)^2$$
Gradient of ##f(\theta)##:
$$\nabla f = \left(\left(\frac {\partial f} {\partial w_1}, ..., \frac {\partial f} {\partial w_m}\right), \frac {\partial f} {\partial b}\right)$$
$$\frac {\partial f} {\partial w_i} = 2(\theta - E)\frac {\partial \theta} {\partial w_i}$$
$$\frac {\partial \theta} {\partial w_i} = \frac {d\sigma} {dz}\frac {\partial z} {\partial w_i}$$
$$\frac {d\sigma} {dz} = \frac {e^{-z}} {(1 + e^{-z})^2}=\left(\frac 1 \sigma - 1\right)\sigma^2=\sigma(1 - \sigma)$$
$$\frac {\partial z} {\partial w_i} = I_i$$
$$\frac {\partial \theta} {\partial w_i} = I_i\,\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial w_i} = 2(\theta - E)\,I_i\,\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial b} = 2(\theta - E)\frac {\partial \theta}{\partial b}$$
$$\frac {\partial \theta}{\partial b}=\frac {d\sigma}{dz}\frac {\partial z}{\partial b}=\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial b} = 2(\theta - E)\,\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial w_i} = I_i\frac {\partial f} {\partial b}$$
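To sanity-check these partial derivatives, here is a minimal sketch (my own, not from the original post) that computes the analytic gradient for a single sigmoid neuron and compares it against a central finite-difference estimate. The names `I`, `w`, `b`, `E` mirror the symbols above, and the cost is ##(\theta - E)^2##; the helper names are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(I, w, b):
    """Output theta of a single sigmoid neuron: sigma(I . w + b)."""
    return sigmoid(np.dot(I, w) + b)

def cost(I, w, b, E):
    """Squared-error cost f(theta) = (theta - E)^2."""
    return (forward(I, w, b) - E) ** 2

def analytic_gradient(I, w, b, E):
    """Gradient of f w.r.t. (w, b), using df/dw_i = I_i * df/db."""
    theta = forward(I, w, b)
    # d(sigma)/dz = sigma * (1 - sigma)
    df_db = 2.0 * (theta - E) * theta * (1.0 - theta)
    df_dw = I * df_db
    return df_dw, df_db

# Finite-difference check of the analytic gradient.
rng = np.random.default_rng(0)
I = rng.normal(size=4)
w = rng.normal(size=4)
b, E, eps = 0.3, 1.0, 1e-6

df_dw, df_db = analytic_gradient(I, w, b, E)
num_dw = np.array([
    (cost(I, w + eps * np.eye(4)[i], b, E)
     - cost(I, w - eps * np.eye(4)[i], b, E)) / (2 * eps)
    for i in range(4)
])
num_db = (cost(I, w, b + eps, E) - cost(I, w, b - eps, E)) / (2 * eps)

print(np.allclose(df_dw, num_dw, atol=1e-6), np.isclose(df_db, num_db, atol=1e-6))
```

If the derivation above is right, both checks should print `True`.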
 
I'm still working on the problem; I want to add the learning rate parameter and show the equations for gradient descent. I'll be posting throughout the day. Feel free to post here to add to the discussion of the problem at hand.
 
I'm a little unsure whether I calculated the gradient properly, and I'm confused about how to compute the gradient of a function of both a weight vector w and a scalar bias b. It seems a little messy, and I'm wondering if there is a conceptual flaw in the way I'm doing it.

Any feedback is welcome, thank you!
 
The following is the formula to compute the new weights and bias recursively using learning rate ##\eta##:$$(w, b) := (w, b) - \eta{\nabla}f = (w, b) - \eta\left(\left(I_1\frac {\partial f} {\partial b}, I_2\frac {\partial f} {\partial b}, ..., I_m\frac {\partial f} {\partial b}\right), \frac {\partial f} {\partial b}\right)$$Does this look correct?
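As an illustration of that update rule (again my own sketch, not from the thread), one stochastic gradient descent step with learning rate ##\eta## could look like this, reusing `forward` and `analytic_gradient` from the sketch above:

```python
def sgd_step(I, w, b, E, eta=0.1):
    """One SGD step: (w, b) := (w, b) - eta * grad f."""
    df_dw, df_db = analytic_gradient(I, w, b, E)
    return w - eta * df_dw, b - eta * df_db

# Repeatedly applying the step drives theta toward the expected value E.
for _ in range(1000):
    w, b = sgd_step(I, w, b, E, eta=0.5)
print(forward(I, w, b))  # should approach E (within the sigmoid's (0, 1) range)
```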

Does anyone know of a principled way to choose the learning rate?
 
