AI Computation (self-study analysis, pointers welcome)

In summary, the author is trying to derive a closed-form expression for the gradient used in gradient descent for a single sigmoid neuron. The calculation is involved, and the author is unsure whether the gradient has been computed correctly.
  • #1
Chenkel
TL;DR Summary
AI Computation problem statement: mathematically represent a single sigmoid neuron with an arbitrary number of weights and inputs, and derive the backpropagation formula used in stochastic gradient descent.

I'm looking to improve my understanding of the algorithm, and hopefully create a thread that could be useful to someone facing a similar AI problem.
In this thread I attempt to derive, using basic calculus, a closed-form gradient descent update for a single sigmoid neuron.

If you would like to give pointers, feel free; and if you see me make a mistake, please let me know!

Thank you!
 
  • #2
Activation function:
$$\sigma(z) = \frac 1 {1+e^{-z}}$$
Inputs to the sigmoid neuron:
$$I = (I_1, I_2, ..., I_m)$$
Weights:
$$w = (w_1, w_2, ..., w_m)$$
Output of the sigmoid neuron:
$$\theta(w, b) = \sigma(z(w, b)) \quad \text{where} \quad z(w, b) = I \cdot w + b$$
Cost function ##f(\theta)## for expected value ##E##:
$$f(\theta) = (\theta - E)^2$$
Gradient of ##f(\theta)##:
$$\nabla f = \left(\left(\frac {\partial f} {\partial w_1}, ..., \frac {\partial f} {\partial w_m}\right), \frac {\partial f} {\partial b}\right)$$
By the chain rule,
$$\frac {\partial f} {\partial w_i} = 2(\theta - E)\frac {\partial \theta} {\partial w_i}, \qquad \frac {\partial \theta} {\partial w_i} = \frac {d\sigma} {dz}\frac {\partial z} {\partial w_i}$$
Derivative of the sigmoid:
$$\frac {d\sigma} {dz} = \frac {e^{-z}} {(1 + e^{-z})^2} = \left(\frac 1 \sigma - 1\right)\sigma^2 = \sigma(1 - \sigma)$$
Since ##z## is linear in the weights,
$$\frac {\partial z} {\partial w_i} = I_i$$
so
$$\frac {\partial \theta} {\partial w_i} = I_i\,\sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial w_i} = 2(\theta - E)\,I_i\,\sigma(1 - \sigma)$$
For the bias, ##\frac {\partial z} {\partial b} = 1##, so
$$\frac {\partial f} {\partial b} = 2(\theta - E)\frac {\partial \theta}{\partial b}, \qquad \frac {\partial \theta}{\partial b} = \frac {d\sigma}{dz}\frac {\partial z}{\partial b} = \sigma(1 - \sigma)$$
$$\frac {\partial f} {\partial b} = 2(\theta - E)\,\sigma(1 - \sigma)$$
Comparing the two results,
$$\frac {\partial f} {\partial w_i} = I_i\,\frac {\partial f} {\partial b}$$
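As a numerical sanity check of the formulas above (my own addition, not part of the original derivation; the helper names `sigmoid` and `neuron_gradient` and the example numbers are made up for illustration), here is a minimal Python/NumPy sketch that evaluates the neuron and its gradient:

```python
import numpy as np

def sigmoid(z):
    """Activation function: sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_gradient(inputs, weights, bias, expected):
    """Gradient of f(theta) = (theta - E)^2 for a single sigmoid neuron.

    Returns (df_dw, df_db), where
      df_db    = 2 (theta - E) sigma (1 - sigma)
      df_dw[i] = I_i * df_db
    """
    z = np.dot(inputs, weights) + bias      # z = I . w + b
    theta = sigmoid(z)                      # neuron output
    df_db = 2.0 * (theta - expected) * theta * (1.0 - theta)
    df_dw = inputs * df_db                  # df/dw_i = I_i * df/db
    return df_dw, df_db

# Tiny example with m = 3 inputs (arbitrary illustrative numbers).
I = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.2, -0.3])
b = 0.05
E = 1.0
print(neuron_gradient(I, w, b, E))
```

Comparing these values against a finite-difference approximation of ##f## is an easy way to confirm the derivation.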
 
  • #4
I'm still working on the problem; I want to add the learning rate parameter and show the equations for gradient descent. I'll be posting throughout the day. Feel free to post here to add to the discussion of the problem at hand.
 
  • #5
I am a little unsure whether I calculated the gradient properly. I'm confused about how to go about calculating the gradient of a function that depends on a vector of weights w and a scalar bias b. It seems a little messy, and I'm wondering if there is a conceptual flaw in the way I am doing it.

Any feedback is welcome, thank you!
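One way to make the bookkeeping less messy (a suggestion of mine, not something from the original post) is to fold the bias into the weight vector: append a constant 1 to the inputs and treat ##b## as one more weight, so the gradient becomes a single flat ##(m+1)##-vector. A minimal sketch, with made-up names:

```python
import numpy as np

# Fold the bias into the weight vector: append a constant 1 to the inputs
# so z = I . w + b becomes a single dot product x . p with p = (w_1, ..., w_m, b).
def gradient_packed(inputs, params, expected):
    x = np.append(inputs, 1.0)               # augmented input (I_1, ..., I_m, 1)
    theta = 1.0 / (1.0 + np.exp(-np.dot(x, params)))
    # The gradient is one flat (m + 1)-vector:
    # the first m entries are df/dw_i, the last entry is df/db.
    return 2.0 * (theta - expected) * theta * (1.0 - theta) * x

p = np.append(np.array([0.1, 0.2, -0.3]), 0.05)   # (w, b) packed together
grad = gradient_packed(np.array([0.5, -1.0, 2.0]), p, 1.0)
print(grad)
```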
 
  • #6
The following is the update rule for computing the new weights and bias recursively, using learning rate ##\eta##:
$$(w, b) := (w, b) - \eta\nabla f = (w, b) - \eta\left(\left(I_1\frac {\partial f} {\partial b}, I_2\frac {\partial f} {\partial b}, ..., I_m\frac {\partial f} {\partial b}\right), \frac {\partial f} {\partial b}\right)$$
Does this look correct?

Does anyone know how the learning rate is ideally chosen?
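To illustrate the update rule above concretely, here is a minimal self-contained Python sketch (not from the thread; the numbers and ##\eta = 0.5## are arbitrary illustrative choices, not recommended values) that applies the rule repeatedly to a single training example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary illustrative values; eta = 0.5 is just an example setting.
eta = 0.5
I = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.2, -0.3])
b = 0.05
E = 1.0

for step in range(1000):
    theta = sigmoid(np.dot(I, w) + b)
    df_db = 2.0 * (theta - E) * theta * (1.0 - theta)   # df/db
    df_dw = I * df_db                                   # df/dw_i = I_i * df/db
    w = w - eta * df_dw                                 # (w, b) := (w, b) - eta * grad f
    b = b - eta * df_db

print(sigmoid(np.dot(I, w) + b))   # the output drifts toward E = 1.0
```

In practice the learning rate is usually treated as a hyperparameter chosen by experiment rather than solved for in closed form.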
 

1. What exactly is AI Computation?

AI Computation (Artificial Intelligence Computation) refers to the use of computer algorithms and models to mimic human intelligence and perform tasks that typically require human cognition. It involves techniques such as machine learning, deep learning, and natural language processing to analyze and interpret data, make decisions, and solve complex problems.

2. How can I learn about AI Computation through self-study?

There are several resources available for self-study of AI Computation, including online courses, tutorials, books, and research papers. Some popular options include Coursera's "AI for Everyone" course, MIT's "Intro to Deep Learning" course, and the book "Artificial Intelligence: A Modern Approach" by Stuart Russell and Peter Norvig. It is also helpful to practice coding and implementing algorithms to gain a deeper understanding of the concepts.

3. What are the main applications of AI Computation?

AI Computation has a wide range of applications, including natural language processing, image and speech recognition, autonomous vehicles, virtual assistants, and predictive analytics. It is also used in various industries such as healthcare, finance, transportation, and education to improve efficiency and decision-making processes.

4. What are some important pointers to keep in mind while studying AI Computation?

Some important pointers for studying AI Computation include having a strong foundation in mathematics and statistics, understanding different machine learning algorithms and their applications, staying updated with the latest developments and research in the field, and continuously practicing and experimenting with coding and implementing AI models.

5. What are the ethical considerations surrounding AI Computation?

As AI Computation becomes more advanced and integrated into our daily lives, there are ethical concerns that need to be addressed, such as bias in algorithms, data privacy, and potential job displacement. It is important for individuals studying AI Computation to also be aware of these issues and consider the ethical implications of their work.
