Understanding the Cost Function in Machine Learning: A Practical Guide

Summary: The discussion centers on differentiating a cost function in a neural-network context, specifically the loss function ##L = wE##, where ##E = (G - G_{est})^2## and ##G = F'F##. The original poster is struggling to derive the derivative of the loss with respect to ##F##, which the paper states is proportional to ##F'(G - G_{est})##. Respondents ask for clarification of the notation, in particular what the primes mean and why ##G - G## would not simply be zero, and note that more context is needed to pin down the mathematical framework. A specific paper is referenced for further detail.
emmasaunders12
Could someone please help me work through the differentiation in a paper (not homework)? I am having trouble figuring out how they came up with their cost function.

The loss function is ##L = wE##, where ##E = (G - G_{est})^2## and ##G = F'F##.

The derivative of the loss function with respect to ##F## is stated to be proportional to ##F'(G - G_{est})##.

I can't seem to figure it out.

Thanks

Emma
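For what it's worth, here is how the differential works out under one reading of the notation (a sketch only, assuming ##'## denotes transpose and ##G_{est}## is symmetric, so that ##E## is the squared Frobenius norm ##\operatorname{tr}\big((G - G_{est})'(G - G_{est})\big)##):

$$
dE = 2\operatorname{tr}\!\big((G - G_{est})\,d(F'F)\big)
   = 2\operatorname{tr}\!\big((G - G_{est})(dF'\,F + F'\,dF)\big),
$$

and collecting the coefficient of ##dF## gives

$$
\frac{\partial E}{\partial F} = 4\,F\,(G - G_{est}),
$$

i.e. proportional to ##F(G - G_{est})##. The paper's ##F'(G - G_{est})## may reflect a different convention for ##G## (e.g. ##FF'##) or for the gradient layout; without the paper's exact definitions this derivation is only one plausible reading.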
 
I'm having some trouble understanding you:

Do all functions depend on, say, time ##t##, to which the primes refer? And why isn't ##G - G = 0##? I first thought it could be strange notation for a function, but then you defined a single ##G##. And lastly, could it be ##L \propto F(G-G)'##?
 
fresh_42 said:
I'm having some trouble understanding you:

Do all functions depend on, say, time ##t##, to which the primes refer? And why isn't ##G - G = 0##? I first thought it could be strange notation for a function, but then you defined a single ##G##. And lastly, could it be ##L \propto F(G-G)'##?

Thanks for the response. It's the loss function of a neural network, so I've corrected the notation to ##G## and ##G_{est}##; the primes refer to transpose.
 
emmasaunders12 said:
Thanks for the response. It's the loss function of a neural network, so I've corrected the notation to ##G## and ##G_{est}##; the primes refer to transpose.
Perhaps someone else can help, but without a lot more context I have no idea what mathematically we are dealing with here.
 
PeroK said:
Perhaps someone else can help, but without a lot more context I have no idea what mathematically we are dealing with here.

The specific problem is described on page 4 here: https://arxiv.org/pdf/1505.07376v3.pdf
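If it helps, one way to sanity-check a matrix-calculus identity like this is a finite-difference test. Below is a quick NumPy sketch under one reading of the notation (the matrix shapes and the symmetric ##G_{est}## are my assumptions, not necessarily the paper's): it compares the analytic gradient ##4F(G - G_{est})## of ##E = \lVert F'F - G_{est}\rVert_F^2## against a numerical gradient.

```python
import numpy as np

# Finite-difference check of a matrix-calculus identity (a sketch, not the
# paper's exact setup): E = ||F'F - Gest||_F^2 with ' meaning transpose.
# For symmetric Gest, the analytic gradient dE/dF is 4 F (F'F - Gest).

rng = np.random.default_rng(0)
F = rng.standard_normal((5, 3))          # shapes chosen for illustration
Gest = rng.standard_normal((3, 3))
Gest = (Gest + Gest.T) / 2               # symmetric target, as assumed above

def E(F):
    D = F.T @ F - Gest
    return np.sum(D * D)                 # squared Frobenius norm

analytic = 4 * F @ (F.T @ F - Gest)

# Central finite differences, entry by entry.
eps = 1e-6
numeric = np.zeros_like(F)
for i in range(F.shape[0]):
    for j in range(F.shape[1]):
        Fp, Fm = F.copy(), F.copy()
        Fp[i, j] += eps
        Fm[i, j] -= eps
        numeric[i, j] = (E(Fp) - E(Fm)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # expect a very small number
```

If the printed discrepancy is near machine precision, the analytic form is correct for this reading; a large discrepancy would suggest the paper uses a different convention (e.g. ##G = FF'## or a transposed gradient layout).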
 