
MUST the variation of the action be zero?

  1. Nov 4, 2009 #1
    Feynman's path integral is:

    [tex]\[
    \int {Dx\,e^{{\textstyle{i \over \hbar }}\int {L(x,\dot x,t)dt} } }
    \][/tex]

    where the Action is:

    [tex]\[
    \int {L(x,\dot x,t)dt}
    \][/tex]

    and the Lagrangian is [tex]L(x,\dot x,t)[/tex].

    Now we are told that when we functionally integrate the path integral in the region of far-flung paths, their contributions cancel against the contributions of other far-flung paths, leaving the classical path as the dominant contribution to the final result. This classical path is also obtained by setting the functional derivative of the Action to zero and getting the Euler-Lagrange equations of motion.

    My question is: if the functional derivative of the Action integral is not zero for any path, so that there is no classical path, is it still possible to evaluate the path integral at all? Or will everything cancel out? Thanks.
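
    Here is a toy numerical check of this cancellation picture (just a sketch: an ordinary 1D integral stands in for the path integral, and the phases and grid are made up for illustration):

    [code]
    import numpy as np

    # Stand-in for the path integral: a 1D oscillatory integral
    # I = int e^{i*phase(x)/hbar} dx over a finite window.
    # One phase has a stationary point (the analogue of a classical path);
    # the other has none, so its contributions cancel as hbar -> 0.
    x = np.linspace(-10.0, 10.0, 400001)
    dx = x[1] - x[0]

    def osc_integral(phase, hbar):
        return np.sum(np.exp(1j * phase / hbar)) * dx

    for hbar in (1.0, 0.1, 0.01):
        with_sp = osc_integral((x - 1.0)**2, hbar)  # stationary point at x = 1
        without = osc_integral(3.0 * x, hbar)       # no stationary point
        print(f"hbar={hbar}: |with|={abs(with_sp):.4f} |without|={abs(without):.6f}")

    # |with| tracks sqrt(pi*hbar); |without| falls off like hbar itself.
    [/code]

    So at least in this toy version, the integral without a stationary point does not become ill-defined; it just becomes small, of order hbar, compared to the stationary-phase contribution of order sqrt(hbar).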
     
  3. Nov 6, 2009 #2
    If you don't have a classical path, then it means you're still in the quantum domain; hence, apply QM or QFT tools.
     
  4. Nov 6, 2009 #3
    The path integral is equated to <x1|x2>, which is elsewhere also equated to the Dirac delta function δ(x1 - x2). So the path integral itself equates to a Dirac delta function:

    [tex]\[
    \langle x|x_0 \rangle \, = \, \delta (x - x_0 ) = \int {Dx\,e^{\frac{i}{\hbar }\int {L(x,\dot x,t)dt} } }
    \][/tex]

    Now it occurs to me that the variation of the action in the exponential might be required to be zero if the functional derivative of the whole path integral were required to be zero. Taking the functional derivative of the whole path integral may produce a factor that is the functional derivative of the action integral in the exponential. So if the functional derivative of the whole thing were required to be zero, then the action would be required to have zero variation. So the question for me is: what is the functional derivative of a Dirac delta function? Or, what is

    [tex]\[
    \frac{\delta }{{\delta (x)}}\,\delta (x - x_0 )
    \][/tex]

    My first instinct would be that it is 1. But when this is put in QM terms:

    [tex]\[
    \frac{\delta }{{\delta (x)}}\, < x|x_0 >
    \][/tex]

    Then perhaps it is zero if the states are eigenstates with a discrete spectrum, for then there is no small variation of quantum states. Does anyone have an idea what this variation of an inner product would mean, physically or mathematically? Does this sound like a reasonable physical requirement? Thanks.
     
    Last edited: Nov 6, 2009
  5. Nov 9, 2009 #4
    OK, how does this sound? If we integrate the path integral one more time, it should equal 1, since it's a Dirac delta function. Then the functional derivative of that should be zero, since the functional derivative of a constant is zero. This much is necessarily true. But since the path integral is already an infinite number of integrations, I'm not sure that integrating it one more time would really affect its formulation. And then taking the functional derivative of that would probably create a factor consisting of the variation of the action, which would be required to be zero since nothing else could be set to zero.

    The path integral written out more fully is:

    [tex]\[
    \mathop {\lim }\limits_{n \to \infty } \,\,\int_{ - \infty }^{ + \infty } {\int_{ - \infty }^{ + \infty } {...\int_{ - \infty }^{ + \infty } {\prod\nolimits_{j = 1}^n {dx_j \,\left( {\frac{m}{{2\pi i\hbar \varepsilon }}} \right)^{1/2} } \,\exp \left( {\frac{i}{\hbar }\,S[x\left( t \right)]} \right)} } }
    \][/tex]

    So if we integrate one more time, the product would go from [tex]\prod\nolimits_{j = 1}^n {}[/tex] to [tex]\prod\nolimits_{j = 1}^{n + 1} {}[/tex].

    But what real difference would this make since they both go to infinity anyway?
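
    For what it's worth, here is a numerical sketch of why one more slice should not matter. It uses the Euclidean (imaginary-time) version of the free-particle kernel for numerical stability; the Minkowski case is the same algebra with epsilon -> i*epsilon, and m = hbar = 1 is assumed:

    [code]
    import numpy as np

    # Time-sliced free-particle kernel in imaginary time (m = hbar = 1).
    # Composing n short-time kernels should give the same finite-time
    # kernel for any n, so adding one more slice changes nothing.
    def sliced_kernel(n, T=1.0, L=10.0, N=801):
        x = np.linspace(-L, L, N)
        dx = x[1] - x[0]
        eps = T / n
        # one-slice kernel K_eps(x_i, x_j), with the integration weight dx
        K = np.exp(-(x[:, None] - x[None, :])**2 / (2.0 * eps)) \
            * np.sqrt(1.0 / (2.0 * np.pi * eps)) * dx
        M = np.linalg.matrix_power(K, n) / dx  # drop the one leftover weight
        return M[N // 2, N // 2]               # K(0, 0; T)

    exact = np.sqrt(1.0 / (2.0 * np.pi))       # exact kernel at x = x0, T = 1
    for n in (2, 10, 50, 51):
        print(n, sliced_kernel(n))
    print("exact:", exact)
    [/code]

    The n = 50 and n = 51 rows agree with each other and with the exact kernel, so the extra integration is absorbed without changing the result.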

    In any event, I have seen where the functional derivative commutes with integration. So the functional derivative of the path integral will commute with all the integrals (all infinity of them) and pass inside to the exponential. So functional differentiation commutes with functional integration, right?

    Then I assume:

    [tex]\[
    \delta \left( {\exp \left( {\frac{i}{\hbar }S[x\left( t \right)]} \right)} \right) = \exp \left( {\frac{i}{\hbar }S\left[ {x\left( t \right)} \right]} \right) \cdot \left( {\frac{i}{\hbar }\delta S\left[ {x\left( t \right)} \right]} \right)
    \][/tex]

    And also that the only way this could be identically zero is if the variation of the action is zero. This might be where the principle of least action comes from.

    Some of this functional calculus doesn't seem to be very well developed in the literature I've seen. I'd appreciate it if those with more experience could check the accuracy of these assumptions. Thank you.
     
    Last edited: Nov 10, 2009
  6. Nov 13, 2009 #5
    Yes, this much is true, but is it actually useful? Does it apply to any functional whatsoever, or just to those that equal the Dirac delta function?

    I do find it interesting that here we have the possibility of discovering the mathematical origin of the least action principle.

    It's curious to note that the path integral can be developed from principle alone, starting from the identity operator:

    [tex]\[
    1 = \int_{ - \infty }^{ + \infty } {\left| {x_1 } \right\rangle \left\langle {x_1 } \right|dx_1 }
    \]
    [/tex]

    And when this is applied an infinite number of times we get:

    [tex]\[
    1 = \int_{ - \infty }^{ + \infty } {\int_{ - \infty }^{ + \infty } {\int_{ - \infty }^{ + \infty } {...\int_{ - \infty }^{ + \infty } {\left| {x_1 } \right\rangle \langle x_1 | x_2 \rangle \langle x_2 | x_3 \rangle \left\langle {x_3 } \right|...\left| {x_n } \right\rangle \left\langle {x_n } \right|\,dx_1 dx_2 dx_3 ...dx_n } } } }
    \][/tex]

    where we can recognize [tex]\langle x_i | x_j \rangle[/tex] as the inner product between orthonormal vectors. And when this identity is applied to a general inner product we get:

    [tex]\[
    \langle x_F | x_0 \rangle = \int_{ - \infty }^{ + \infty } {\int_{ - \infty }^{ + \infty } {\int_{ - \infty }^{ + \infty } {...\int_{ - \infty }^{ + \infty } {\langle x_F | x_1 \rangle \langle x_1 | x_2 \rangle \langle x_2 | x_3 \rangle \left\langle {x_3 } \right|...\left| {x_n } \right\rangle \langle x_n | x_0 \rangle \,dx_1 dx_2 dx_3 ...dx_n } } } }
    \][/tex]

    If the vectors have finite components, then [tex]\langle x_i | x_j \rangle[/tex] is the Kronecker delta, [tex]\delta _j^i[/tex]. But if the vectors have an infinite number of components, then the inner product is the Dirac delta function, [tex]\delta \left( {x_i - x_j } \right)[/tex].
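
    As a finite-dimensional sanity check of inserting the identity repeatedly (a sketch; the basis and the vectors are random, just for illustration):

    [code]
    import numpy as np

    # <u|w> = sum_{i,j} <u|e_i><e_i|e_j><e_j|w>, where <e_i|e_j> is the
    # Kronecker delta for an orthonormal basis {e_i}.
    rng = np.random.default_rng(0)
    d = 5
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # random orthonormal basis
    E = Q.T                                        # rows are the basis vectors
    u, w = rng.normal(size=d), rng.normal(size=d)

    direct = u @ w
    chained = sum((u @ E[i]) * (E[i] @ E[j]) * (E[j] @ w)
                  for i in range(d) for j in range(d))
    print(direct, chained)   # identical up to rounding
    [/code]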

    Then if we use the gaussian form of the delta function, [tex]\delta \left( {x - x_0 } \right) = \mathop {\lim }\limits_{\Delta \to 0} \frac{1}{{\left( {\pi \Delta ^2 } \right)^{1/2} }}e^{ - \left( {x - x_0 } \right)^2 /\Delta ^2 }[/tex], and then set [tex]\Delta = i^{1/2} \Delta t[/tex], the exponential becomes [tex]i\left( {\left( {x - x_0 } \right)/\Delta t} \right)^2[/tex], which can be recognized as the square of the velocity when t is considered the time variable. With this substitution for each of the [tex]\langle x_i | x_j \rangle[/tex], the above integral becomes:

    [tex]\[
    \int_{ - \infty }^{ + \infty } {...\int_{ - \infty }^{ + \infty } {\prod\nolimits_{i = 1}^n {dx_i } } } \,e^{ i\int_0^t {\dot x^2 \,dt} }
    \][/tex]

    where each of the inner products contributes infinitesimally to the integral in the exponent. And apart from the constant m/2 in the exponent, this is the Lagrangian for the kinetic energy of a free particle. This is very similar to the way I've seen the path integral derived in some textbooks.

    If we think more generally, we can substitute for x, or even x', in the exponent some other function or field. The requirement would be that these functions be square-integrable so that the exponent exists. This is all part of the definition of a Hilbert space of functions, which seem to be the only functions applicable. And then we are talking about quantum field theory, when functions replace x, etc. (not necessarily physical fields). Wouldn't it be interesting if ALL of physics could be derived from nothing more than the Identity? Perhaps there are some interesting philosophical implications to be gained from that.

    So it seems the path integral can be derived apart from physical concerns, and it may be that physics is just a part of these results.

    So one question in this context is if the least action principle can be required on principle alone as well. Or is this purely a requirement of physics?

    I've seen in the book Quantum Field Theory by Lowell S. Brown, page 13, that:

    [tex]\[
    \frac{1}{i}\frac{\delta }{{\delta f\left( t \right)}}\left\langle {q',t_2 |q'',t_1 } \right\rangle ^f = \int {[dq]\,e^{i\int_{t_1 }^{t_2 } {dt'L_0 } } \,\frac{1}{i}\frac{\delta }{{\delta f\left( t \right)}}\,e^{i\int_{t_1 }^{t_2 } {dt'\,f\left( {t'} \right)q\left( {t'} \right)} } } = \int {[dq]\,q\left( t \right)\,e^{i\int_{t_1 }^{t_2 } {dt'\,\left[ {L_0 + f\left( {t'} \right)q\left( {t'} \right)} \right]} } }
    \][/tex]

    But it's just given as is; they don't say how they derive this, or how the functional derivative commutes with integration and passes onto the exponential and then to the action. But it is as if to say,

    [tex]\[
    \frac{1}{i}\frac{\delta }{{\delta f\left( t \right)}}\int {[dq]} \,e^{i\int_{t_1 }^{t_2 } {dt'\,\left[ {L_0 + f\left( {t'} \right)q\left( {t'} \right)} \right]} } = \int {[dq]} \,\frac{1}{i}\frac{\delta }{{\delta f\left( t \right)}}\left( {e^{i\int_{t_1 }^{t_2 } {dt'\,\left[ {L_0 + f\left( {t'} \right)q\left( {t'} \right)} \right]} } } \right)
    \][/tex]

    [tex]\[
    = \int {[dq]} \,e^{i\int_{t_1 }^{t_2 } {dt'\,\left[ {L_0 + f\left( {t'} \right)q\left( {t'} \right)} \right]} } \,\frac{1}{i}\frac{\delta }{{\delta f\left( t \right)}}\left( {i\int_{t_1 }^{t_2 } {dt'\,\left[ {L_0 + f\left( {t'} \right)q\left( {t'} \right)} \right]} } \right)
    \][/tex]

    [tex]\[
    = \int {[dq]} \,e^{i\int_{t_1 }^{t_2 } {dt'\,\left[ {L_0 + f\left( {t'} \right)q\left( {t'} \right)} \right]} } \,\frac{\delta }{{\delta f\left( t \right)}}\int_{t_1 }^{t_2 } {dt'\,f\left( {t'} \right)q\left( {t'} \right)} = \int {[dq]} \,e^{i\int_{t_1 }^{t_2 } {dt'\,\left[ {L_0 + f\left( {t'} \right)q\left( {t'} \right)} \right]} } \,q\left( t \right)
    \][/tex]

    as I suspect is the case. So perhaps my suspicions are correct. Does anyone want to comment?
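
    Here is a discretized check of that last step (a sketch only; the grid and the test functions f and q are made up, and the functional derivative is approximated by a finite difference divided by the time step):

    [code]
    import numpy as np

    # W[f] = exp(i * sum_k f_k q_k * dt) is the discretized source term.
    # delta W / delta f(t_j), i.e. (dW/df_j)/dt, should equal i * q(t_j) * W.
    t = np.linspace(0.0, 1.0, 201)
    dt = t[1] - t[0]
    q = np.sin(2.0 * np.pi * t)
    f = np.cos(np.pi * t)

    def W(fvals):
        return np.exp(1j * np.sum(fvals * q) * dt)

    j, h = 50, 1e-6                    # probe at t_j = 0.25
    fp, fm = f.copy(), f.copy()
    fp[j] += h
    fm[j] -= h
    numeric = (W(fp) - W(fm)) / (2.0 * h * dt)
    analytic = 1j * q[j] * W(f)
    print(numeric)
    print(analytic)                    # the two agree to ~1e-7
    [/code]

    So the differentiated exponential does simply bring down the factor q(t), as the equations above assume.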

    It might be that physicists do not concern themselves too much with the details of functional calculus. They seem to only use it in passing to get the least action principle and move on from there. I feel that the question I raise here should be of interest to physicists. But perhaps this is more of a technical math question and should be moved to the appropriate forum. I'm not sure which forum would be most interested in functional derivatives and integrals.
     
    Last edited: Nov 13, 2009
  7. Nov 14, 2009 #6

    Fra

    Here are some random ramblings.

    It seems you partly search for a deeper understanding of things, but you seek a deduction of some of the laws of physics from deductive logic. Is that right?

    I can relate to your thinking; I seek something similar, but in a different way, where I replace the logical system with a more general one. I seek an inductive type of inference of the laws of physics from a more general (not generally deductive) logic.

    As I see it, it's a sort of parallel to your vision, but we use different logics. You seem to seek a closed, fixed logical system; mine is an open, evolving one.

    In my thinking, the "least action principle" is treated on the same basis as "the second law". So MaxEnt rules and least action rules are two special cases of a more general idea. Conceptually, I think this belongs in evolving inductive logic, rather than deductive.

    The connection is that the action taken by a system relates to its expectations of the future. And least action then suggests that each possible future has a "weight" according to its plausibility. The relation between least action and entropy is that least action is loosely a form of "state-relative entropy". This unifies the two principles.

    The weighted possibilities are quite analogous to a risk analysis, where the action takes the form of minimum speculation.

    A hint of this is described here
    http://en.wikipedia.org/wiki/Kullback–Leibler_divergence

    Although that is still a simplification; the full evolving idea is not incorporated there.

    So if only we can describe the expected information-encoding microstructure of a system, then I think its expected actions should follow from such an intrinsic principle of minimum information divergence, close to "principle alone".

    Minimum information divergence is in effect a principle of minimum speculation, or least action.

    /Fredrik
     
  8. Nov 14, 2009 #7
    This seems like the reasonable thing to do.

    Inductive logic comes from deductive logic. Deductive logic is concerned with whether a proposition is true or false. Inductive logic counts the relative number of samples in various sets. But you have to be able to acknowledge whether it is true or false that a particular sample is included in your set before you can count how many there are.



    If you don't know how your system of reasoning is evolving, then you can't say you have a valid system that will give you reliable answers. You might be using the "old" system which no longer applies. But if you do know how your system of reasoning is evolving, then this is because you know the rules by which it evolves. But then these rules by which it evolves are now more basic and fixed and serve as a rigid logic to be used.

    It's interesting to note that the Dirac delta function is also considered a probability distribution: it's the limit of a gaussian distribution, which represents completely random processes. In the development of the path integral (post #5) the inner product was a Dirac delta function. The inner product also describes a transition from one state to the next. This is a kind of implication, where the previous state necessarily results in the next state. This may also be a hint that deductive logic can be used to derive the path integral.

    So perhaps this structure of physics ( the path integral) from completely random processes (the Dirac delta gaussian distribution) is the evolving law that you're looking for.
     
    Last edited: Nov 14, 2009
  9. Nov 14, 2009 #8
    This somewhat confuses me, because I thought that any variation of the action integral was considered to be identically zero. And this seems to be what he's doing. He then goes on to iterate this process in order to form the definition of time-ordered products. How can he do this?

    Is the general variation, [tex]\delta S[/tex], somehow different from a specific variation with respect to a function, [tex]\frac{\delta }{{\delta f\left( t \right)}}S[/tex], such that the first is zero when the second is not zero?
     
    Last edited: Nov 15, 2009
  10. Nov 16, 2009 #9
    Yes, this seems to be the case because what I found is that if

    [tex]\[
    S\left( {x,y\left( x \right),z\left( x \right)} \right) = \int_a^b {L\left( {x,y\left( x \right),z\left( x \right)} \right)} \,dx
    \]
    [/tex]

    then

    [tex]\[
    \delta S = \int_a^b {\left( {\frac{{\partial L}}{{\partial y}}\delta y + \frac{{\partial L}}{{\partial z}}\delta z} \right)\,dx}
    \]
    [/tex]

    but

    [tex]\[
    \frac{{\delta S}}{{\delta z}} = \frac{{\partial L}}{{\partial z}}
    \]
    [/tex]

    So there are extra terms in [tex]\delta S[/tex] that are not in [tex]\frac{{\delta S}}{{\delta z}}[/tex], which can be used to make the first zero when the second is not.

    This makes me wonder: when we start adding fields to the Lagrangian to account for other kinds of particles and interactions, is there any care taken to make sure that the whole variation of the action remains zero? From what I've seen in my limited experience, terms are added to the Lagrangian in a somewhat ad hoc way to account for extra fields and self-interactions. I don't remember seeing any procedure where they take the time to check. Perhaps the coupling constants or the exponents of the fields are restricted in such a procedure. As just a wild guess, maybe this is where the cosmological constant problem comes in.
     
    Last edited: Nov 16, 2009
  11. Nov 20, 2009 #10
    As a brief introduction for purposes of context, functional derivatives are developed as follows:

    Let [tex]S\left[ {x,y\left( x \right),z\left( x \right)} \right][/tex] be a functional of the functions y(x) and z(x).

    Then this functional, S, can be expanded in a Taylor series:

    [tex]\[
    S\left[ {x ,y_0(x) + h_y ,z_0(x) + h_z } \right] = \,S\left[ {x,y_0 ,z_0 } \right] + \frac{{\partial S}}{{\partial y}}|_{y_0 ,z_0 } h_y + \frac{{\partial S}}{{\partial z}}|_{y_0 ,z_0 } h_z + \frac{1}{{2!}}\left( {\frac{{\partial ^2 S}}{{\partial y^2 }}|_{y_0 ,z_0 } h_y ^2 + 2\frac{{\partial ^2 S}}{{\partial y\partial z}}|_{y_0 ,z_0 } h_y h_z + \frac{{\partial ^2 S}}{{\partial z^2 }}|_{y_0 ,z_0 } h_z ^2 } \right) + ...
    \]
    [/tex]

    Or the finite difference is

    [tex]\[
    \Delta S = S\left[ {x,y + h_y ,z + h_z } \right] - S\left[ {x,y,z} \right] = \frac{{\partial S}}{{\partial y}}h_y + \frac{{\partial S}}{{\partial z}}h_z + \frac{1}{{2!}}\left( {\frac{{\partial ^2 S}}{{\partial y^2 }}h_y ^2 + 2\frac{{\partial ^2 S}}{{\partial y\partial z}}h_y h_z + \frac{{\partial ^2 S}}{{\partial z^2 }}h_z ^2 } \right) + ...
    \]
    [/tex]

    where the partial derivatives are understood to be evaluated at the functions [tex]y_0(x), z_0(x)[/tex].

    We can change notation so that [tex]h_y = \delta y[/tex] and [tex]h_z = \delta z[/tex] and get

    [tex]\[
    \Delta S = S\left[ {x,y + \delta y,z + \delta z} \right] - S\left[ {x,y,z} \right] = \frac{{\partial S}}{{\partial y}}\delta y + \frac{{\partial S}}{{\partial z}}\delta z + \frac{1}{{2!}}\left( {\frac{{\partial ^2 S}}{{\partial y^2 }}\left( {\delta y} \right)^2 + 2\frac{{\partial ^2 S}}{{\partial y\partial z}}\delta y\delta z + \frac{{\partial ^2 S}}{{\partial z^2 }}\left( {\delta z} \right)^2 } \right) + ...
    \]
    [/tex]

    The first variation of S, labeled [tex]\delta S[/tex], which is called the first functional derivative of S, is the linear part of the Taylor expansion, or

    [tex]\[
    \delta S = \frac{{\partial S}}{{\partial y}}\delta y + \frac{{\partial S}}{{\partial z}}\delta z
    \]
    [/tex]

    and the term of second order in [tex]\delta y[/tex] and [tex]\delta z[/tex] is called the second variation, or second functional derivative, or

    [tex]\[
    \delta ^2 S = \frac{{\partial ^2 S}}{{\partial y^2 }}\left( {\delta y} \right)^2 + 2\frac{{\partial ^2 S}}{{\partial y\partial z}}\delta y\delta z + \frac{{\partial ^2 S}}{{\partial z^2 }}\left( {\delta z} \right)^2
    \]
    [/tex]

    and similarly for higher-order variations, [tex]\delta ^3 S,\,\delta ^4 S,\,...[/tex]

    And with this notation,

    [tex]\[
    \Delta S = \delta S + \frac{1}{{2!}}\delta ^2 S + \frac{1}{{3!}}\delta ^3 S + ...
    \]
    [/tex]
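
    A quick symbolic check of this expansion, using an ordinary function S(y, z) as a stand-in for the functional (a sketch only; the particular polynomial is arbitrary):

    [code]
    import sympy as sp

    y, z, hy, hz = sp.symbols('y z h_y h_z')
    S = y**3 * z + y * z**2                     # arbitrary "functional"

    dS = sp.diff(S, y)*hy + sp.diff(S, z)*hz    # first variation
    d2S = (sp.diff(S, y, 2)*hy**2               # second variation, with the
           + 2*sp.diff(S, y, z)*hy*hz           # mixed second derivative
           + sp.diff(S, z, 2)*hz**2)

    DeltaS = sp.expand(S.subs({y: y + hy, z: z + hz}, simultaneous=True) - S)
    remainder = sp.expand(DeltaS - dS - d2S/2)

    # every surviving term is third order or higher in (h_y, h_z):
    print(min(sum(m) for m in sp.Poly(remainder, hy, hz).monoms()))   # -> 3
    [/code]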


    What is [tex]\delta \left( {\delta S} \right)[/tex]?

    [tex]\[
    \delta \left( {\delta S} \right) = \delta \left( {\frac{{\partial S}}{{\partial y}}\delta y + \frac{{\partial S}}{{\partial z}}\delta z} \right) = \frac{\partial }{{\partial y}}\left( {\frac{{\partial S}}{{\partial y}}\delta y + \frac{{\partial S}}{{\partial z}}\delta z} \right)\delta y + \frac{\partial }{{\partial z}}\left( {\frac{{\partial S}}{{\partial y}}\delta y + \frac{{\partial S}}{{\partial z}}\delta z} \right)\delta z
    \]
    [/tex]

    [tex]\[
    = \left( {\frac{{\partial ^2 S}}{{\partial y^2 }}\delta y + \frac{{\partial S}}{{\partial y}}\frac{\partial }{{\partial y}}\left( {\delta y} \right) + \frac{{\partial ^2 S}}{{\partial y\partial z}}\delta z + \frac{{\partial S}}{{\partial z}}\frac{\partial }{{\partial y}}\left( {\delta z} \right)} \right)\delta y + \left( {\frac{{\partial ^2 S}}{{\partial z\partial y}}\delta y + \frac{{\partial S}}{{\partial y}}\frac{\partial }{{\partial z}}\left( {\delta y} \right) + \frac{{\partial ^2 S}}{{\partial z^2 }}\delta z + \frac{{\partial S}}{{\partial z}}\frac{\partial }{{\partial z}}\left( {\delta z} \right)} \right)\delta z
    \]
    [/tex]

    But y(x) and z(x) do not depend on one another, so terms like [tex]{\frac{\partial }{{\partial y}}\left( {\delta z} \right)}[/tex] are obviously zero.

    And with terms like [tex]{\frac{\partial }{{\partial y}}\left( {\delta y} \right)}[/tex], we have that [tex]{\delta y}[/tex] is a function of x and is independent of y, so they go to zero as well. What is left is

    [tex]\[
    \delta \left( {\delta S} \right) = \frac{{\partial ^2 S}}{{\partial y^2 }}\left( {\delta y} \right)^2 + \frac{{\partial ^2 S}}{{\partial y\partial z}}\delta y\delta z + \frac{{\partial ^2 S}}{{\partial z\partial y}}\delta y\delta z + \frac{{\partial ^2 S}}{{\partial z^2 }}\left( {\delta z} \right)^2 = \frac{{\partial ^2 S}}{{\partial y^2 }}\left( {\delta y} \right)^2 + 2\frac{{\partial ^2 S}}{{\partial y\partial z}}\delta y\delta z + \frac{{\partial ^2 S}}{{\partial z^2 }}\left( {\delta z} \right)^2
    \]
    [/tex]

    But this last part is exactly how [tex]\delta ^2 S[/tex] was defined. So we have

    [tex]\[
    \delta \left( {\delta S} \right) = \delta ^2 S
    \]
    [/tex]

    Now if,

    [tex]\[
    S\left[ {x,y\left( x \right),z\left( x \right)} \right] = dF\left[ {x,y\left( x \right),z\left( x \right)} \right]
    \]
    [/tex]

    then,

    [tex]\[
    \Delta S\left[ {x,y\left( x \right),z\left( x \right)} \right] = S\left[ {x,y + \delta y,z + \delta z} \right] - S\left[ {x,y,z} \right] = \Delta dF\left[ {x,y\left( x \right),z\left( x \right)} \right]
    \]
    [/tex]

    [tex]\[
    = dF\left[ {x,y + \delta y,z + \delta z} \right] - dF\left[ {x,y\left( x \right),z\left( x \right)} \right] = d\left( {F\left[ {x,y + \delta y,z + \delta z} \right] - F\left[ {x,y\left( x \right),z\left( x \right)} \right]} \right) = d\Delta F\left[ {x,y\left( x \right),z\left( x \right)} \right]
    \]
    [/tex]

    So that,

    [tex]\[
    \Delta d = d\Delta
    \]
    [/tex]

    That is, [tex]\Delta[/tex] commutes with [tex]d[/tex].

    Then,

    [tex]\[
    \Delta dF = \delta \left( {dF} \right) + \frac{1}{{2!}}\delta ^2 \left( {dF} \right) + \frac{1}{{3!}}\delta ^3 \left( {dF} \right) + ... = d\left( {\delta F + \frac{1}{{2!}}\delta ^2 F + \frac{1}{{3!}}\delta ^3 F + ...} \right) = d\delta F + \frac{1}{{2!}}d\delta ^2 F + ...
    \]
    [/tex]

    And keeping only the first-order (linear) terms on both sides of the equation,

    [tex]\[
    \delta d = d\delta
    \]
    [/tex]

    That is, [tex]\delta[/tex] commutes with [tex]d[/tex].

    And if

    [tex]\[
    S\left[ {x,y\left( x \right),z\left( x \right)} \right] = \int_a^b {L\left[ {x,y\left( x \right),z\left( x \right)} \right]} \,dx
    \]
    [/tex]

    Then

    [tex]\[
    \Delta S = S\left[ {x,y + \delta y,z + \delta z} \right] - S\left[ {x,y,z} \right] = \int_a^b {L\left[ {x,y + \delta y,z + \delta z} \right]} \,dx - \int_a^b {L\left[ {x,y,z} \right]} \,dx = \Delta \int_a^b {L\left[ {x,y,z} \right]} \,dx
    \]
    [/tex]

    [tex]\[
    = \int_a^b {L\left[ {x,y + \delta y,z + \delta z} \right] - L\left[ {x,y,z} \right]} \,dx = \int_a^b {\Delta L\left[ {x,y,z} \right]} \,dx
    \]
    [/tex]

    So we see here that [tex]\Delta[/tex] commutes with [tex]\int {}[/tex].

    But also,

    [tex]\[
    \Delta S = \delta S + \frac{1}{{2!}}\delta ^2 S + ... = \delta \left( {\int_a^b L \,dx} \right) + \frac{1}{{2!}}\delta ^2 \left( {\int_a^b L \,dx} \right) + ...
    \]
    [/tex]

    [tex]\[
    = \int_a^b {\Delta L} \,dx = \int_a^b {\delta L + \frac{1}{{2!}}\delta ^2 L + ...} \,dx = \int_a^b {\delta L\,dx + \frac{1}{{2!}}\int_a^b {\delta ^2 L\,dx + ...} }
    \]
    [/tex]

    Then, keeping the first-order (linear) terms on each side of the equation, we see

    [tex]\[
    \delta \int_a^b L \,dx = \int_a^b {\delta L\,dx}
    \]
    [/tex]

    Or, [tex]\delta[/tex] commutes with [tex]\int {}[/tex].

    Many texts use the integral definition of a functional, [tex]S = \int {L\,dx}[/tex], in its development. I suppose they do this because it makes it easier to justify keeping only the linear terms inside the integral, since differentials approach zero more naturally in the process of integration.

    It's easy to see that variation commutes with any number of integral signs, since a difference outside the integrals is translated to a difference inside the integral signs. Or,

    [tex]\[
    \Delta \int {\int {...\int {L\left[ {x,y,z} \right]} } } dx_1 dx_2 ...dx_n = \Delta \int {d[x]} L\left[ {x,y,z} \right]
    \]
    [/tex]

    [tex]\[
    = \int {d[x]} L\left[ {x,y + \delta y,z + \delta z} \right] - \int {d[x]} L\left[ {x,y,z} \right] = \int {d[x]\left( {L\left[ {x,y + \delta y,z + \delta z} \right] - L\left[ {x,y,z} \right]} \right)} = \int {d[x]\Delta L\left[ {x,y,z} \right]}
    \]
    [/tex]

    So following a similar procedure of keeping only linear terms,

    [tex]\[
    \delta \int {d[x]\,L = \int {d[x]\delta L} }
    \]
    [/tex]

    This means that the variation of the integration of the path integral gets passed inside all the infinite number of integrations to taking the variation of the exponent of the action.
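
    To connect this back to the original question, here is a numerical sketch showing that the first variation of a discretized action vanishes on the classical path and not on a perturbed one (free particle, m = 1, endpoints held fixed; all values made up for illustration):

    [code]
    import numpy as np

    # Discretized action S = sum (1/2)((x_{i+1} - x_i)/eps)^2 * eps
    # for a free particle with fixed endpoints x(0) = 0, x(T) = 1.
    n, T = 200, 1.0
    t = np.linspace(0.0, T, n + 1)
    eps = t[1] - t[0]

    def action(x):
        return 0.5 * np.sum(((x[1:] - x[:-1]) / eps)**2) * eps

    def first_variation(x):
        # dS/dx_i at the interior points (endpoints held fixed)
        return (2.0 * x[1:-1] - x[:-2] - x[2:]) / eps

    x_cl = t / T                                # straight line: classical path
    x_alt = x_cl + 0.1 * np.sin(np.pi * t / T)  # same endpoints, non-classical

    print(np.max(np.abs(first_variation(x_cl))))   # ~0: delta S vanishes
    print(np.max(np.abs(first_variation(x_alt))))  # clearly nonzero
    print(action(x_cl), action(x_alt))             # and S is smaller on x_cl
    [/code]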
     
    Last edited: Nov 20, 2009
  12. Nov 22, 2009 #11
    So what does it mean if this is not true? If the first variation of the integration of the path integral is not zero, then what does this mean? Feel free to offer your thoughts, but I think it means that the path integral is not a Dirac delta function, not a wavefunction or a transition function [tex]\langle x|x_0\rangle[/tex]. And if the path integral as a whole is not a delta function, then I think this means that one or more of the infinite number of [tex]\langle x_i|x_j\rangle[/tex] inside the path integral is not a delta function. Or at least it means that the infinite product of the [tex]\langle x_i|x_j\rangle[/tex] is not an infinite product of delta functions. And since the delta functions [tex]\langle x_i|x_j\rangle[/tex] are expressed in terms of exponentials whose exponents add up to the action integral of one exponential, at least this means the action-type integral is not adding up right. So I can see how this would place some restrictions on the action integral, but I'm not sure what.
     
  13. Nov 22, 2009 #12
    Is there a reason why you're replying to your own posts?
    Just curious, that's all...
     
  14. Nov 23, 2009 #13

    Fra

    Friend, I didn't respond because I felt I would just be repeating arguments I've made before, and that doesn't convey anything.

    Since you like to play around with formulas, I'll just throw in random comments/critique instead regarding your elaborations.

    1. Your connection of the Dirac delta and gaussian probability is a bit ambiguous. Although the delta distribution can be defined by a construction involving the limit of a gaussian distribution, this choice of construction is not unique. That is, you can find other families of functions that are not gaussian that are ALSO equally valid for defining the Dirac distribution.

    So if we are really talking about a proper distribution, the Dirac delta has, IMO, not much specifically to do with gaussian distributions.

    2. OTOH, if you think the gaussian really is the key, then the actual limiting process must contain physics. Which means we cannot take the limit; the running of the n is something that has physical significance. Do you agree? (If so, this is something that I am trying to do as well.)

    3. Also, the choice of decomposition or partitioning when working with an identity operator is ambiguous. Similarly, I argue that this choice and the size of the partitioning index have physical significance. How about that?

    4. I won't repeat any old arguments, but you think of induction in terms of counting sets. OK, if we take this view, then what I suggest prevents a deductive scheme is that these sets are not given; they are changing. In particular, the SIZE of the sets can change. The size of the set represents the span of the counting index.

    I don't think we can prove that our system is valid in a meaningful way. My point is that we don't need to do this. Nature is, as far as I can see, not an axiomatic construction.

    The old system is updated in the light of new input, as per a rationality rule. But this is not predictable in advance simply because the new input hasn't arrived yet. The feedback drives the selection and evolution.

    Anyway, I think it would be interesting if you would consider the PHYSICS of the limiting procedure, and motivate it from there, rather than choosing at will a possible (but not unique) way of defining the Dirac delta.

    /Fredrik
     
  15. Nov 23, 2009 #14
    It is a work in progress for me to understand these things. So I post what I think I understand as I come to know it so that someone can correct me if I'm wrong. That would be very much appreciated.
     
  16. Nov 23, 2009 #15
    Right, there are non-gaussian ways of describing the Dirac delta function. So why a gaussian, you ask? The gaussian distribution represents a totally random process. It's also called a bell curve or normal distribution. People in quality control use it to show that there is nothing within their control that would let them maintain a tighter tolerance in their manufacturing processes. So the question is what, in the development of the path integral, would indicate the use of the gaussian form of the Dirac delta function.

    The Dirac delta function is used to describe the inner product of two states. But the path integral is constructed in such a way that the inner product of every two states out of an infinite number of states is used. And it is constructed so that it comes out the same no matter how you label the states or in what order you take the products. So every inner product must have the same weight as every other, and it doesn't matter how you choose them. In other words, there is a completely random process involved with the inner product, which is also described by the delta function. I believe that means we have to choose the gaussian form of the Dirac delta function.

    But you may say that a random process of choosing an inner product does not mean that there's a random process in the product itself. Perhaps, but then again, using the gaussian form of the inner product indicates that there is no further structure within the process of taking the inner product. If the best we can do in the development is to say that there is an inner product, but we inherently cannot possibly find any more structural information than that, then I think that dictates the use of the gaussian form of the delta function. The gaussian form should be the default choice unless further structure can be discerned. But then again, this added mathematical structure would have some spacetime dependence, those being the only variables at that level. And this would probably create different weights for different inner products and ruin the covariant nature of the formulation. So I'm seeing here a glimpse of a connection between covariance, no hidden variables, and the gaussian form of the Dirac delta function for the inner product.

    I think the limiting process only indicates that a continuum is involved.

    Yes, that's why I prefer to define the path integral in terms of the recursive relation of the Dirac delta function and not the recursive relation of the identity operator. Because then, as you suggest, one has to justify the use of a decomposition into orthonormal vectors. Although maybe it's just as obvious to some that the Dirac delta is also described by the inner product of orthonormal vectors. But in my mind that takes a little more work.

    I don't know; you may have to take into account every sample (or state) in the space equally, so that it may not matter which set it was in.

    Then what you are talking about is not deriving physics on principle alone, be it deductive or inductive, or whatever. What you are talking about is just using a trial-and-error method to find the best curve that fits the data: a curve-fitting exercise. You're trying to discern fundamental physics by analyzing the underlying structure of the curves that fit the data. But that may change with better data. Obviously, what underlies any curve or equation is logic and counting.
     
    Last edited: Nov 23, 2009
  17. Nov 23, 2009 #16

    Fra

    I'll pass on arguing the details of my own thinking, since it's clear enough that we have different strategies and we don't seem to connect. But some convergence points might exist, which is why feedback from a different direction might still be useful, FWIW.

    Maybe I was unclear; my point was that your line of reasoning towards a deductive system is not itself deductive (and I think you want to make deductive progress?). I see this as an inconsistency. I mean, if your aim is a logical axiomatic system, the road to it can hardly be a deductive process, right? Maybe this doesn't bother you, but it looks weird from my perspective:

    Going from distributions, to a logically possible (but not unique) choice of families of functions that can be used to define the distribution as a limit, making further arguments based on that choice, and then taking the limit again to get back to distributions, seems like an ambiguous argument to me. That was my main point.

    My point is not that the connection to statistics is uninteresting, but I see no deductive path to such a connection.

    Also, not all random variables have gaussian distributions; rather, the distribution of sample means converges to a gaussian as the sample size goes to infinity, which is a difference. This, I think, calls for clarification too. It seems to suggest the event space itself is emergent from an averaging process?

    This argument is not clear to me. In your deductive view, how do you introduce the notion of an inner product in the first place, and what is its physical significance? And how does this follow from logic?

    This thing with inner products and transition probabilities is usually part of what comes with the QM axioms or postulates, such as the projection and Born postulates. But I thought you were going to find an alternative deduction.

    If we take a step back, what's the starting point in your quest? Are you taking some structure of QM for granted? If so, which postulates or axioms are you trying to explain from the others?

    /Fredrik
     
  18. Nov 23, 2009 #17

    Fra

    A short comment.

    There is a substantial difference between simple curve fitting and adaptive learning, of which biological evolution and human science are good examples. You simply cannot characterize biological evolution in any sensible way as curve fitting. It's your characterization of what I tried to say as basic curve fitting that most clearly illustrates how totally I have failed to convey much.

    /Fredrik
     
  19. Nov 23, 2009 #18
    You might remember from a previous post that by using the identity an infinite number of times one can get,

    [tex]\[\langle x_F | x_0 \rangle = \int_{ - \infty }^{ + \infty } {\int_{ - \infty }^{ + \infty } {\int_{ - \infty }^{ + \infty } {...\int_{ - \infty }^{ + \infty } {\langle x_F | x_1 \rangle \langle x_1 | x_2 \rangle \langle x_2 | x_3 \rangle \left\langle {x_3 } \right|...\left| {x_n } \right\rangle \langle x_n | x_0 \rangle \,dx_1 dx_2 dx_3 ...dx_n } } } } \][/tex]

    and then that [tex]\langle x|x_0 \rangle \, = \,\delta (x - x_0 )[/tex]. This much you may recognize from textbooks.

    But it's easy to show from the recursive property of the Dirac delta function that you can get the same result.

    You might remember that,

    [tex]\[
    \int_{ - \infty }^{ + \infty } {f(x_1 )\,\delta (x_1 - x_0 )\,dx_1 } = f(x_0 )
    \][/tex]

    So that if [tex]f(x_1 ) = \delta (x - x_1 )[/tex], then

    [tex]\[
    \int_{ - \infty }^{ + \infty } {\delta (x - x_1 )\,\delta (x_1 - x_0 )\,dx_1 } = \delta (x - x_0 )
    \]
    [/tex]
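
    This recursion survives regularizing each delta as a gaussian of width w, which is easy to check numerically (a sketch; the width and evaluation points are arbitrary):

    [code]
    import numpy as np

    # delta_w(u) = exp(-u^2/w^2) / (w*sqrt(pi)), the gaussian form used above.
    # Convolving two of them gives delta_{w*sqrt(2)}: still a delta family,
    # just with the widths added in quadrature.
    x1 = np.linspace(-10.0, 10.0, 20001)
    dx = x1[1] - x1[0]

    def delta_w(u, w):
        return np.exp(-u**2 / w**2) / (w * np.sqrt(np.pi))

    w, x, x0 = 0.3, 0.7, -0.2
    lhs = np.sum(delta_w(x - x1, w) * delta_w(x1 - x0, w)) * dx
    rhs = delta_w(x - x0, w * np.sqrt(2.0))
    print(lhs, rhs)   # equal; both tend to delta(x - x0) as w -> 0
    [/code]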

    So that this process could be iterated an infinite number of times to get,

    [tex]\[
    \int_{ - \infty }^{ + \infty } {\int_{ - \infty }^{ + \infty } {...\int_{ - \infty }^{ + \infty } {\delta (x - x_n )\,\delta (x_n - x_{n - 1} )\,...\,\delta (x_1 - x_0 )\,dx_n dx_{n - 1} ...dx_1 } } } = \delta (x - x_0 )
    \]
    [/tex]

    This is my starting point. But I wanted to show the similarity with the traditional path integral construction. You might remember how Ariel Caticha used the inner product [tex]\langle x|x_0\rangle[/tex] to represent implication. But I've seen more directly how implication is represented by the Dirac delta function.

    In any event, the next step is to replace the Dirac delta with a function. Most QM books go through a derivation using the Schroedinger equation. I make the replacement directly, which is what you seem to be having trouble with. Wouldn't it be advantageous to derive the path integral from first principles and then derive Schroedinger's equation from that, instead of only the other way around? And don't you find it the least bit curious that replacing the delta with a gaussian gives you something that looks like QM?

    Yes, maybe I need to work on justifying the choice of a gaussian for the delta a bit more. Does it make sense to say that if choosing a particular inner product (or Dirac delta) must be a completely random process, then the inner product must itself represent a random process (the gaussian delta function)? The inner product is a combination of two things. So this is like asking whether, if taking a combination of any two is a completely random process, that combination itself represents a completely random process.

    But the main question of this thread is whether, given a path integral, the least action principle follows just by taking the first variation of the integration of the path integral. I was trying to justify Lagrangian mechanics from first principles. This is where the canonical version of QM comes from, right?
     
    Last edited: Nov 23, 2009
  20. Nov 24, 2009 #19

    Fra

    Yes, this is from the definition of the delta distribution.
    But this would need clarification, I think. The Dirac delta is technically not a function; it's a distribution (http://en.wikipedia.org/wiki/Distribution_(mathematics)). It operates on test functions, and here you use a distribution as a test function, which needs clarification to be well defined.

    More later, when I get more time to type.

    /Fredrik
     
    Last edited by a moderator: May 4, 2017
  21. Nov 24, 2009 #20

    Fra

    OK, setting aside some technical details, which are not the biggest issues here anyway: I don't see the physics in this starting point.

    Suppose we accept what you write as a kind of identity of your choice; then what predictive power does it yield?

    I read Ariel's arXiv papers on his attempt to "derive" the Born postulate from a number of premises, and I think that's what you mean. I like some of Ariel's things, but overall I have a similar critique of him as of you.

    His premises certainly do not follow deductively from an obvious starting point. They are possible but not unique starting points, and by no means obviously plausible. If I remember correctly, one key choice he makes is to introduce a complex number.

    To me, his choices are motivated by what he is trying to prove. So it's a possible reconstruction, but that IMO is no more plausible than the normal axiomatic introduction of QM.

    I think I need to understand your overall strategy before I can comment on details, to be constructive.

    (a) The QM formalism has several parts. First there is the general structure: Hilbert spaces, operators, Hamiltonians, projection postulates, etc., and how that supposedly connects to the notion of experimental probability.

    (b) The other part is the choice of Hamiltonian for a given system.

    Which part are you trying to address and "replace from logic"?

    If you can infer the Hamiltonian for a system from some constraining principles of the formalism, yes, that would definitely be interesting. But then I think the constraining principles must be unique. Consistency isn't enough if you claim that you are deducing this. Otherwise it's like a reconstruction, designed to prove something we already "know".

    I can appreciate your reflections but my personal feedback is that so far I find the reasoning to be weak.

    ... more later...

    /Fredrik
     