Applying a constraint in the calculus of variations


Discussion Overview

The discussion revolves around applying constraints in the calculus of variations, specifically in the context of maximizing a function F under certain constraints involving discrete natural numbers. Participants explore the implications of using Lagrange multipliers and the nature of solutions derived from different formulations of the problem.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant questions whether to vary N when varying ni or to treat N as a constant from the start, given that F is not a function of N.
  • Another participant suggests that using Lagrange multipliers is appropriate for the problem but notes that minimizing F with natural number constraints complicates the solution process.
  • There is a query about the correctness of two proposed solutions involving Lagrange multipliers, with one participant stating that both lead to the same result for ni once λ is eliminated.
  • A participant introduces a more complex problem involving maximizing F under two constraints, asking whether two different formulations yield correct solutions.
  • Some participants assert that both formulations lead to the same values for ni, despite differing values for the Lagrange multiplier α.
  • There is a discussion about whether α should be defined solely by F and how its value can differ between formulations, with references to the derivation of Boltzmann statistics.

Areas of Agreement / Disagreement

Participants express differing views on the treatment of N and the implications for α, with some agreeing that the solutions for ni are equivalent while others question the consistency of α across formulations. The discussion remains unresolved regarding the treatment of α and its dependence on the formulations used.

Contextual Notes

Participants highlight the complexity of the problem due to the constraints involving natural numbers and the potential differences in the values of the Lagrange multiplier α, which may not be fully defined by F alone.

Philip Koeck
I have an analytical function F of the discrete variables ni, which are natural numbers. I also know that the sum of all ni is constant and equal to N.
N also appears explicitly in F, but F is not a function of N. F exists in a coordinate system given by the ni only.
Should I carry out the variation as if N varied when I vary any of the ni, and then apply the constant N as a constraint with a Lagrange multiplier, or is it correct to leave out the variation of N with the ni from the beginning?
As an example you can look at: F = N + ∑gi ln ni
The gi are just weights, which can be different for every i.
 
Lagrange multipliers are explicitly for the sort of problem you're talking about. However, if the constraint is that ##\sum_j n_j = N##, then it leads to a boring result: ##\frac{\partial F}{\partial n_i} = \lambda## (where ##\lambda## is the Lagrange multiplier, some constant).

However, if the ##n_j## are supposed to all be natural numbers, then taking partial derivatives isn't going to minimize ##F##, because the values of ##n_j## that minimize ##F## might not be natural numbers. I suppose you could use that answer as a starting place, and then search nearby for integer values that minimize ##F##?
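The round-and-search idea suggested above can be sketched for the example F = N + ∑gi ln ni. The weights, total, and the size of the search window around the continuous optimum are all hypothetical choices for illustration:

```python
import itertools
import numpy as np

g = np.array([1.0, 2.0, 3.0])  # hypothetical weights
N = 10                         # total count, a natural number

def F(n):
    n = np.asarray(n, dtype=float)
    if np.any(n <= 0):
        return -np.inf         # ln n_i undefined for n_i <= 0
    return N + np.sum(g * np.log(n))

# Continuous optimum as a starting place: n_i = N g_i / sum_j g_j
n_cont = N * g / g.sum()
base = np.floor(n_cont).astype(int)

# Search nearby integer allocations that still satisfy sum(n_i) = N
best = None
for deltas in itertools.product(range(3), repeat=len(g)):
    n = base + np.array(deltas)
    if n.sum() == N and (best is None or F(n) > F(best)):
        best = n

print(best)  # [2 3 5] for these weights
```

The brute-force window grows exponentially with the number of levels, so for large problems one would need a smarter local search, but the principle is the same.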
 
Which of the following two solutions is correct?
1 + gi/ni - λ = 0
or gi/ni - λ = 0
 
Philip Koeck said:
Which of the following two solutions is correct?
1 + gi/ni - λ = 0
or gi/ni - λ = 0

Since ##\lambda## isn't a fixed number, there is no difference which of those you use. They lead to the same answer for ##n_i##, once you eliminate ##\lambda##.
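This is easy to check numerically for the example F = N + ∑gi ln ni (the weights below are made up): each stationarity condition fixes its own value of λ through the constraint ∑ni = N, and the resulting ni coincide:

```python
import numpy as np

g = np.array([1.0, 2.0, 3.0])  # hypothetical weights
N = 10.0

# From 1 + g_i/n_i - lam = 0:  n_i = g_i / (lam - 1)
lam1 = 1.0 + g.sum() / N       # value forced by sum(n_i) = N
n1 = g / (lam1 - 1.0)

# From g_i/n_i - lam = 0:      n_i = g_i / lam
lam2 = g.sum() / N             # value forced by sum(n_i) = N
n2 = g / lam2

# The multipliers differ (by exactly 1), but the n_i agree.
print(n1, n2)  # both equal N * g / g.sum()
```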
 
I'm still mystified.
Can we look at the actual problem instead? A bit more complicated, I'm afraid.
I want to maximize F given below under the constraints ∑ ni = N and ∑ ni ui = U, with constant N and U. I'll use α and β for the multipliers. The gi and ui are given parameters.

F = N ln N - N + ∑ ( ni ln gi - ni ln ni + ni )

Are both the following solutions correct, would you say?

ln N + ln gi - ln ni - α - β ui = 0

ln gi - ln ni - α - β ui = 0​
 
Looks like you are trying to derive the Boltzmann distribution. Google is your friend.
 
Philip Koeck said:
I'm still mystified.
Can we look at the actual problem instead? A bit more complicated, I'm afraid.
I want to maximize F given below under the constraints ∑ ni = N and ∑ ni ui = U, with constant N and U. I'll use α and β for the multipliers. The gi and ui are given parameters.

F = N ln N - N + ∑ ( ni ln gi - ni ln ni + ni )

Are both the following solutions correct, would you say?

ln N + ln gi - ln ni - α - β ui = 0

ln gi - ln ni - α - β ui = 0​

Those lead to the exact same answers for ##n_i##. They lead to different values for ##\alpha##, but you don't care about the value of ##\alpha##.
 
stevendaryl said:
Those lead to the exact same answers for ##n_i##. They lead to different values for ##\alpha##, but you don't care about the value of ##\alpha##.
Isn't α given by ∂F/∂N ? How can it be different for the two solutions?
Shouldn't it be completely defined by F?
Yes I am looking at the derivation of Boltzmann statistics, but without the correction for indistinguishability.
That's why F contains two terms that depend only on N.
 
Philip Koeck said:
Isn't α given by ∂F/∂N ? How can it be different for the two solutions?
Shouldn't it be completely defined by F?

Do you agree that the two solutions lead to the same values for ##n_i##?

In both cases, the solution is: ##n_i = N g_i e^{-\beta u_i}/\sum_i (g_i e^{-\beta u_i})##

The two different values for ##\alpha## differ by ##\ln N##.

What's important is ##n_i##, not ##\alpha##.
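Both claims can be checked numerically. The degeneracies gi, energies ui, N, and β below are made-up values: each stationarity condition yields a constant α across all levels, and the two values of α differ by exactly ln N:

```python
import numpy as np

g = np.array([2.0, 1.0, 1.0])  # hypothetical degeneracies g_i
u = np.array([0.0, 1.0, 2.0])  # hypothetical level energies u_i
N, beta = 100.0, 1.0

w = g * np.exp(-beta * u)
n = N * w / w.sum()            # n_i = N g_i e^{-beta u_i} / sum_j g_j e^{-beta u_j}

# alpha from  ln N + ln g_i - ln n_i - alpha - beta u_i = 0
alpha_with_lnN = np.log(N) + np.log(g) - np.log(n) - beta * u
# alpha from  ln g_i - ln n_i - alpha - beta u_i = 0
alpha_without = np.log(g) - np.log(n) - beta * u

# Each array is constant across i, and the two constants differ by ln N.
print(alpha_with_lnN[0] - alpha_without[0])  # ln(100) ≈ 4.605
```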
 
stevendaryl said:
Do you agree that the two solutions lead to the same values for ##n_i##?

In both cases, the solution is: ##n_i = N g_i e^{-\beta u_i}/\sum_i (g_i e^{-\beta u_i})##

The two different values for ##\alpha## differ by ##\ln N##.

What's important is ##n_i##, not ##\alpha##.
I would write the solutions as follows:
ni = N gi exp(- α - β ui)
and
ni = gi exp(- α - β ui)
I see that you remove the α from the solutions by introducing the normalization sum.

I agree that if the α differ by ln N the two results are the same, but how do you argue that the α should be different?
 
