Differentials of order 2 or bigger that are equal to 0


Discussion Overview

The discussion revolves around the mathematical concept of higher-order differentials and their behavior as they approach zero, particularly in the context of Poisson processes and Taylor series. Participants explore why terms of order greater than two can be considered negligible compared to first-order terms when dealing with infinitesimals.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • Some participants argue that as a variable approaches zero, higher powers of that variable (e.g., ##x^n## for ##n>1##) become negligible, allowing focus on first-order terms.
  • Others propose that the Taylor series provides a framework for understanding this behavior, suggesting that for well-behaved functions, higher-order terms can be ignored when the variable is sufficiently small.
  • A participant notes that while linear approximations are simpler and often sufficient, they may overlook important details captured by higher-order terms.
  • Another viewpoint suggests that the concept of infinitesimals may not be necessary for understanding Poisson processes, which can also be interpreted through counting processes and limits of Bernoulli processes.
  • One participant emphasizes the importance of recognizing that as a small parameter becomes smaller, its contribution to the overall expression increases relative to higher-order terms, which become negligible.
  • There is a discussion about the implications of using the term "infinitesimal" versus thinking in terms of arbitrarily small quantities, indicating a potential shift in perspective among participants.

Areas of Agreement / Disagreement

Participants express a range of views on the treatment of higher-order terms and the relevance of infinitesimals, with no clear consensus reached. Some agree on the utility of linear approximations, while others challenge the necessity of ignoring higher-order terms.

Contextual Notes

Limitations include assumptions about the behavior of functions and the conditions under which higher-order terms can be neglected. The discussion does not resolve the mathematical intricacies involved in these approximations.

Adgorn
So I've seen in several lectures and explanations the idea that when you have an equation containing a relation between certain expressions ##x## and ##y##, if the expression ##x## approaches 0 (and ##y## is scaled down accordingly) then any power of that expression higher than the first (##x^n## where ##n>1##) is treated as equal to 0, leaving only the relation between the 1st-order term ##x## and ##y##.

For example, in a Poisson process the chance of an arrival in a time interval ##δ## where ##δ→0## is ##λδ## (where ##λ## is the arrival frequency). The chance of no arrivals during said interval is ##1-λδ##, and the chance of 2 or more arrivals is 0, because the chance of getting ##n## arrivals in the interval ##δ## is ##(λδ)^n=λ^nδ^n##, and ##δ^n=0## for ##n>1## when ##δ→0##.

Now in the basic intuitive sense I can understand why this is the case, if a variable ##x## approaches 0 then the variable ##x^2## (or ##x^n## where n>1) becomes negligibly small, and it becomes more and more negligible as ##x## becomes smaller and smaller. The thing is we are already dealing with infinitesimals in cases like the Poisson process, so why do we decide that ##x## is not negligible and ##x^2## is when both are arbitrarily small?

I guess I'm asking for a mathematical basis for this claim, I'm sure there is one since it is so confidently used in many fields in math and physics.

Thanks in advance to all the helpers.
 
If you take a Taylor series:

##f(x) = \sum_{n= 0}^{\infty}\frac{f^{(n)}(x_0)(x-x_0)^n}{n!} = f(x_0) + (x-x_0)f'(x_0) + \frac{(x-x_0)^2 f''(x_0)}{2} + \dots##

Then, if we assume the function ##f## is well-behaved - in the sense that all its derivatives are bounded - we have:

##f(x) \approx f(x_0) + (x-x_0)f'(x_0)## (when ##|x-x_0| \ll 1##)

You can see that there will be exceptions to this for functions where the ##n^{th}## derivatives are unbounded, but for the sort of functions normally considered in physics this is not an issue.
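As a quick numerical sanity check (my own sketch, not from the thread; the function and step sizes are chosen just for illustration), one can watch the error of the linear approximation shrink quadratically:

```python
import math

# First-order (linear) Taylor approximation of f around x0,
# checked against the exact value for shrinking step sizes h.
f = math.sin        # a smooth, "well-behaved" function
fprime = math.cos   # its derivative
x0 = 1.0

for h in [1e-1, 1e-2, 1e-3]:
    exact = f(x0 + h)
    linear = f(x0) + h * fprime(x0)
    err = abs(exact - linear)
    # The error is roughly |f''(x0)| * h^2 / 2, so it drops about
    # 100x each time h drops 10x, while the linear term drops only 10x.
    print(h, err, err / h**2)
```

The last column stabilizes near ##|f''(x_0)|/2##, which is exactly the quadratic-remainder behavior described above.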
 
In general, having multiple representations of the same phenomenon can be quite helpful, so with that in mind, I wrote the below.
- - - - -

note: if you have a function that is twice differentiable, you can write it as a Taylor polynomial with a quadratic remainder (I'd suggest using the Lagrange form). Being quadratic, the remainder is ##O(\delta^2)##.

The linear approximation gives you the probability of one arrival. (In my view, the probability of zero arrivals is just an afterthought -- it is the complement of the total probability of at least one arrival.) Your question really is: why is a linear approximation the best over a small enough neighborhood for a function that is (at least) twice differentiable?
- - - -
In Poisson process language: why is the linear approximation of the probability of positive arrivals (i.e. approximating it by looking at probability of only 1 arrival) arbitrarily close to the actual total probability of positive arrivals, in some small enough time neighborhood?

- - - -
Frequently fully worked examples help people a lot. So a more granular view is:

specific to the exponential function (whose re-scaled power series gives you the Poisson PMF), you may recall that one way to prove the series for the exponential function is absolutely convergent involves invoking a geometric series after a finite number of terms (to upper bound the remaining infinite series).

In the nice case of ##0 \lt \delta \lt 1##, you have

## \delta \leq \exp\big(\delta\big) -1 = \delta + \big(\frac{\delta^2}{2!} +\frac{\delta^3}{3!} + \frac{\delta^4}{4!}+ ... \big) \leq \delta +\delta^2 + \delta^3 + \delta^4 + ... = g(\delta) = \frac{\delta}{1 - \delta} ##

now consider that for small enough ##\delta##, we have ##\frac{\delta}{1 - \delta} \approx \delta##. Play around with some numbers and confirm this for yourself. E.g. what about ##\delta = \frac{1}{100,000}##? This is a small number, but hardly "infinitesimal".

If you want to have some fun with it, consider what portion of the geometric series is represented by ##\delta##. I.e. look at

## \frac{\delta}{\big(\delta + \delta^2 + \delta^3 + \delta^4 + ... \big)} = \frac{\delta }{\big(\frac{\delta}{1-\delta}\big)} = 1 - \delta##

This is why, when you look at ##\delta = \frac{1}{100,000}##, 99.999% of the value of ##g(\delta)## is in the very first term of the series.

Taking advantage of non-negativity, we can see that since the upper bound is well approximated by ##\delta## when ##\delta## is small enough, and since ##\exp\big(\delta\big) -1## is squeezed between ##\delta## and that upper bound, it must be approximated by ##\delta## as well. Put differently, we see that the value of ##\big(\frac{\delta^2}{2!} +\frac{\delta^3}{3!} + \frac{\delta^4}{4!}+ ...\big)## is irrelevant for small enough ##\delta##.

Put one more way: you tell me what your cutoff is / what level of precision you want, and I can come up with a small enough real-valued ##\delta## such that you can ignore all those higher-order, ##O(\delta^2)##, terms. If you keep asking for ever finer levels of precision, this back and forth eventually results in a limit, but getting an extremely good approximation is the main idea here.
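The sandwich ##\delta \leq \exp(\delta) - 1 \leq \frac{\delta}{1-\delta}## is easy to check numerically. This is my own sketch; the value ##\delta = 10^{-5}## matches the 1/100,000 example above:

```python
import math

delta = 1e-5  # the 1/100,000 example: small, but hardly "infinitesimal"
lower = delta
actual = math.exp(delta) - 1
upper = delta / (1 - delta)  # the geometric-series bound g(delta)

# delta <= exp(delta) - 1 <= delta / (1 - delta)
assert lower <= actual <= upper

# Fraction of g(delta) contributed by its very first term: 1 - delta
print(lower / upper)             # 0.99999, i.e. 99.999%

# Relative size of ALL higher-order terms of exp(delta) - 1,
# compared to the linear term: roughly delta / 2
print((actual - delta) / delta)
```

The second printout shows that the entire tail ##\frac{\delta^2}{2!} + \frac{\delta^3}{3!} + \dots## is only about ##\delta/2## of the linear term, which is the precise sense in which it is negligible.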

Adgorn said:
The thing is we are already dealing with infinitesimals in cases like the Poisson process, so why do we decide that ##x## is not negligible and ##x^2## is when both are arbitrarily small?

I guess I'm asking for a mathematical basis for this claim

This isn't really true. A Poisson process can be thought of as a limiting case of a shrinking Bernoulli process. It can also be thought of as a counting process with exponentially distributed inter-arrival times. Infinitesimals aren't needed. While the limit of a Bernoulli process is a good interpretation, don't overthink it... the counting process interpretation can be quite enlightening.
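The "shrinking Bernoulli process" view can be made concrete without any infinitesimals (a sketch of my own; parameter values are arbitrary): split an interval of length ##t## into ##n## slots, each an independent Bernoulli trial with success probability ##\lambda t / n##, and watch the binomial count converge to the Poisson PMF as ##n## grows.

```python
import math

def binom_pmf(k, n, p):
    """P(k successes in n Bernoulli(p) trials)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, mu):
    """P(k arrivals) for a Poisson count with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

lam, t = 2.0, 1.0
mu = lam * t
for n in [10, 100, 10000]:
    p = mu / n  # per-slot arrival probability shrinks as slots multiply
    # largest PMF discrepancy over small counts k; shrinks as n grows
    gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, mu)) for k in range(10))
    print(n, gap)
```

No limit of "infinitely small" quantities is invoked here: every ##n## is an ordinary finite approximation, and the discrepancy just gets as small as you like.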
 
Linear approximations are so much easier to deal with than higher-order approximations that it is worth considering using one as an estimate. It will give you the value of a function and tell you how the function changes locally in each direction. That is often enough. And the theory of linear functions (simple, simultaneous, or multivariable) is reasonably deep and informative. Going one step higher to quadratic approximations opens a real can of worms.
 
StoneTemplePython said:
In general, having multiple representations of the same phenomenon, can be quite helpful, so with that in mind, I wrote the below.

Put in simple and general terms: given an expression containing both ##\delta## and higher orders of ##\delta##, as ##\delta## becomes arbitrarily small, the ##\delta## portion takes up more and more of the "weight" of the overall expression, and the contribution of the higher-order terms to the value of the expression becomes negligible, so they can be ignored.

Also I guess I should be careful with throwing the expression "infinitesimal" around and start thinking about calculus as a whole more in terms of arbitrarily small quantities instead of infinitesimal ones.

Thanks for the help!
 
Adgorn said:
So I've seen in several lectures and explanations the idea that when you have an equation containing a relation between certain expressions ##x## and ##y##, if the expression ##x## approaches 0 (and ##y## is scaled down accordingly) then any power of that expression higher than the first (##x^n## where ##n>1##) is treated as equal to 0, leaving only the relation between the 1st-order term ##x## and ##y##.

Suppose we are trying to find ##\lim_{\delta \rightarrow a} f(g(\delta))##. If ##\lim_{\delta \rightarrow a} g(\delta) = L##, then for a continuous function ##f## this limit is equal to ##f(L)##. If ##h(\delta)## is a function different from ##g## but also having ##\lim_{\delta \rightarrow a} h(\delta) = L##, then we could compute the same limit as ##\lim_{\delta \rightarrow a} f(h(\delta))##.

You described a situation where ##a = 0## and ##g(\delta)## is some function of the functions ##x(\delta)## and ##y(\delta)##. In particular, ##g## is a polynomial in the variable ##y(\delta)## (which may or may not have coefficients involving functions of ##x(\delta)##). The argument that we can replace ##g## by a simpler function ##h(\delta)## that sets the higher powers of ##y(\delta)## to zero depends on showing that ##\lim_{\delta \rightarrow 0} g(\delta) = \lim_{\delta \rightarrow 0} h(\delta)##. This is not true in general. However, the examples you have seen in books are presumably special cases where the two limits are equal.
Adgorn said:
For example, in a Poisson process the chance of an arrival in a time interval ##δ## where ##δ→0## is ##λδ## (where ##λ## is the arrival frequency). The chance of no arrivals during said interval is ##1-λδ##, and the chance of 2 or more arrivals is 0, because the chance of getting ##n## arrivals in the interval ##δ## is ##(λδ)^n=λ^nδ^n##, and ##δ^n=0## for ##n>1## when ##δ→0##.
However, ##\lambda \delta## also "=0" when ##\delta \rightarrow 0##, so it isn't clear what that line of reasoning shows - except that the probability of 1, 2, 3, ... arrivals approaches 0 as ##\delta## approaches 0. Perhaps you are quoting part of an argument that concerns the value of the intensity parameter of a process composed of two independent Poisson processes that happen simultaneously.
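One way to make the intended comparison precise is to look at ratios rather than raw probabilities, since every arrival probability vanishes as ##\delta \rightarrow 0##. This sketch (my own, with an arbitrary ##\lambda##) checks that for a Poisson count with mean ##\lambda\delta##, the ratio ##P(\geq 2)/P(1)## itself shrinks like ##\lambda\delta/2##:

```python
import math

# For a Poisson(lam * d) count, P(exactly 1) and P(2 or more) both
# vanish as d -> 0, but P(>= 2) vanishes faster: their ratio is O(d).
# That ratio going to 0 is the precise sense of "negligible".
lam = 3.0
for d in [1e-1, 1e-3, 1e-5]:
    mu = lam * d
    p1 = mu * math.exp(-mu)              # P(exactly 1 arrival)
    p2_plus = 1 - math.exp(-mu) - p1     # P(2 or more arrivals)
    print(d, p1, p2_plus, p2_plus / p1)  # ratio ~ mu / 2
```

So the slogan "##\delta^n = 0## for ##n > 1##" is shorthand for "##P(\geq 2)## is vanishingly small *relative to* ##P(1)##", not for both probabilities being literally zero.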
 
