# Fermat's principle and refraction -- Possible misunderstanding

lemma28
Consider a light ray starting at A in medium 1, passing into and out of medium 2 (say, shaped as a disk) with relative index of refraction n, and arriving at point B (in medium 1).
Fermat's principle says that the path taken by the ray between points A and B is the path that can be traversed in the least time.
That could be misunderstood as saying that the path actually taken by the light ray through the disk (the one following Snell's law at the surfaces separating the two media) is the overall "least time" path from A to B.
That's not true. Actually there are paths (orange broken line) that do not follow Snell's law and yet take less time than the path that follows Snell's law (blue broken line). What Fermat's principle really says is that the time along the path taken by light (the one following Snell's law) is a local minimum with respect to nearby paths.
That does not rule out a global minimum of the travel time along a different path.
I'm posting this note because I fell victim to this misunderstanding myself.

(in the simulation n=1.8, A(-1.6, 0), B(1.3, 0.3), r=1)
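The claim can be checked numerically. Below is a minimal sketch, not the original simulation: it assumes the disk is centred at the origin, measures time in units where the speed in medium 1 is 1, and uses hypothetical helper names (`travel_time`, `descend`, `snell_residuals`). It slides from the straight-through chord into the nearby stationary path with plain gradient descent, checks that this path satisfies Snell's law at both surfaces, and compares its time with a hand-picked path that enters at 135° and leaves at 45°.

```python
import math

N = 1.8            # relative refractive index of the disk (from the post)
A = (-1.6, 0.0)    # start point, outside the disk
B = (1.3, 0.3)     # end point, outside the disk
R = 1.0            # disk radius; the disk is assumed centred at the origin


def on_circle(theta):
    """Point on the disk boundary at polar angle theta."""
    return (R * math.cos(theta), R * math.sin(theta))


def dist(p, q):
    return math.hypot(q[0] - p[0], q[1] - p[1])


def cross(u, v):
    return u[0] * v[1] - u[1] * v[0]


def travel_time(theta_in, theta_out):
    """Time for A -> P -> Q -> B in units where the outside speed is 1."""
    p, q = on_circle(theta_in), on_circle(theta_out)
    return dist(A, p) + N * dist(p, q) + dist(q, B)


def descend(theta_in, theta_out, rate=0.2, steps=20000, h=1e-6):
    """Plain gradient descent on travel_time using central differences."""
    for _ in range(steps):
        g_in = (travel_time(theta_in + h, theta_out)
                - travel_time(theta_in - h, theta_out)) / (2 * h)
        g_out = (travel_time(theta_in, theta_out + h)
                 - travel_time(theta_in, theta_out - h)) / (2 * h)
        theta_in -= rate * g_in
        theta_out -= rate * g_out
    return theta_in, theta_out


def snell_residuals(theta_in, theta_out):
    """(sin i - N sin r) at the entry and (N sin r - sin i) at the exit."""
    p, q = on_circle(theta_in), on_circle(theta_out)
    ap = (p[0] - A[0], p[1] - A[1])
    pq = (q[0] - p[0], q[1] - p[1])
    qb = (B[0] - q[0], B[1] - q[1])
    sin_i_in = abs(cross(ap, p)) / (dist(A, p) * R)
    sin_r_in = abs(cross(pq, p)) / (dist(p, q) * R)
    sin_r_out = abs(cross(pq, q)) / (dist(p, q) * R)
    sin_i_out = abs(cross(qb, q)) / (dist(q, B) * R)
    return sin_i_in - N * sin_r_in, N * sin_r_out - sin_i_out


# Start from the straight-through chord and slide into the nearby local minimum.
snell_in, snell_out = descend(math.pi, 0.0)
t_snell = travel_time(snell_in, snell_out)

# A hand-picked path that ignores Snell's law: enter at 135 deg, leave at 45 deg.
alt_in, alt_out = math.radians(135), math.radians(45)
t_alt = travel_time(alt_in, alt_out)

print("Snell path time:      ", t_snell)
print("Snell residuals:      ", snell_residuals(snell_in, snell_out))
print("Alternative path time:", t_alt)
```

If the sketch behaves as intended, the Snell residuals vanish to numerical precision while the alternative path still comes out faster, which is the local-versus-global distinction made above. The alternative entry and exit points were chosen so that both outside segments reach the disk without cutting through it first.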

• Delta2

Homework Helper
Gold Member
2022 Award
Yes, local minimum within the constraints.

• lemma28
Gold Member
I always understood that you can apply Fermat’s principle on many different paths through a lens or curved mirror where the focusing happens. Ray tracing gives you all those paths.

Motore
Actually there are paths (orange broken line) that do not follow Snell's law
For linear, homogeneous, and isotropic (LHI) media all rays follow Snell's law, or am I missing something?

• lemma28 and sophiecentaur
Gold Member
For linear, homogeneous, and isotropic (LHI) media all rays follow Snell's law, or am I missing something?
I'd agree with you. The OP refers to a simulation but that may not in fact be accurate. Ray tracing is easy in principle but the geometry can sometimes be carried out wrong. In the case of a circular cross section cylinder, the rays do not all focus at one point; they do so only for small angles (see spherical aberration etc.).
I'd never rely on a simulation without knowing its pedigree.

• lemma28
lemma28
For linear, homogeneous, and isotropic (LHI) media all rays follow Snell's law, or am I missing something?
The problem (and the misunderstanding) is that all actual rays follow Snell's law and they produce a local minimum for the path's time, considering nearby paths. But that doesn't mean that a geometrically allowed path with a smaller time (one that doesn't follow Snell's law but nonetheless goes from A to B) is the path followed by the light ray.
The exceptions might arise when the separating surface between the media is curved, introducing nonlinearity. This doesn't occur for a prism or a rectangular slab.

Gold Member
The exceptions might arise when the separating surface between the media is curved,
Would there be any confusion if we avoided trying to make Fermat's principle apply during 'failed' focussing of an image? If it's valid to use a polygonal approximation for a curved surface then Snell's Law can be applied to all paths through the surface. It's only for 'certain' shapes of surface that Snell's Law works to produce a common point of arrival for all paths (i.e. a sharp image). The "exceptions" would be the majority. Diffraction has to come in somewhere but I don't think Fermat considered that so this seems just like a ray tracing problem.

OR is there more to this question that I just haven't grasped? It wouldn't be the first time.

• lemma28
lemma28
Would there be any confusion if we avoided trying to make Fermat's principle apply during 'failed' focussing of an image? If it's valid to use a polygonal approximation for a curved surface then Snell's Law can be applied to all paths through the surface. It's only for 'certain' shapes of surface that Snell's Law works to produce a common point of arrival for all paths (i.e. a sharp image). The "exceptions" would be the majority. Diffraction has to come in somewhere but I don't think Fermat considered that so this seems just like a ray tracing problem.

OR is there more to this question that I just haven't grasped? It wouldn't be the first time.
I'm not much into optics but I guess we are saying the same thing from different perspectives (focusing and global/local minimum...)

Homework Helper
2022 Award
A path of zero "width" will carry zero flux. In order to create a finite flux some "pencil" of rays must add, so nearby paths must also contribute in a concerted way to be important.
This is similar to the very real difficulties encountered in using Feynman's path integrals for practical quantum mechanics. For example, extracting the Schrödinger equation from Feynman's formulation is not straightforward.

• lemma28 and sophiecentaur
Gold Member
I'm not much into optics but I guess we are saying the same thing from different perspectives (focusing and global/local minimum...)
Well that’s a relief. 🙂
It’s a matter of phase coherence over a sphere around the same point of incidence. When it’s reflectors we’re dealing with, we can do it with a piece of string.
A path of zero "width" will carry zero flux. In order to create a finite flux some "pencil" of rays must add, so nearby paths must also contribute in a concerted way to be important.
This is similar to the very real difficulties encountered in using Feynman's path integrals for practical quantum mechanics. For example, extracting the Schrödinger equation from Feynman's formulation is not straightforward.
Absolutely; totally agree. That would be presumably beyond Fermat's ideas, though.

• lemma28 and hutchphd
Philip Koeck
I have a related question.
First I want to consider an ideal focusing lens.
All rays coming from one point in the object plane will intersect in one point of the image plane.
I'm assuming that the object is placed such that we get a real image.
In this case all rays have the same optical path length (same number of wavelengths) between the object point where they originate and the image point where they all intersect.
In other words, a diverging spherical wave coming from the object point would be transformed into a converging spherical wave that converges onto the image point.
Is that right so far?

Now I want to consider a lens with 2 convex spherical surfaces, so it has positive spherical aberration.
I want to look at two rays coming from the same point on the optical axis, one of which actually follows the optical axis whereas the other ray has a relatively large angle to the optical axis so that it passes through the lens quite far from the center.
These two rays will intersect on the optical axis somewhere behind the lens.

Will these two rays have the same optical path length between their point of origin and the point where they intersect again?

Homework Helper
2022 Award
Not really.
Not all paths from point A to point B take the same time. But a path taken is one that is a minimum time relative to adjoining paths. So the classical ray follows a path where the total accrued phase from A to B is stationary with respect to small variations in the path. For the on-axis examples you cite it is clear (I think) that these will be the symmetric paths for each case.
This is the same formulation Feynman used to describe massive particles but the "phase" is the classical action. The true quantum particle samples all paths and the paths of stationary phase predominate.
EDIT: I will anticipate your next logical question. No, these paths are not always unique and there can be more than one local minimum. In these cases (two slits for instance) we can see the machinery at work.
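To put rough numbers on this, here is a minimal sketch under stated assumptions: instead of the biconvex lens in the question it uses a 2D disk of index 1.8 and radius 1 with an on-axis point source at distance 1.6 from the centre, and the function names (`trace`, `axial_opl`) are made up for illustration. Each off-axis ray is traced with Snell's law at both surfaces to the point where it crosses the axis, and its optical path length is compared with that of the on-axis ray up to the same point.

```python
import math

N = 1.8   # refractive index of the disk (assumed for illustration)
S = 1.6   # distance of the on-axis source from the disk centre (assumed)
R = 1.0   # disk radius


def trace(a):
    """Trace the ray from the source (-S, 0) that enters the disk at polar
    angle pi - a, applying Snell's law at both surfaces.

    Returns (x_cross, opl): where the exit ray meets the axis and the
    optical path length from the source to that point.
    """
    l1 = math.sqrt(S * S + R * R - 2 * S * R * math.cos(a))  # source -> entry
    sin_i = S * math.sin(a) / l1           # sine of the angle of incidence
    i = math.asin(sin_i)
    r = math.asin(sin_i / N)               # Snell's law at the entry point
    chord = 2 * R * math.cos(r)            # geometric path inside the disk
    b = 2 * r - a                          # polar angle of the exit point
    # By symmetry the ray leaves at angle i to the normal, heading at b - i.
    t = R * math.sin(b) / math.sin(i - b)  # exit point -> axis crossing
    x_cross = R * math.cos(b) + t * math.cos(i - b)
    return x_cross, l1 + N * chord + t


def axial_opl(x):
    """Optical path length of the on-axis ray from the source to (x, 0)."""
    return (S - R) + N * 2 * R + (x - R)


for a in (0.05, 0.2, 0.4, 0.6):
    x, opl = trace(a)
    print(f"a={a:.2f}  crossing x={x:.4f}  OPL difference={opl - axial_opl(x):.6f}")
```

In this sketch the difference shrinks towards zero for rays close to the axis and grows for wide ones, and the wide rays cross the axis closer to the disk, as expected with positive spherical aberration.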

• Philip Koeck
Philip Koeck
Not really.
All paths from point A to point B do not take the same time. But a path taken is the one that is a minimum time relative to adjoining paths. So the classical ray follows a path where the total accrued phase from A to B is stationary with respect to small variations in the path.
I'm trying to save a derivation I wrote about a decade ago and then discarded because I started thinking that I was overusing Fermat's principle.

Take into account that everything varies smoothly in this case, without any sudden changes.
All rays at very small angles essentially intersect in the ideal image plane.
Then as the angle increases the point of intersection with the central beam moves away from the ideal image plane smoothly.
So there is a smooth sequence of local minima.

• hutchphd
Homework Helper
2022 Award
Let A be a point object and B be its image in the image plane.
If the lens were replaced by a flat piece of glass there would be only one classical ray from A to B, and it would obey Fermat's principle (locally stationary).
The "perfect" lens is a very particular shape designed to have just the feature that is confusing. For each point A in the object plane and each corresponding point B in the image plane there is a continuous multitude of classical rays that define the image and are perfectly good classical rays. Each one of them is individually stationary with respect to arbitrary path variations. As you have pointed out, morphing one allowed path into another allowed path can be done continuously but not arbitrarily. This is a very restricted operation for the lens.
Does this get at your question?

Philip Koeck
Yes, maybe,
We could actually start with an ideal lens. Let's say a spherical wave comes from a point scatterer in point A. The ideal lens will produce a converging spherical wave which converges onto point B. That means all "rays" intersect in point B in phase with each other, since they emerged from point A in phase.
For simplicity we'll place point A and B on the optical axis.

If we now deform the lens so it becomes more spherical, the image point in B will stretch out into a line along the optical axis, since the focusing power of the lens now depends on the scattering angle: the larger the angle, the stronger the focusing effect.

Clearly with such a lens only rays with the same angle will intersect on the optical axis and they will also intersect the single ray that is exactly on the optical axis, but they won't intersect any other rays on the optical axis.

Could it be that even in this case the cone of rays with a given angle and the ray on the optical axis are in phase when they intersect?

Homework Helper
2022 Award
Certainly the result will be cylindrically symmetric and so any particular cone surface should have similar phase. The other interesting case may be the exactly symmetric case with object and image each at 2f.
Do you have access to J. W. Goodman's Fourier Optics? As I recall he looks at this very question early on. It is a good book.

• jim mcnamara, vanhees71 and Philip Koeck
Philip Koeck
Certainly the result will be cylindrically symmetric and so any particular cone surface should have similar phase. The other interesting case may be the exactly symmetric case with object and image each at 2f.
Do you have access to J. W. Goodman's Fourier Optics? As I recall he looks at this very question early on. It is a good book.
It's a long time since I read Goodman. Have to have another look.

I would also say that all rays on a particular cone are in phase with each other when they intersect on the optical axis behind the lens (for symmetry reasons).
My question is whether they also are in phase with the central ray (the one on the optical axis) in the point of intersection.

Here's a picture from the internet (with an object point infinitely far in front of the lens): https://www.cyberphysics.co.uk/graphics/diagrams/medical/aberration_spherical.png

Here's a picture showing the wave fronts:

It looks like the answer to my question should be "yes" if we can assume that the wave fronts are continuous close to the optical axis.

Gold Member
My question is whether they also are in phase with the central ray (the one on the optical axis) in the point of intersection.
There's a logical objection to that happening: if rays from one cylinder were in phase with the direct ray then other rays, from different cylinders would also be in phase. That would imply all rays would be in phase and there would be no spherical aberration.

Philip Koeck
There's a logical objection to that happening: if rays from one cylinder were in phase with the direct ray then other rays, from different cylinders would also be in phase. That would imply all rays would be in phase and there would be no spherical aberration.
Rays from different cylinders on the left of the lens continue in different cones on the right of the lens. They don't intersect in the same point on the optical axis because of spherical aberration.
To me that means they are clearly not in phase with each other and they don't need to be either.

I only want to know whether the rays in one cone are in phase with the direct ray at the point of intersection.
I don't see that a positive answer would imply that all cones are in phase with each other.
You have to take into account that different cones don't even intersect in the same point, so in which point would you measure their phase relation?

Philip Koeck
Rays from different cylinders on the left of the lens continue in different cones on the right of the lens. They don't intersect in the same point on the optical axis because of spherical aberration.
To me that means they are clearly not in phase with each other and they don't need to be either.

I only want to know whether the rays in one cone are in phase with the direct ray at the point of intersection.
I don't see that a positive answer would imply that all cones are in phase with each other.
You have to take into account that different cones don't even intersect in the same point, so in which point would you measure their phase relation?
Another thought: Two different cones intersect in a ring. According to my thinking (if I'm right) these two cones would be in phase on that ring.

Gold Member
I only want to know whether the rays in one cone are in phase with the direct ray at the point of intersection.
I don't see that a positive answer would imply that all cones are in phase with each other.
If the direct ray is in phase with one cone then wouldn't it be in phase with any other cone? There is only one direct ray so its phase is the same in all cases.
If a = b and a = c, then b = c.
Doesn't that apply here? Are you or am I missing something?

Gold Member
Are you or am I missing something?
I have been assuming that you are discussing a situation where there is SA and that you are 'implying' that the direct ray is in phase with light coming along any cone. Only with a paraboloid will that occur and a paraboloid eliminates SA.

Philip Koeck
I have been assuming that you are discussing a situation where there is SA and that you are 'implying' that the direct ray is in phase with light coming along any cone. Only with a paraboloid will that occur and a paraboloid eliminates SA.
Yes, I'm assuming there is spherical aberration.

The direct ray is in phase with exactly one cone in the point where this cone intersects the optical axis.
In another point somewhere further along the optical axis the direct ray is in phase with another cone.
In general, at any point in time the direct ray has two different phase angles in these two different points.

I agree there is only one direct ray, but in general there is a phase difference between two different points along the optical axis, for example if the distance between the points is λ/2 then the phase difference is π.

Philip Koeck
I have been assuming that you are discussing a situation where there is SA and that you are 'implying' that the direct ray is in phase with light coming along any cone. Only with a paraboloid will that occur and a paraboloid eliminates SA.
We might differ in what we mean by "in phase".
To my way of thinking one has to always specify where two waves are in phase.
For example two arbitrary plane waves with equal wave lengths in 3D space are exactly in phase on infinitely many straight lines, but they are exactly out of phase on other straight lines.
The only waves that are in phase with each other everywhere are identical waves, but then there's really only one wave.
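As a worked version of this statement, consider two unit-amplitude plane waves with wave vectors of equal magnitude (the symbols below are introduced here for illustration and are not from the thread):

```latex
\psi_j(\mathbf{r}) = e^{\,i\,\mathbf{k}_j\cdot\mathbf{r}}, \qquad
|\mathbf{k}_1| = |\mathbf{k}_2| = \frac{2\pi}{\lambda}, \qquad
\Delta\varphi(\mathbf{r}) = (\mathbf{k}_1 - \mathbf{k}_2)\cdot\mathbf{r}

\text{in phase: } \Delta\varphi(\mathbf{r}) = 2\pi m, \qquad
\text{out of phase: } \Delta\varphi(\mathbf{r}) = (2m+1)\,\pi, \qquad m \in \mathbb{Z}
```

For distinct wave vectors, each condition picks out a family of parallel planes perpendicular to k1 - k2, and every such plane contains infinitely many straight lines. The phase difference is constant everywhere only if k1 = k2, which is the "identical waves" case.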

Homework Helper
2022 Award
I only want to know whether the rays in one cone are in phase with the direct ray at the point of intersection.
I think I agree with your analysis. Presumably I don't need to mention the similarity of this analysis to that of phase-contrast microscopy where the phase shifts are generated by the (transparent) object and annular rings are used for phase selection.
What are you contemplating here? It sounds interesting!

Philip Koeck
Yes, the lens aberrations work very much like a phase plate, just not in such a tidy way.
They produce a phase shift for all scattered waves that is a relatively complicated function of the scattering angle (the lens aberration function).
In transmission electron microscopy of almost pure phase objects you use this property to generate contrast. The objective lens has a fixed spherical aberration and then you can select the defocus that you want to enhance the scattering angles (connected to spatial frequencies in the object) that you are interested in.
I just wanted to derive the lens aberration function in a simplified version containing only defocus and spherical aberration without having to refer to general aberration theory (for the sake of the students).
I've put my derivation on ResearchGate now. I think it makes sense.
Here it is: https://www.researchgate.net/publication/358235022_The_Lens_Aberration_Function

• hutchphd
Gold Member
In general, at any point in time the direct ray has two different phase angles in these two different points.
That goes for any wave. You do not have to use any particular point on your direct ray. All you need to do is to measure the phase difference between direct ray and the rays on the cone.
There is only one 'direct ray' whichever other set of rays you are considering at the time. It would make sense to take the direct ray as a phase reference and make the phase comparison between that ray and all incident cones of rays.

There would be some practical issues making a true phase difference measurement as the number of wavelengths involved could be large. But the principle remains valid. It's a common problem with interferometry but several methods would be available (based on counting zeros as the cone gets wider and wider from the axis, for instance). The reference would be the same reference throughout.

• Philip Koeck
Homework Helper
2022 Award
I just wanted to derive the lens aberration function in a simplified version containing only defocus and spherical aberration without having to refer to general aberration theory (for the sake of the students).
I first came upon this stuff grinding a telescope mirror when I was in high school. The use of knife edge interferometry (old razor blades and a big paper clip in my case) was as close to a miracle as anything I had ever seen. Pedagogical Gold.

• Philip Koeck
Philip Koeck
You do not have to use any particular point on your direct ray. All you need to do is to measure the phase difference between direct ray and the rays on the cone.
There is only one 'direct ray' whichever other set of rays you are considering at the time. It would make sense to take the direct ray as a phase reference and make the phase comparison between that ray and all incident cones of rays.
I still can't follow this. Where on the direct ray would you measure the phase reference?
The way I understand phase, the direct ray (or any other wave for that matter) as a whole can't be a phase reference. A wave has all possible phases in different places and at different times.
Maybe we're just talking about two different things.

Gold Member
Where on the direct ray would you measure the phase reference?
You can measure it at any point on the axis and use the same point to compare phases of all the different cones. The variation will approach zero as the cones get closer to the axis. This approach to phase measurement applies to all wave measurements of all types of wave. It's the same principle as in surveying: you don't have to take a reference height at sea level; you just choose some convenient reference location. It will still give you the right topography.

Philip Koeck
You can measure it at any point on the axis and use the same point to compare phases of all the different cones. The variation will approach zero as the cones get closer to the axis. This approach to phase measurement applies to all wave measurements of all types of wave. It's the same principle as in surveying: you don't have to take a reference height at sea level; you just choose some convenient reference location. It will still give you the right topography.
I'm trying to find out if I've missed something important.

Let's use the phase of the direct ray in the ideal image point as a reference, for example. At a specific time we choose the phase is zero, let's say.
Then a cone with a very small angle will have a phase very close to zero in a point on the axis very close to the ideal image point.
Do you agree so far?

What would the phase of a cone with large angle be in the point of intersection with the optical axis if this point is a distance λ/2 away from the ideal image point?

We're always assuming that all the rays come from a point-like object in phase, so they form a divergent spherical wave to the left of the lens.

Gold Member
Let's use the phase of the direct ray in the ideal image point as a reference, for example. At a specific time we choose the phase is zero, let's say.
The 'ideal image point' doesn't actually have any meaning for the direct ray because the direct ray passes through all points on the axis. It's just not necessary to choose any particular 'point'; you just choose an arbitrary one, say in the plane of one of the points on the axis where a chosen cone has its vertex.

There is also no need to choose a particular time because what counts is the phase differences. I can't think of a situation where the timing of a light wave can be actually measured to within one actual cycle of the wave. No one uses absolute time when describing or calculating the result of Young's Slits experiment - just relative phase.
What would the phase of a cone with large angle be in the point of intersection with the optical axis if this point is a distance λ/2 away from the ideal image point?
The phase will be + or - 180 degrees relative to the image point. If it were zero, it would coincide but it's 180 degrees early or late because of the different optical path length.

I think you may find it easier once you have decided to consider relative phases and nothing absolute. In practice, I think you would probably choose the reference plane as being where you get the 'best' image, with the least aberration of the image. (Median position of apexes of cones, perhaps.)

• hutchphd and Philip Koeck
Philip Koeck
The phase will be + or - 180 degrees relative to the image point. If it were zero, it would coincide but it's 180 degrees early or late because of the different optical path length.
Okay. I think we completely agree. It looks like my assumption about Fermat's principle in the case of a lens with spherical aberration might be correct after all.

Thanks!

Philip Koeck
An observation and a question:
Rays are always orthogonal to wave fronts (I was told).
In the case of a lens with spherical aberrations there is an extended area where rays intersect and can't be orthogonal to the same wave front because they are oriented in different directions. (In the case of an ideal lens this only happens in a single point.)
See: https://www.cyberphysics.co.uk/graphics/diagrams/medical/aberration_spherical.png

I get the impression that the concept of rays sort of breaks down in this area around the intersections of the light cones, but are there still well defined wave fronts in this area?
