Fermat’s Principle

Fermat’s Principle, stated as “Of all the paths light might take between two points, the actual path taken is the one that requires the least time”, may seem to be a simple and logical result, but let’s think about what this actually means. If I have a light source directed at a mirror, I can figure out where the light will reflect to by determining which path takes the least time. This is a big result, because there are a lot of possible paths from one point to another; the light just somehow knows which one will take the least time. How does the light “know”, you may ask? Well, Feynman argues that light effectively explores every available path: where neighboring paths take noticeably different times, their contributions cancel each other out and we see nothing from them, while where neighboring paths take nearly the same amount of time, their contributions reinforce one another. For those who are interested in further answers to this question, I would suggest reading Feynman’s short book “QED”, in which he explains this in more detail and with more clever humor than I can. For now though, we will just prepare to be dazzled by nature and move on to deriving Fermat’s principle mathematically.

Table of Contents:

  1. Derivation
  2. Fermat and Optical Imaging
  3. Generalized Fermat’s Principle


The time required to go from point “A” to point “B” in space is given by:

\begin{equation} t = \int_A^B dt \end{equation}

where the integral is computed over a path from “A” to “B”. Now, in order to relate the time to the path length “s”, recall that light in a medium of refractive index “n” travels at speed c/n, where “c” is the speed of light in vacuum, so covering a path element ds takes a time dt = n ds/c. Multiplying the differential time element by “c” then gives both sides units of length, and we have the relation:

\begin{equation} c{dt} = n{ds} \end{equation}

so that the integral of each side is:

\begin{equation} c \int_A^B dt = \int_A^B n ds \end{equation}

where we have the relationship that we might have expected: the path of least time corresponds to the shortest optical path, and vice versa. Therefore, to make our lives easier by giving us better grounds for geometrical arguments as opposed to temporal ones, we will consider the integral over the optical path length, whose minimum in turn corresponds to the least-time path. To start this, we use the eikonal equation:

\begin{equation} \nabla{S} = n\hat{s} \end{equation}

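Here $S$ is the eikonal, whose surfaces of constant value are the wavefronts, and $\hat{s}$ is the unit vector along the ray direction. Then, in order to put the eikonal equation in the proper form for Stokes’ theorem, we can use the fact that the curl of a gradient is zero, so: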

\begin{equation} \nabla \times (\nabla{S}) =\nabla \times (n\hat{s}) = 0 \end{equation}

Then, to bring the equation closer to the form needed for Stokes’ theorem, which will allow us to integrate over a curve, we can integrate this over any area (open surface) as follows:

\begin{equation} \iint_{A= dxdy\hat{s}} \nabla \times (n\hat{s})\cdot{dx dy \hat{s}} = 0 \end{equation}

Finally, we can use Stokes’ theorem to find the equivalent integral over the closed curve:

\begin{equation} \iint_A\nabla\times(n\hat{s})\cdot{da} = \oint_C n\hat{s}\cdot{dr} = 0 \end{equation}

where C is the closed curve bounding our area. Now, we have done all of this work because the above result is known as Lagrange’s integral invariant, which basically provides a means to compare the contributions of neighboring paths. Like I said at the beginning, if the neighboring paths are different, then small deviations from the shortest path will yield a much different result than if the neighboring paths are very similar. Using this principle, we can derive Fermat’s principle using the figure below for two “neighboring” paths and the following steps:

  1. Consider a ray that propagates from point $P_1$ to $P_2$ along a curve $\bar{C}$.
  2. Consider another curve $\bar{C}'$ that also passes through points $P_1$ and $P_2$ but follows a longer path.
  3. Apply Lagrange’s invariant to the closed loop formed by the two curves, traversing $\bar{C}'$ from $P_2$ to $P_1$, which yields the negative sign:
    • \begin{equation} \int_{\bar{C}} n\hat{s}\cdot{dr}-\int_{\bar{C}'}n\hat{s}\cdot{dr} = 0 \end{equation}
  4. Now here is where we can begin to get a cool result about rays. Since $\bar{C}$ is a ray, the ray direction $\hat{s}$ and the infinitesimal path element $dr$ are parallel along it. This then implies:
    • \begin{equation} \int_{\bar{C}}n ds = \int_{\bar{C}}n\hat{s}\cdot{dr} \end{equation}
  5. Along $\bar{C}'$, on the other hand, $dr$ is in general not parallel to $\hat{s}$. By the triangle inequality (approximating a small triangle between the paths $\bar{C}$ and $\bar{C}'$), the projection $\hat{s}\cdot{dr}$ is $\leq~ds$. Combined with steps 3 and 4, this gives the following, where the two sides are equal only when $\bar{C}'$ coincides with the ray $\bar{C}$ at every point:
    • \begin{equation} \int_{\bar{C}}n ds = \int_{\bar{C}'}n\hat{s}\cdot{dr} \leq \int_{\bar{C}'}n ds \end{equation}
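
As a quick numerical sanity check of the steps above, here is a minimal sketch. It assumes the simplest possible case, which is not taken from the derivation itself: a homogeneous medium with constant index `n` in which all rays point along +x, so that `s_hat` is the same everywhere. The function name, the sample index and the sine-shaped detour are all illustrative choices.

```python
import math

def path_integrals(points, n, s_hat):
    """Return (sum of n * s_hat . dr, sum of n * |dr|) along a polyline."""
    projected = 0.0
    optical_length = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        projected += n * (s_hat[0] * dx + s_hat[1] * dy)
        optical_length += n * math.hypot(dx, dy)
    return projected, optical_length

n = 1.5               # assumed constant refractive index
s_hat = (1.0, 0.0)    # assumed ray direction: every ray travels along +x
N = 200               # number of segments used to discretize each path

# The ray itself: a straight line from P1 = (0, 0) to P2 = (1, 0).
ray = [(i / N, 0.0) for i in range(N + 1)]
# A neighboring path through the same endpoints that bulges away from the ray.
detour = [(i / N, 0.2 * math.sin(math.pi * i / N)) for i in range(N + 1)]

ray_proj, ray_opl = path_integrals(ray, n, s_hat)
det_proj, det_opl = path_integrals(detour, n, s_hat)

print("ray:    n s.dr =", ray_proj, " n ds =", ray_opl)
print("detour: n s.dr =", det_proj, " n ds =", det_opl)
```

In this setup the projected integral comes out the same on both paths (that is the invariant at work), while the optical path length of the detour is strictly larger than that of the ray, which is the inequality in step 5.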

What this result means is that the path of wavefront normals (rays) is also the path of least time. Thus, rays are pretty smart! However, with all of this, we still have not found Fermat’s theorem. To do this, we will use an argument from Born and Wolf.

Now that we have obtained Fermat’s Theorem, it’s time to put it to work! Namely, we can derive the same reflection and transmission conditions by applying Fermat’s theorem as we did by applying boundary conditions in Section 3.
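
To make the transmission case concrete, here is a small sketch that finds the least-time crossing point numerically and then checks it against Snell’s law. The geometry is an assumption made only for illustration: a flat interface along y = 0, a source at (0, a) in a medium of index `n1`, and a target at (d, -b) in a medium of index `n2`. The function names and the choice of a ternary search are mine, not part of the derivation.

```python
import math

def travel_time(x, n1, n2, a, b, d, c=1.0):
    """Time from (0, a) to (d, -b) when the light crosses the interface at (x, 0)."""
    return (n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)) / c

def least_time_crossing(n1, n2, a, b, d, iters=200):
    """Ternary search for the crossing point that minimizes the travel time.

    The travel time is a convex function of x, so narrowing the interval
    from both ends converges on the single minimum between 0 and d.
    """
    lo, hi = 0.0, d
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, n1, n2, a, b, d) < travel_time(m2, n1, n2, a, b, d):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Illustrative values: air above, glass below.
n1, n2, a, b, d = 1.0, 1.5, 1.0, 1.0, 2.0
x = least_time_crossing(n1, n2, a, b, d)

# Sines of the angles of incidence and refraction, measured from the normal.
sin1 = x / math.hypot(x, a)
sin2 = (d - x) / math.hypot(d - x, b)

print("crossing point x =", x)
print("n1 sin(theta1) =", n1 * sin1)
print("n2 sin(theta2) =", n2 * sin2)
```

The two printed products should agree to within the accuracy of the search, which is just Snell’s law recovered from the least-time condition alone.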


Here is another cool result of Fermat’s principle: if all the rays from a certain source or “object” $P_1$ converge to another point $P_2$, then $P_2$ is called the “image” of $P_1$. Then, also necessarily, each of the rays from $P_1$ to $P_2$ is a path of least time, and thus all of them have equal optical path lengths.
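
One familiar setup where this can be checked by hand is an elliptical mirror in air, with the object at one focus and the image at the other. The sketch below assumes that setup, along with illustrative semi-axes `a` and `b`, and simply measures the length of several reflected paths from $P_1$ to the mirror to $P_2$.

```python
import math

a, b = 2.0, 1.0                  # semi-axes of the mirror (illustrative values)
f = math.sqrt(a * a - b * b)     # distance from the center to each focus
P1, P2 = (-f, 0.0), (f, 0.0)     # object and image placed at the two foci

def path_length(theta):
    """Length of the path P1 -> mirror point at parameter theta -> P2."""
    x, y = a * math.cos(theta), b * math.sin(theta)
    return math.hypot(x - P1[0], y - P1[1]) + math.hypot(x - P2[0], y - P2[1])

# Sample twelve reflection points spread around the ellipse.
lengths = [path_length(2 * math.pi * k / 12) for k in range(12)]
for k, length in enumerate(lengths):
    print(k, length)
```

Every path comes out with the same length, 2a, up to floating-point rounding, so no single ray is preferred over the others and all of them arrive at the image together.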
