University of Oregon
Optics is both a very old and a very contemporary field of research. Mirrors made millennia ago as well as the advanced imaging methods that decorate recent years’ lists of scientific breakthroughs (Betzig et al. 2006, Bates et al. 2007, Hell 2007, Abbott 2009) can all be understood with the same physical framework. We will explore the basic physics of optics in this chapter, intended to serve as a review of elementary principles, or as an introduction for readers new to optics. Our treatment is necessarily brief and minimal—the reader interested in further elaboration should consult a textbook devoted to optics, such as Hecht (2002) or Born and Wolf (1997).
Insightful experiments by Hans Christian Ørsted, Michael Faraday, and others in the first half of the nineteenth century revealed the principle of electromagnetic induction: a changing magnetic field gives rise to an electric field, and, conversely, a changing electric field creates a magnetic field. Later, James Clerk Maxwell synthesized these and other observations into a set of succinct mathematical expressions, known as Maxwell’s equations, which encapsulate the core of classical electromagnetism. It follows simply from these that electric ( $\overrightarrow{E}$ ) and magnetic ( $\overrightarrow{B}$ ) fields in vacuum obey the relations

$${\nabla}^{2}\overrightarrow{E}={\varepsilon}_{0}{\mu}_{0}\frac{{\partial}^{2}\overrightarrow{E}}{\partial {t}^{2}},\qquad {\nabla}^{2}\overrightarrow{B}={\varepsilon}_{0}{\mu}_{0}\frac{{\partial}^{2}\overrightarrow{B}}{\partial {t}^{2}},$$

where ε_{0} and μ_{0} are the permittivity and permeability of free space, respectively. Both of these expressions have the form of a wave equation,

$${\nabla}^{2}\psi =\frac{1}{{v}^{2}}\frac{{\partial}^{2}\psi}{\partial {t}^{2}},$$

which admits solutions of the form $\mathrm{\psi}(\overrightarrow{r},t)=f(\overrightarrow{r}-\overrightarrow{v}t)$ , i.e., waves traveling through space ( $\overrightarrow{r}$ ) and time (t) with velocity $\overrightarrow{v}$ . Maxwell therefore realized that electric and magnetic fields can propagate as traveling waves, with a speed that is a simple function of electrostatic and magnetostatic constants: c = (ε_{0}μ_{0})^{−1/2}. Inserting values for ε_{0} and μ_{0} yields c = 3.0 × 10^{8} m/s, in striking correspondence to the speed of light, which had been measured with few-percent accuracy by the mid-nineteenth century. Especially following the experiments of Heinrich Rudolf Hertz, in which electromagnetic waves were generated and detected, it became clear that light is an electromagnetic wave and that visible light is but one part of a broader electromagnetic spectrum.

As discussed above, electric and magnetic fields in space obey wave equations. To define the terms and symbols related to wave motion, we will first consider a one-dimensional wave equation, the simplest solution to which is a sinusoidal traveling wave of amplitude A, wavenumber k, and angular frequency ω: ψ(x, t) = A cos(kx − ωt − δ) = Re{A exp [j(kx − ωt − δ)]}, where $j=\sqrt{-1}$ and δ is a phase offset. (We generally will not bother explicitly writing that the real part of the complex exponential is to be considered.) The wavelength is given by λ = 2π/k, and the frequency by f = ω/2π; if we consider a particular position in space, ψ oscillates with period T = f^{−1}. The wave speed is related to the other variables by v = ωk^{−1} = λf, as the reader may wish to illustrate by drawing the wave for various values of t. The argument of the oscillatory function is often referred to as the phase: ϕ(x, t) = kx − ωt − δ. Considering a particular moment in time, the phase advances by 2π over a distance λ; over an arbitrary distance Δx, the phase shift is Δϕ = 2πΔx/λ.

For the one-dimensional traveling wave noted above, each point in space corresponds to a particular phase. In two or three dimensions, more complex structures arise. It is useful to consider points of equal phase, which we will refer to as wavefronts.
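The relations among k, ω, λ, f, T, and v introduced above, and Maxwell’s result c = (ε_{0}μ_{0})^{−1/2}, can be checked numerically. The following Python sketch is only an illustration: the numerical values of the vacuum constants are standard SI values that we supply as assumptions, and the function names and the 600 nm wavelength are our own illustrative choices.

```python
import math

# Vacuum constants in SI units (standard CODATA values, supplied here as assumptions).
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m
MU_0 = 1.25663706212e-6       # permeability of free space, H/m


def speed_of_light(eps0=EPSILON_0, mu0=MU_0):
    """Wave speed in vacuum from c = (eps0 * mu0)^(-1/2)."""
    return 1.0 / math.sqrt(eps0 * mu0)


def wave_parameters(wavelength, speed):
    """Return k, f, omega, and T for a sinusoidal wave of given wavelength and speed."""
    k = 2.0 * math.pi / wavelength   # wavenumber, rad/m
    f = speed / wavelength           # frequency, from v = lambda * f
    omega = 2.0 * math.pi * f        # angular frequency, rad/s
    period = 1.0 / f                 # period, T = 1/f
    return {"k": k, "f": f, "omega": omega, "T": period}


def phase_shift(delta_x, wavelength):
    """Phase advance over a distance delta_x at a fixed moment in time."""
    return 2.0 * math.pi * delta_x / wavelength


c = speed_of_light()
params = wave_parameters(wavelength=600e-9, speed=c)  # 600 nm is an illustrative choice
print(f"c = {c:.4e} m/s")
print(f"f = {params['f']:.3e} Hz, omega/k = {params['omega'] / params['k']:.4e} m/s")
print(f"phase shift over one wavelength = {phase_shift(600e-9, 600e-9):.4f} rad")
```

The ratio ω/k printed on the second line should reproduce the wave speed, which is the relation v = ωk^{−1} = λf in numerical form.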
Plane waves. A simple and very useful construction is the plane wave. Let us illustrate this for a two-dimensional wave (Figure 3.1), in which we can plot the value of ψ along the third dimension.
Note that ψ only varies along one spatial dimension (in this case, x). Contours of equal phase (i.e., wavefronts) are lines in the xy plane. As the wave travels, for the example shown in Figure 3.1, it moves in the $\hat{x}$ -direction, i.e., parallel to a wavevector, $\overrightarrow{k}$ , that is perpendicular to these lines of constant phase and parallel to $\hat{x}$ . We can write ψ(x, y) = A exp[j(kx − ωt − δ)], or $\psi \left(\overrightarrow{r}\right)=A\,\text{exp}\left[j(\overrightarrow{k}\cdot \overrightarrow{r}-\omega t-\delta )\right]$ , where $\overrightarrow{r}$ is a vector in the xy plane; note that the dot product selects the x-component of $\overrightarrow{r}$ .

Figure 3.1 A two-dimensional plane wave: ψ(x,y) = cos(kx−ωt), plotted at time t = 0.
For a three-dimensional plane wave, positions of constant phase (i.e., wavefronts) form a set of parallel planes. This is a good description of many sorts of light beams. Furthermore, any three-dimensional wave can be expressed as a combination of plane waves by Fourier analysis. The three-dimensional plane wave is described by $\psi \left(\overrightarrow{r}\right)=A\,\text{exp}\left[j(\overrightarrow{k}\cdot \overrightarrow{r}-\omega t-\delta )\right]$ , where $\overrightarrow{r}$ is any vector in three-dimensional space. We will show this explicitly: consider a position vector $\overrightarrow{r}=x\hat{x}+y\hat{y}+z\hat{z}$ , where ^ indicates a unit vector, and some particular vector ${\overrightarrow{r}}_{0}={x}_{0}\hat{x}+{y}_{0}\hat{y}+{z}_{0}\hat{z}$ . Their difference is

$$\overrightarrow{r}-{\overrightarrow{r}}_{0}=(x-{x}_{0})\hat{x}+(y-{y}_{0})\hat{y}+(z-{z}_{0})\hat{z}.$$

Consider the set of points { $\overrightarrow{r}$ } described by $(\overrightarrow{r}-{\overrightarrow{r}}_{0})\cdot \overrightarrow{k}=0$ . As $\overrightarrow{r}$ varies, this sweeps out a plane perpendicular to $\overrightarrow{k}$ . Expanding this: $(\overrightarrow{r}-{\overrightarrow{r}}_{0})\cdot \overrightarrow{k}={k}_{x}(x-{x}_{0})+{k}_{y}(y-{y}_{0})+{k}_{z}(z-{z}_{0})=0$ , or k_{x}x + k_{y}y + k_{z}z = a, where a = k_{x}x_{0} + k_{y}y_{0} + k_{z}z_{0} is a constant. Therefore, the equation of a plane perpendicular to $\overrightarrow{k}$ is $\overrightarrow{k}\cdot \overrightarrow{r}$ = constant = a. A function that is constant on each such plane and varies sinusoidally from one plane to the next is (at t = 0) $\psi \left(\overrightarrow{r}\right)=A\,\text{cos}(\overrightarrow{k}\cdot \overrightarrow{r})$ or $\psi \left(\overrightarrow{r}\right)=A\,\text{exp}(j\,\overrightarrow{k}\cdot \overrightarrow{r})$ . This function repeats whenever $\overrightarrow{k}\cdot \overrightarrow{r}$ changes by 2π; for a displacement of one wavelength along $\overrightarrow{k}$ this means $\left|\overrightarrow{k}\right|\lambda =2\pi $ , or $k=\left|\overrightarrow{k}\right|=2\pi /\lambda $ , as expected. The traveling plane wave is described by $\psi \left(\overrightarrow{r}\right)=A\,\text{exp}\left[j(\overrightarrow{k}\cdot \overrightarrow{r}-\omega t-\delta )\right]$ .

To reiterate, the wavefronts of a three-dimensional plane wave are planes. Typically, we will only draw wavefronts that are separated in phase by Δϕ = 2π, which are therefore spatially separated by distance λ. The wavevector $\overrightarrow{k}$ points perpendicular to these planes. Often, one describes the wave by a ray that points along $\overrightarrow{k}$ .

Spherical waves. A point-source of light emits spherical waves: the wavefronts are concentric spheres that travel away from the point. The wave function is

$$\psi (r,t)=\frac{A}{r}\,\text{exp}\left[j(kr-\omega t-\delta )\right],$$

where $r=\sqrt{{x}^{2}+{y}^{2}+{z}^{2}}$ is the distance from the source. The amplitude decreases with r, for reasons that will become clear shortly.
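Cylindrical waves. A line-source of light, for example, a slit, emits cylindrical waves: the wavefronts are concentric cylinders that travel away from the line. At distances from the line that are large compared with the wavelength, the wave function is

$$\psi (r,t)=\frac{A}{\sqrt{r}}\,\text{exp}\left[j(kr-\omega t-\delta )\right],$$

where $r=\sqrt{{x}^{2}+{y}^{2}}$ is the perpendicular distance from the line source, taken here to lie along the z-axis. Again, the amplitude decreases with r.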
Figure 3.2 The sum ψ = ψ1 + ψ2 (black) of the waves ψ1 = 1.0 cos(kx) (light gray) and ψ2 = 0.95 cos(kx-δ) (medium gray), plotted for various values of δ.
The wave equation is linear in ψ; therefore, its solutions obey the principle of superposition: If ψ_{1} and ψ_{2} each satisfy the wave equation, then ψ = ψ_{1} + ψ_{2} is also a solution. The relative phase difference between ψ_{1} and ψ_{2} is important in determining their interference:
Figure 3.2 shows an illustration of the superposition of two sine waves. I have plotted ψ_{1} = 1.0 cos(kx), ψ_{2} = 0.95 cos(kx − δ), and ψ = ψ_{1} + ψ_{2} for various values of δ. (I have chosen slightly different amplitudes for these two waves to make the illustrations clearer.)
Note that a phase difference δ = 0 leads to constructive interference, and a phase difference δ = π leads to destructive interference.
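The dependence of the summed amplitude on δ can be made quantitative by adding the two waves as complex exponentials (phasors). The sketch below uses the amplitudes 1.0 and 0.95 from Figure 3.2; the function names are our own, and the brute-force sampling routine is included only as an independent check of the phasor result.

```python
import cmath
import math


def summed_amplitude(a1, a2, delta):
    """Amplitude of a1*cos(kx) + a2*cos(kx - delta), found by adding phasors."""
    return abs(a1 + a2 * cmath.exp(-1j * delta))


def sampled_peak(a1, a2, delta, samples=10000):
    """Brute-force check: largest value of the summed wave over one period."""
    phases = (2.0 * math.pi * i / samples for i in range(samples))
    return max(a1 * math.cos(p) + a2 * math.cos(p - delta) for p in phases)


for delta in (0.0, math.pi / 2, math.pi):
    amplitude = summed_amplitude(1.0, 0.95, delta)
    print(f"delta = {delta:.3f} rad: amplitude of the sum = {amplitude:.3f}")
```

For δ = 0 the amplitudes add (1.0 + 0.95), and for δ = π they subtract (1.0 − 0.95), which is why the sum in Figure 3.2 nearly vanishes in the destructive case.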
We noted above that electric and magnetic fields can propagate in free space as waves, with speed c = 3.0 × 10^{8} m/s. In a transparent material of index of refraction n (related to the polarizability of the material), fields also propagate as waves, but more slowly, with speed v = c/n. For air at 20°C and atmospheric pressure, n = 1.0003. For water at 20°C, n = 1.33. For typical glass, n = 1.46. The frequency of the wave is unchanged from its value in vacuum: the rate of oscillation of the atoms excited by the electric field is constant. The wavelength of the light is different from its value in vacuum and obeys the general relation encountered earlier: v = λf. Therefore, waves in matter are shorter than in free space: λ = v/f = c/(nf). The wavelength in matter is λ = λ_{0}/n, where λ_{0} is the free space wavelength, and so the phase shift corresponding to a change in position Δx along the wave is Δϕ = 2π(Δx/λ) = 2πn(Δx/λ_{0}).

Generally, when one states a wavelength for light, it is the free space wavelength, λ_{0}, that is being referred to: we say that orange light has a wavelength of ≃600 nm, even though when it enters your eye (n ≈ 1.33), its wavelength shortens to ≈450 nm.
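As a small worked example of these relations, the following Python sketch computes the wavelength, speed, and phase shift of light in a medium. It uses the value n = 1.33 quoted above for water and a free space wavelength of 600 nm; the function names are our own illustrative choices.

```python
import math

C = 3.0e8  # speed of light in vacuum, m/s (rounded value used in the text)


def wavelength_in_medium(lambda0, n):
    """Wavelength inside a medium of index n, lambda = lambda0 / n."""
    return lambda0 / n


def speed_in_medium(n, c=C):
    """Wave speed inside a medium of index n, v = c / n."""
    return c / n


def phase_shift_in_medium(delta_x, lambda0, n):
    """Phase shift over a distance delta_x, 2*pi*n*delta_x/lambda0."""
    return 2.0 * math.pi * n * delta_x / lambda0


lambda0 = 600e-9   # free space wavelength, m
n_water = 1.33     # index of refraction of water at 20 C, from the text
lam = wavelength_in_medium(lambda0, n_water)
print(f"wavelength in water = {lam * 1e9:.1f} nm")
print(f"frequency in vacuum = {C / lambda0:.4e} Hz")
print(f"frequency in water  = {speed_in_medium(n_water) / lam:.4e} Hz")
```

The two printed frequencies agree, reflecting the statement that the frequency is unchanged while the speed and wavelength are both reduced by the factor n.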
Another consequence of electrodynamics is that the electric and magnetic field vectors at any point are perpendicular to one another and to the wave’s propagation direction (Figure 3.3). The magnitudes of the field amplitudes are related by $\left|\overrightarrow{E}\right|=v\left|\overrightarrow{B}\right|$ , where v is the speed. The direction of $\overrightarrow{E}$ specifies the polarization of the wave. If this direction is constant, as in Figure 3.3, we say that the wave is linearly polarized (or plane polarized). In Figure 3.3, for example, note that $\overrightarrow{E}$ is always parallel to the x-axis (in other words, $\overrightarrow{E}=E(z,t)\hat{x}$ ). Waves do not have to be plane polarized and can do a variety of interesting things. If the direction of $\overrightarrow{E}$ rotates as the wave propagates, then we have circular or elliptical polarization. (We will not go into the difference between the two; the reader can explore this as well as other constructions, such as radial polarization.)

Figure 3.3 Electric (dark gray) and magnetic (light gray) fields of a plane-polarized electromagnetic wave. The black arrow indicates the propagation direction, perpendicular to $\overrightarrow{E}$ and $\overrightarrow{B}$ .

Figure 3.4 The coherence length, L_{c}, describes the spatial extent over which wavefronts (planes that differ by a phase shift of 2π) are separated by integer multiples of the wavelength. Over distances larger than ≈ L_{c}, the coherence of the wave with itself (the ability to translate by an integer number of wavelengths and “match up”) is lost.
Electromagnetic waves carry energy and momentum. The power per unit area crossing a surface is $\overrightarrow{S}={c}^{2}{\varepsilon}_{0}\overrightarrow{E}\times \overrightarrow{B}$ , known as the Poynting vector. Note that it points along the propagation direction (i.e., parallel to $\overrightarrow{k}$ ), not surprisingly. Because $\left|\overrightarrow{E}\right|=v\left|\overrightarrow{B}\right|$ , the magnitude of $\overrightarrow{S}$ is proportional to ${\left|\overrightarrow{E}\right|}^{2}$ .

The intensity (or irradiance), I, of the wave is the average energy carried per unit area per unit time, i.e., the power per unit area. It is the intensity, not the electric field directly, that we “see” as brightness. “Average” means that we consider the average power over a period. (Note that the intensity is a number, not a vector.) Since the magnitude of $\overrightarrow{S}$ is proportional to ${\left|\overrightarrow{E}\right|}^{2}$ , the intensity of an electromagnetic wave is proportional to ${\left|\overrightarrow{E}\right|}^{2}$ as well. This is in general true for vibrations and waves: the energy of a wave is proportional to the square of its amplitude, and therefore

$$I\propto {\left|\overrightarrow{E}\right|}^{2}.$$

The principle of conservation of energy and the dependence of $\overrightarrow{S}$ and I on ${\left|\overrightarrow{E}\right|}^{2}$ explain the decaying amplitude of the spherical and cylindrical waves discussed in Section 3.2.2. For a spherical wave, integrating the power carried by the wave over a shell of radius r surrounding the source must give a result that is independent of r: all the power must cross the shell, regardless of the size of the shell. The shell area scales as r^{2} and so the magnitude of $\overrightarrow{S}$ must scale as r^{−2} for the product to be independent of r, from which we conclude that the amplitude scales as $\sqrt{{r}^{-2}}={r}^{-1}$ for a three-dimensional spherical wave. By the same argument, the area of a cylinder surrounding a line source scales as r, so the amplitude of a cylindrical wave scales as r^{−1/2}.

We often can ignore the constant of proportionality, being concerned with relative intensities. However, for completeness: $I=(1/2){\varepsilon}_{0}c{E}_{0}^{2}$ in vacuum, where E_{0} is the electric field amplitude. In matter, $I=(1/2)\varepsilon v{E}_{0}^{2}$ , where v is the speed of the wave and ε is the permittivity of the medium. The ability of light to carry energy and momentum has become especially important in recent years with the development of optical trapping techniques, in which light itself is used to grab, pull, push, and twist microscopic objects.

Though it travels as a wave, light carries energy in discrete, quantized packets. This realization, primarily by Max Planck and Albert Einstein in the early twentieth century, marked the birth of quantum mechanics. The energy of a photon, the quantized “unit” of light, is proportional to its frequency and hence inversely proportional to its wavelength. More precisely, the photon energy E = hf = hc/λ, where h = 6.626 × 10^{−34} m^{2} kg/s is Planck’s constant. Photons of shorter wavelength have more energy. This explains, for example, why the emission of light from fluorescent molecules necessarily occurs at longer wavelengths than does absorption: a photon is absorbed, and some energy is converted into nonradiative (e.g., vibrational) modes, leaving a smaller quantum of energy for emission.
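To give a sense of scale for E = hc/λ, the following Python sketch evaluates the photon energy at a few wavelengths. The values of h and c are those quoted in the text; the conversion factor to electron volts, the function names, and the absorption and emission wavelengths (490 nm and 520 nm) are our own illustrative assumptions, not properties of any particular fluorophore.

```python
H = 6.626e-34          # Planck's constant, J s (value quoted in the text)
C = 3.0e8              # speed of light in vacuum, m/s (rounded value used in the text)
EV = 1.602176634e-19   # joules per electron volt (assumed conversion factor)


def photon_energy(wavelength):
    """Photon energy in joules for a free space wavelength in meters, E = h*c/lambda."""
    return H * C / wavelength


def energy_lost(absorbed_wavelength, emitted_wavelength):
    """Energy left in nonradiative modes when emission is at a longer wavelength."""
    return photon_energy(absorbed_wavelength) - photon_energy(emitted_wavelength)


for wavelength in (400e-9, 600e-9, 800e-9):
    energy = photon_energy(wavelength)
    print(f"lambda = {wavelength * 1e9:.0f} nm: E = {energy:.3e} J = {energy / EV:.2f} eV")

# Hypothetical fluorophore: absorbs at 490 nm, emits at 520 nm.
print(f"energy converted to nonradiative modes = {energy_lost(490e-9, 520e-9) / EV:.3f} eV")
```

Visible photons carry energies of a few electron volts, comparable to the energies of electronic transitions in molecules.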
The range of wavelengths of electromagnetic waves that are relevant to science and technology is enormous, ranging from very high-energy gamma rays spouting from astrophysical sources (λ ≈ 10^{−13} m) to x-rays used to probe molecular structure (λ ≈ 10^{−10} m) to microwaves (λ ≈ 10^{−2} m) to radio waves (λ> 1 m). “Visible light” spans the tiny range of wavelengths from about 400 (violet/blue) to 800 nm (red), yet its correspondence to the energetics of electronic transitions in molecules and, relatedly, our ability to see it, makes it an immensely useful part of the electromagnetic spectrum.
We have been considering waves as ideal sinusoidal forms that oscillate at a unique frequency and extend infinitely through space. For such a perfectly coherent wave, the wavefronts are always separated by distance λ, and knowing the phase at one point specifies it at all points. For real waves, this is not exactly the case. The wavefronts of a real, imperfect wave are separated by λ if we consider some finite span of approximate size L_{c}, but if we look at larger lengths, the phase relations appear “randomized” (see Figure 3.4).
This lack of perfect coherence arises from the emission of any real source not being perfectly monochromatic, but rather consisting of a range of output frequencies, Δf. (This is due to factors such as the finite linewidth of electronic transitions and the thermal velocities of atoms and molecules.) Roughly, L_{c} ≈ c/(nΔf). Furthermore, the light from extended sources such as an incandescent light bulb or the sun is emitted by many independent sources throughout the object, and each emitted wave has a random phase relative to any other. Such sources are referred to as incoherent light sources. The length L_{c} referred to above is called the coherence length; it is about 10 μm (around 20 λ) for a light bulb. (There is a more precise way to define the coherence length that we will not go into here.)
A laser is a coherent light source—all the waves emitted by the device have the same phase. Moreover, L _{ c } is typically around 1 m (>10^{6} λ) and can even be kilometers in length—a good approximation to our ideal infinite wave. This remarkable property of lasers contributes to their tremendous utility, as will be evident later in this book.
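The rough relation L_{c} ≈ c/(nΔf) can be inverted to estimate the frequency spread implied by a given coherence length. The sketch below does this for the two coherence lengths quoted in the text (10 μm and 1 m); the function names and the reference wavelength of 500 nm are our own assumptions, chosen so that 10 μm corresponds to about 20 λ.

```python
C = 3.0e8  # speed of light in vacuum, m/s (rounded value used in the text)


def coherence_length(delta_f, n=1.0):
    """Rough coherence length L_c = c / (n * delta_f)."""
    return C / (n * delta_f)


def bandwidth_for_coherence_length(l_c, n=1.0):
    """Frequency spread delta_f implied by a coherence length l_c."""
    return C / (n * l_c)


WAVELENGTH = 500e-9  # assumed reference wavelength, m

for label, l_c in (("light bulb", 10e-6), ("laser", 1.0)):
    delta_f = bandwidth_for_coherence_length(l_c)
    print(f"{label}: L_c = {l_c:g} m, delta_f = {delta_f:.1e} Hz, "
          f"L_c / lambda = {l_c / WAVELENGTH:.1e}")
```

The five orders of magnitude between the two coherence lengths correspond directly to five orders of magnitude in Δf.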
For centuries, debate raged over whether light is a wave or a particle—an interesting history that we will not go into. Whether or not it is important to consider the wave-nature of light in describing its propagation, rather than simply imagining rays of light that travel in simple geometric paths, depends on the spatial scale of the phenomena being considered. For features that are not large compared to the wavelength (λ), for example, visible light passing through micron-sized slits or kilometer-sized radio waves detected by an array of dishes, the wave nature of light is inescapable. Light’s interference with itself determines its intensity profile, and diffraction—this interference being induced by barriers or obstacles—is paramount. This regime in which the wave nature of light is important is called physical optics. The regime in which the system size is much greater than the wavelength of light, and hence wave properties are relatively unimportant, is called geometric optics or ray optics.
Diffraction is a general property of waves, and the phenomena we will explore in this section also apply to water waves, sound waves, etc.
Consider a plane wave incident on a barrier with two slits, separated by a distance D (Figure 3.5). (Imagine the slits themselves to have negligible width—we will return to this later.) Each slit acts as a point-source for waves, which continue propagating to the right in the figure. Far to the right is a screen. We want to know the intensity, I, of the light hitting the screen as a function of θ, the angle relative to a line perpendicular to the barrier (see Figure 3.5).
The electric field of the incident wave is

$$\overrightarrow{E}={\overrightarrow{E}}_{0}\,\text{exp}\left[j(kx-\omega t)\right],$$

with k = 2π/λ, as usual (see Section 3.2). We could add any phase offset to this; it does not matter, as we will see shortly. We are concerned with the light hitting a far-off screen, at angle θ. If the screen were close by, a ray would have to leave slit #1 at some angle θ_{1} and slit #2 at some angle θ_{2}, where θ_{1} and θ_{2} may be different, to both reach the screen at angle θ. However, as the screen moves farther and farther away, both θ_{1} and θ_{2} approach θ (try drawing this if it’s not evident). So, to consider I(θ), we need to consider rays leaving each slit at angle θ. Let us define our coordinates so that the barrier is at x = 0.
The two rays that travel at angle θ are indicated in Figure 3.6; their fields are

$${\overrightarrow{E}}_{1}={\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t)\right],\qquad {\overrightarrow{E}}_{2}={\overrightarrow{E}}_{0}\,\text{exp}\left[j(k(s+\delta )-\omega t)\right],$$

where we have defined s as the coordinate in the “tilted” θ-direction, and we have indicated the extra distance that ray 2 has to travel by δ. Note that ${\overrightarrow{E}}_{1}(s=0)$ and ${\overrightarrow{E}}_{2}(s=-\delta )$ have the same phase, as they should, since they come from the same incident wave.

Graphically, we can see that if δ is an integer multiple of λ, the two waves will add constructively (Figure 3.7). If δ is a half-integer multiple of λ, the two waves will add destructively and give zero light intensity.

Figure 3.5 Two-slit interference: A plane wave is incident from the left on two slits of negligible width separated by distance D. Each slit acts like a point source for waves continuing to the right; the two resulting waves interfere with one another. This interference is manifested in the pattern of light intensity observed on a distant screen, and is a function of the wavelength, D, and the angle θ.

Figure 3.6 The geometry of light propagation for two-slit interference. (a) For the angle θ illustrated, light traveling from the lower slit (slit #2) travels a greater distance than light from slit #1. The extra path length is denoted δ and is the reason for a phase difference between the two waves. (b) A “zoomed in” view of the geometry relating D, δ, and θ.
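Let us examine this mathematically. The superposition of the two electric fields is

$$\overrightarrow{E}={\overrightarrow{E}}_{1}+{\overrightarrow{E}}_{2}={\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t)\right]\left[1+\text{exp}(jk\delta )\right].$$

From geometry, δ = D sin θ (Figure 3.6b), so

$$\overrightarrow{E}={\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t)\right]\left[1+\text{exp}\left(j\frac{2\pi D\,\text{sin}\,\theta}{\lambda}\right)\right].$$

The intensity (see Section 3.2.4.3) is given by $I\propto {\left|\overrightarrow{E}\right|}^{2}=\overrightarrow{E}\cdot {\overrightarrow{E}}^{*}$ . Therefore,

$$I\propto {\left|{\overrightarrow{E}}_{0}\right|}^{2}\left[1+\text{exp}(jk\delta )\right]\left[1+\text{exp}(-jk\delta )\right]=2{\left|{\overrightarrow{E}}_{0}\right|}^{2}\left[1+\text{cos}\left(\frac{2\pi D\,\text{sin}\,\theta}{\lambda}\right)\right],$$

making use of the Euler relation cos(x) = (1/2)(exp(jx) + exp(−jx)). Via the identity 1 + cos(2x) = 2cos^{2}(x), the intensity becomes

$$I\propto 4{\left|{\overrightarrow{E}}_{0}\right|}^{2}{\text{cos}}^{2}\left(\frac{\pi D\,\text{sin}\,\theta}{\lambda}\right).$$

Note that without interference (just considering the incident plane wave, for example), $I\propto {\left|{\overrightarrow{E}}_{0}\right|}^{2}$ , with the same constant of proportionality (c’s etc.); we will define this intensity as I_{0}. Therefore,

$$I(\theta )=4{I}_{0}\,{\text{cos}}^{2}\left(\frac{\pi D\,\text{sin}\,\theta}{\lambda}\right).$$

As we saw graphically, if D sin θ = mλ, where m is an integer, the cos^{2} factor is maximal, and we have constructive interference. If D sin θ = (m/2)λ, where m is an odd integer (i.e., m/2 = 1/2, 3/2, 5/2,...), the cos^{2} factor is zero, and we have destructive interference. The intensity pattern we see on the screen is therefore not uniform but rather has a sequence of maxima and minima. This is plotted in Figure 3.8.

Figure 3.7 (a) If the extra path length, δ, between the two paths is an integer multiple of the wavelength, the two waves will constructively interfere, leading to high intensity at the screen. (b) If the extra path length δ between the two paths is a half-integer multiple of the wavelength, the two waves will destructively interfere, leading to zero intensity at the screen; note that when wave #1 is “up,” wave #2 is “down” and vice versa.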
Note that the maximal value of the intensity is four times that of a single wave. If interference “did not exist,” we would have light from the two slits combining to simply give twice the single-wave intensity. With interference, we have bright peaks with four times the intensity and dark minima with zero intensity.
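The two-slit result can be checked numerically by comparing the closed form I = 4I_{0} cos^{2}(πD sin θ/λ) with a direct sum of the two complex fields. In the sketch below, the slit separation D = 5 μm and wavelength λ = 0.6 μm are illustrative values of our own choosing, as are the function names.

```python
import cmath
import math


def two_slit_intensity(theta, D, wavelength, I0=1.0):
    """Closed-form two-slit intensity, 4*I0*cos^2(pi*D*sin(theta)/lambda)."""
    return 4.0 * I0 * math.cos(math.pi * D * math.sin(theta) / wavelength) ** 2


def two_slit_from_fields(theta, D, wavelength, I0=1.0):
    """Same quantity from |E1 + E2|^2, with E2 delayed by the extra path D*sin(theta)."""
    k = 2.0 * math.pi / wavelength
    extra_path = D * math.sin(theta)
    field = 1.0 + cmath.exp(1j * k * extra_path)  # common factor E0*exp[j(ks - wt)] dropped
    return I0 * abs(field) ** 2


D = 5e-6            # slit separation, m (illustrative)
WAVELENGTH = 0.6e-6  # wavelength, m (illustrative)

theta_bright = math.asin(WAVELENGTH / D)        # D sin(theta) = lambda
theta_dark = math.asin(0.5 * WAVELENGTH / D)    # D sin(theta) = lambda / 2
print(f"bright fringe: I/I0 = {two_slit_intensity(theta_bright, D, WAVELENGTH):.3f}")
print(f"dark fringe:   I/I0 = {two_slit_intensity(theta_dark, D, WAVELENGTH):.3e}")
```

Dropping the common factor exp[j(ks − ωt)] is harmless because it has unit magnitude and so does not affect the intensity.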
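Figure 3.8 The two-slit intensity function: $I=4{I}_{0}\,{\text{cos}}^{2}(\pi D\,\text{sin}\,\theta /\lambda )$ .

Now consider N slits, each separated by distance D (drawn in Figure 3.9a for N = 5).

Building on our N = 2 analysis above, we can write the total electric field as

$$\overrightarrow{E}={\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t)\right]+{\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t+\alpha )\right]+{\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t+2\alpha )\right]+\cdots +{\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t+(N-1)\alpha )\right],$$

where, for convenience, we have defined α ≡ (2π/λ) D sin θ. Note that this is

$$\overrightarrow{E}={\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t)\right]\left\{1+{e}^{j\alpha}+{e}^{2j\alpha}+\cdots +{e}^{(N-1)j\alpha}\right\}.$$

The terms in the braces form a finite geometric series, since each term is equal to the preceding one times e^{jα}. Therefore,

$$\overrightarrow{E}={\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t)\right]\left(\frac{1-{e}^{jN\alpha}}{1-{e}^{j\alpha}}\right).$$

We can simplify the expression in the parentheses by factoring out exponentials from the numerator and denominator:

$$\frac{1-{e}^{jN\alpha}}{1-{e}^{j\alpha}}=\frac{{e}^{jN\alpha /2}\left({e}^{-jN\alpha /2}-{e}^{jN\alpha /2}\right)}{{e}^{j\alpha /2}\left({e}^{-j\alpha /2}-{e}^{j\alpha /2}\right)}=\text{exp}\left(\frac{j(N-1)\alpha}{2}\right)\frac{\text{sin}(N\alpha /2)}{\text{sin}(\alpha /2)},$$

using the Euler relation, sin(x) = (1/2j)(exp(jx) − exp(−jx)).

Therefore, $\overrightarrow{E}={\overrightarrow{E}}_{0}\,\text{exp}\left[j(ks-\omega t)\right]\,\text{exp}\left(\frac{j(N-1)\alpha}{2}\right)\frac{\text{sin}(N\alpha /2)}{\text{sin}(\alpha /2)}$ .

The intensity $I\propto {\left|\overrightarrow{E}\right|}^{2}$ :

$$I={I}_{0}\frac{{\text{sin}}^{2}(N\alpha /2)}{{\text{sin}}^{2}(\alpha /2)}.$$

Explicitly writing the α’s:

$$I(\theta )={I}_{0}\frac{{\text{sin}}^{2}(N\pi D\,\text{sin}\,\theta /\lambda )}{{\text{sin}}^{2}(\pi D\,\text{sin}\,\theta /\lambda )}.$$

This is plotted in Figure 3.9b for N = 5.

Figure 3.9 N-slit interference. (a) Geometry. Each slit is of negligible width and is separated from its neighbor by distance D. In the example drawn, N = 5. (b) The N-slit intensity function, $I={I}_{0}\,{\text{sin}}^{2}(N\pi D\,\text{sin}\,\theta /\lambda )/{\text{sin}}^{2}(\pi D\,\text{sin}\,\theta /\lambda )$ , plotted for N = 5. Note that there are infinitely many large maxima located at angles for which sin(θ) = mλ/D, where m is any integer. Between each pair of these peaks are N − 2 smaller local maxima and N − 1 zeros. In this example, the zeros are located at sin(θ) = λ/5D, 2λ/5D, 3λ/5D, 4λ/5D.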
Maxima and minima. We see that the numerator of I(θ) is zero when NπD sin θ/λ = mπ, i.e., D sin θ/λ = m/N, where m is an integer; note, however, that both numerator and denominator are zero if m is an integer multiple of N. We see that the denominator is zero when πD sin θ/λ = m′π, i.e., D sin θ/λ = m′, where m′ is an integer; in this case, however, the numerator must also be zero, since NπD sin θ/λ = Nm′π and N is an integer. If both the numerator and denominator are zero, I → I_{0}N^{2}. A more detailed summary of the locations of maxima and minima is a useful exercise for the reader.
As illustrated in the plot of I(θ) for N = 5 slits (Figure 3.9), there are large maxima whose spacing in angle is Δ(sin θ) = λ/D. The form of I(θ) reveals that this angular spacing between the peaks is independent of the number of slits. The angular width of the large peaks is approximately Δ(sin θ) ≈ λ/ND, the distance in angle from the peak to the first zero, so the peaks get sharper as we increase the number of slits. This is a very useful feature, as we will see shortly.
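The locations of the maxima and zeros can be explored numerically. The following Python sketch evaluates the N-slit intensity function, handling the points where numerator and denominator both vanish by returning the limiting value I_{0}N^{2}, and compares it with a direct sum of the N complex fields. The values D = 5 μm and λ = 0.6 μm, the function names, and the threshold used to detect a vanishing denominator are our own illustrative choices.

```python
import cmath
import math


def n_slit_intensity(sin_theta, N, D, wavelength, I0=1.0):
    """I0 * sin^2(N*alpha/2) / sin^2(alpha/2), with alpha/2 = pi*D*sin(theta)/lambda."""
    half_alpha = math.pi * D * sin_theta / wavelength
    denominator = math.sin(half_alpha)
    if abs(denominator) < 1e-12:   # numerator vanishes too; use the limit I0 * N^2
        return I0 * N ** 2
    return I0 * (math.sin(N * half_alpha) / denominator) ** 2


def n_slit_direct_sum(sin_theta, N, D, wavelength, I0=1.0):
    """Same intensity from |1 + e^{j alpha} + ... + e^{j (N-1) alpha}|^2."""
    alpha = 2.0 * math.pi * D * sin_theta / wavelength
    total = sum(cmath.exp(1j * m * alpha) for m in range(N))
    return I0 * abs(total) ** 2


N, D, WAVELENGTH = 5, 5e-6, 0.6e-6   # illustrative values

print(f"principal maximum: I/I0 = {n_slit_intensity(WAVELENGTH / D, N, D, WAVELENGTH):.1f}")
for m in range(1, N):
    s = m * WAVELENGTH / (N * D)     # predicted zeros, sin(theta) = m*lambda/(N*D)
    print(f"sin(theta) = {m}*lambda/{N}D: I/I0 = {n_slit_intensity(s, N, D, WAVELENGTH):.2e}")
```

Setting N = 2 in the same function recovers the two-slit result, since sin(2x)/sin(x) = 2 cos(x).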
Suppose we have a telescope that collects light from a star, and we want to measure the star’s spectrum—i.e., the intensity as a function of wavelength, I(λ). How can we do this? Our detector (like most good detectors, at least over some range of wavelengths) simply measures intensity, regardless of the wavelength of the light hitting it.
We can pass the light through an N-slit grating, or, equivalently, reflect it off a surface with N mirrors (a diffraction grating). How does this help? Light of wavelength λ_{1} is deflected to angle λ_{1}/D. By this, we mean that the maximal intensity peak for light of this “color” is at the angle given by sin θ_{1} = λ_{1}/D, and integer multiples, as in Section 3.3.2; typically, the angles involved are small, so sin θ ≈ θ. Light of wavelength λ_{2} is deflected to angle λ_{2}/D, etc. Moving our detector to various positions on the screen and measuring the intensity as a function of angle on the screen reveals the intensity as a function of wavelength! (In other words, I(λ_{1}) = I(θ_{1}), I(λ_{2}) = I(θ_{2}), etc.)
The sharper the diffraction peaks (high N), the finer the resolution in λ—see the end of the preceding section. The discovery (within the past ≈ 10 years) of planets outside our solar system— one of the most remarkable discoveries of recent history—used the approach outlined above to measure tiny shifts in stellar spectra due to the influence of the orbiting planets. The typical N of the diffraction gratings was around 100,000!
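A rough numerical sketch of a grating spectrometer follows. The first-order peak position comes from sin θ = λ/D. The estimate of the smallest resolvable wavelength difference, Δλ ≈ λ/N, is our own inference from combining that relation with the peak width Δ(sin θ) ≈ λ/ND quoted earlier, and should be read as an order-of-magnitude estimate. The slit spacing D = 2 μm and the function names are illustrative assumptions.

```python
import math


def first_order_angle(wavelength, D):
    """Angle (radians) of the first-order peak, from sin(theta) = lambda / D."""
    s = wavelength / D
    if s > 1.0:
        raise ValueError("no first-order peak: wavelength exceeds slit spacing")
    return math.asin(s)


def smallest_resolvable_shift(wavelength, N):
    """Estimate of the resolvable wavelength difference in first order, lambda / N."""
    return wavelength / N


D = 2e-6  # slit spacing, m (illustrative)

for wavelength in (500e-9, 600e-9):
    theta = first_order_angle(wavelength, D)
    print(f"lambda = {wavelength * 1e9:.0f} nm -> theta = {math.degrees(theta):.2f} degrees")

shift = smallest_resolvable_shift(600e-9, 100_000)   # N from the text
print(f"resolvable shift at 600 nm with N = 100,000: {shift * 1e9:.4f} nm")
```

Longer wavelengths are deflected to larger angles, and a grating with more slits separates more closely spaced wavelengths.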
In our initial discussion of two-slit interference, we neglected the finite width of the slits. This finite width is important: just as waves from each slit interfere with one another, waves traversing various paths through a single slit will interfere with one another, and lead to diffraction. Fortunately, it is easy to analyze single-slit interference; it is simply the limit of the N-slit case discussed in Section 3.3.2 as N → ∞, D → 0, and the product ND → a, where a is the width of the slit. The reader can verify that

$$I(\theta )=I(0){\left(\frac{\text{sin}\,\beta}{\beta}\right)}^{2},$$

where β = πa sin θ/λ, as plotted in Figure 3.10.
Figure 3.10 The intensity function of a single slit of width a. Note that the angular width of the peak is approximately λ/a.
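The limiting procedure described above can be verified numerically: for large N, the N-slit intensity with D = a/N, normalized to its value at θ = 0, approaches (sin β/β)^{2}. The sketch below assumes a slit width a = 3 μm and wavelength λ = 0.6 μm, both illustrative, and the function names are our own.

```python
import math


def single_slit_intensity(sin_theta, a, wavelength):
    """Normalized single-slit intensity, (sin(beta)/beta)^2 with beta = pi*a*sin(theta)/lambda."""
    beta = math.pi * a * sin_theta / wavelength
    if beta == 0.0:
        return 1.0
    return (math.sin(beta) / beta) ** 2


def n_slit_normalized(sin_theta, N, a, wavelength):
    """N-slit intensity with spacing D = a/N, divided by its peak value N^2."""
    half_alpha = math.pi * (a / N) * sin_theta / wavelength
    if half_alpha == 0.0:
        return 1.0
    return (math.sin(N * half_alpha) / (N * math.sin(half_alpha))) ** 2


A, WAVELENGTH = 3e-6, 0.6e-6   # slit width and wavelength, m (illustrative)

for N in (5, 50, 5000):
    approx = n_slit_normalized(0.13, N, A, WAVELENGTH)
    exact = single_slit_intensity(0.13, A, WAVELENGTH)
    print(f"N = {N:5d}: N-slit = {approx:.6f}, single slit = {exact:.6f}")
```

The first zero of the single-slit pattern falls at sin θ = λ/a, consistent with the angular width noted in the caption of Figure 3.10.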
This single-slit diffraction pattern is exceptionally important. Any optical element—the pupil of your eye, a telescope mirror, a microscope lens, etc—is an aperture, and the I(θ) above describes how light travels through it. Why?
We have been considering light leaving an aperture, i.e., being “transmitted,” and reaching a screen, where it is “received.” But look carefully at Figure 3.5, 3.7, or 3.9—our wave interference scenarios. Our analysis did not invoke at all the direction the waves were traveling, only the path length difference between various paths. So we would get the same interference effects if light were transmitted from a point source at angle θ on the screen, passed through aperture(s), and were detected at the left.
Consider light from a point source (e.g., a star) located at the far-off “screen.” We observe the point source by detecting the intensity passing through a single-slit aperture of width a (e.g., a telescope lens plus an intensity detector). We tilt the barrier containing our slit (e.g., our telescope), so that the angular position of the star of interest is θ_{1} = 0; this angle gives the maximum of the single slit intensity function, and we happily detect light from the star. We tilt the telescope; at the new θ_{2} = 0 there is no star; we see no light. We tilt further; at this third θ_{3} = 0 there is light again. “Aha!,” we say, “We have seen two stars!”
Now suppose there were two stars very close to one another in angular position—let us say the difference in sin θ is just 0.1λ/a. (We typically deal with small angles, by the way, so sin θ = θ.) Since the width of our interference function is ≈ λ/a, no matter how precisely we point at one star, we will be detecting a sizeable fraction of the intensity of the other—there is no way we can tell that we are looking at two stars rather than only one!
The angular limit of resolution, often just referred to as the resolution, of our single-slit aperture—the minimum angular separation that two objects must have in order to be able to distinguish them—is θ_{res} ≈ λ/a, where a is the aperture size. (It is an “approximately equals” sign, because there are slightly different ways of defining criteria for distinguishability that will not concern us here; most commonly, one uses the “Rayleigh criterion” θ_{res} = 1.22 λ/a.) Note that smaller θ_{res} means that we can more finely distinguish objects—we can “see” better—and that this can be achieved by increasing the size of our aperture. This is why one builds big telescopes. (Big telescopes have another, unrelated, advantage: they collect more light.)
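To attach numbers to the resolution limit, the following Python sketch evaluates the Rayleigh criterion θ_{res} = 1.22 λ/a for a few aperture sizes. The aperture values (a 3 mm pupil, a 10 cm telescope, a 1 m telescope), the wavelength of 550 nm, and the function names are illustrative assumptions of our own.

```python
def rayleigh_limit(wavelength, aperture):
    """Angular resolution in radians from the Rayleigh criterion, 1.22 * lambda / a."""
    return 1.22 * wavelength / aperture


def resolvable(separation, wavelength, aperture):
    """True if two point sources with this angular separation can be distinguished."""
    return separation >= rayleigh_limit(wavelength, aperture)


WAVELENGTH = 550e-9  # m (illustrative, mid-visible)

for label, aperture in (("3 mm pupil", 3e-3), ("10 cm telescope", 0.1), ("1 m telescope", 1.0)):
    limit = rayleigh_limit(WAVELENGTH, aperture)
    print(f"{label}: theta_res = {limit:.2e} rad")

# Two stars separated by 0.1 * lambda / a, as in the example in the text.
close_pair = 0.1 * WAVELENGTH / 0.1
print("0.1 lambda/a pair resolved by the 10 cm telescope:",
      resolvable(close_pair, WAVELENGTH, 0.1))
```

A tenfold increase in aperture gives a tenfold improvement in angular resolution, which is the quantitative content of the remark about building big telescopes.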
This issue of diffraction sets the fundamental limit on the performance of telescopes, microscopes, and other optical devices. Regarding microscopy, and all the diverse applications of it described, for example, in this book, the above angular description of resolution together with expressions for the focusing ability of lenses (Section 3.5 and Chapter 1) set a spatial limit for optical resolution. Roughly, objects separated by distance less than Δx ≈ λ cannot be resolved as separate entities. We will revisit the diffraction limit on resolution in Section 3.5.8.
The diffraction of light as it passes through an aperture maps the emission of an ideal pointlike source onto the observed pattern of intensity. For a one-dimensional slit, this mapping is given by the profile shown in Figure 3.10. In microscopy, one is interested in the analogous pattern caused by diffraction through the two-dimensional, typically circular, objective lens. One refers to the resulting intensity profile of a point source as the point spread function (PSF). In other words, the image of a point source (e.g., a fluorescent molecule) will not look like a point, but will look like the PSF. For an ideal aberration-free circular lens of radius a and focal length f (defined in Section 3.5), the PSF is given (Gu 1999) by

$$I(r)={I}_{0}{\left[\frac{2{J}_{1}(v)}{v}\right]}^{2},$$

where

$$v=\frac{2\pi}{\lambda}\frac{a}{f}r,$$

r is the radial distance from the center of the image in the focal plane, and J_{1} is the first-order Bessel function of the first kind.
This function is illustrated in Figure 3.11; note that the width of the intensity peak is roughly λ/2.
Figure 3.11 The PSF for an ideal circular aperture, plotted for a/f = 0.7 and λ = 0.6 μm.
When light travels from one medium to another, it may change direction. This phenomenon—familiar whenever we see the “bent” shape of a straw poking out of a glass of water—is known as refraction. (The light may also change its intensity at a boundary between media, which we will discuss in Section 3.6.) Refraction, like diffraction, is inherently a consequence of the wave nature of light, and our analysis below applies also to waves in water, sound waves, etc.
The basic setup for issues of refraction is shown in Figure 3.12a: a ray of light crosses the boundary between two media, with indices of refraction n _{1} and n _{2}, making angles θ_{1} and θ_{2} with respect to the normal in each medium, respectively. The question is: How are θ_{1} and θ_{2} related? The answer is crucial to the propagation of light and to the design of lenses and other optical elements.
The answer, as we will show, is that light obeys Snell’s law: n _{1} sin θ_{1} = n _{2} sin θ_{2}.
There are several ways to derive this result. We could directly examine the wave equations for electromagnetic fields and look for solutions consistent with the presence of a boundary between two media, but this would be both painful and unilluminating. There are, fortunately, simpler ways of thinking about light propagation.
A general principle describing wave motion was put forth by Pierre de Fermat in the seventeenth century. It is sometimes stated as “light travels from one point to another along the path that takes the minimal amount of time.” This is not quite correct—we will fix it in a few paragraphs—but it is a good place to start. We will also return to justifying Fermat’s principle shortly. First, let us use it to derive Snell’s law.
Imagine you are on a beach, and someone in the ocean is drowning. You rush out to help, which requires both running on land and swimming in the water. You can run faster than you can swim. What path should you take? With a bit of thought, you will realize that a straight line between you and the drowning person is not the best idea—rather, you should reduce the length of the swim to minimize the overall time to your target. How much should you run and how much should you swim?
The same dilemma is encountered by our light beam, traveling from position A in a medium of index of refraction n _{1} (your position on the beach, in the above analogy) to position B in a medium of n _{2}(the swimmer’s position, in the water) in Figure 3.12b. The speed of light in medium 1 is v _{1} = c/n _{1} and in medium 2 is v _{2} = c/n _{2}. Within each medium, the light travels in a straight line—itself a consequence of Fermat’s principle, as you can convince yourself. There are many possible paths between A and B, as illustrated in the figure, that we can label based on the position x at which they cross the interface. One of these—let us call it the one that goes through position x _{0}—minimizes the total travel time. What is this path? (What is x _{0}?)
The travel time in medium 1, t _{1}, is the distance traveled in medium 1 divided by the speed in medium 1: ${t}_{1}=({n}_{1}/c)\sqrt{{y}_{1}^{2}+{x}^{2}}$; similarly, the travel time in medium 2 is ${t}_{2}=({n}_{2}/c)\sqrt{{y}_{2}^{2}+{(L-x)}^{2}}$. The total travel time is t = t _{1} + t _{2}. To find the minimal time, we determine the x for which dt/dx = 0; call this x _{0}:
$\frac{{n}_{1}}{c}\frac{{x}_{0}}{\sqrt{{y}_{1}^{2}+{x}_{0}^{2}}}-\frac{{n}_{2}}{c}\frac{L-{x}_{0}}{\sqrt{{y}_{2}^{2}+{(L-{x}_{0})}^{2}}}=0.$
Figure 3.12 Refraction. (a) The path taken by light traveling between two media bends at the interface; the relation between the angles θ_{1} and θ_{2} depends on the indices of refraction of the two materials, and is given by Snell’s law. (b) Light traveling from point A in medium 1 to point B in medium 2 can follow infinitely many possible paths, four of which are illustrated here. Fermat’s principle states that the actual path the light takes is that for which the total travel time is minimal (actually extremal, as discussed in the text), from which Snell’s law follows.
(To show that x _{0} is a minimum, we should also examine the second derivative of t, but as we will see later, it does not actually matter if x _{0} is the site of a minimum or a maximum. Furthermore, we can intuit from the form of t(x) that this extremum is, in fact, a minimum.)
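Note from geometry that
$\sin{\theta}_{1}=\frac{{x}_{0}}{\sqrt{{y}_{1}^{2}+{x}_{0}^{2}}},\quad \sin{\theta}_{2}=\frac{L-{x}_{0}}{\sqrt{{y}_{2}^{2}+{(L-{x}_{0})}^{2}}}.$
Therefore, the above condition becomes
${n}_{1}\sin{\theta}_{1}={n}_{2}\sin{\theta}_{2}.$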
We have shown that when the above condition is met, the travel time for light propagation is minimized. This is Snell’s law.
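The minimization can also be checked numerically. The sketch below finds the crossing point x_0 where dt/dx = 0 by bisection and then compares the two sides of Snell's law. The geometry (y_1 = y_2 = 1, L = 2) and the air-to-water indices are hypothetical values chosen for illustration, and the function names are not from the text.

```python
import math

def dt_dx(x, y1, y2, L, n1, n2):
    """Derivative of the total travel time with respect to the crossing point x.

    The common factor 1/c is dropped because it does not change the root.
    """
    return n1 * x / math.hypot(y1, x) - n2 * (L - x) / math.hypot(y2, L - x)

def fastest_crossing(y1, y2, L, n1, n2):
    """Bisection for x_0 in [0, L] where dt/dx = 0 (dt/dx increases with x)."""
    lo, hi = 0.0, L
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dt_dx(mid, y1, y2, L, n1, n2) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical geometry: A is 1.0 above the interface, B is 1.0 below it,
# and the two points are separated horizontally by L = 2.0.
y1, y2, L = 1.0, 1.0, 2.0
n1, n2 = 1.0, 1.33                       # air to water
x0 = fastest_crossing(y1, y2, L, n1, n2)

sin_theta1 = x0 / math.hypot(y1, x0)
sin_theta2 = (L - x0) / math.hypot(y2, L - x0)
print("n1 sin(theta1) =", n1 * sin_theta1)
print("n2 sin(theta2) =", n2 * sin_theta2)
```

The two printed values agree to numerical precision, and x_0 lies beyond the midpoint L/2: the path spends more of its length in the faster medium, just as the rescuer on the beach should.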
We can also use Fermat’s principle to derive the law of reflection, which states that the reflected ray makes the same angle with the interface as the incident ray.
Now let us explain Fermat’s principle. Suppose light travels along many paths, all of which interfere with one another. Paths for which the phase difference is near zero will constructively interfere. Consider the minimal time path. As we saw in our derivation of Snell’s law above, this is the path for which Σ n_{i}d_{i} is minimal, where the sum runs over all the segments being considered (two segments in the above example), and d_{i} is the length of segment i. From Section 3.2, minimizing Σ n_{i}d_{i} is equivalent to minimizing the phase traversed by the wave along the path. Therefore, the minimal time path is also the path of minimal phase and is also the path of minimal Σ n_{i}d_{i}; all these statements are equivalent. This sum Σ n_{i}d_{i} is more properly written as an integral and is called the optical path length (OPL): $\text{OPL}={\int}_{A}^{B}n(x)\,\text{d}x$.
Why should the path of minimal OPL be the path light takes? Let us call this path P. By construction, ${\left.\frac{\text{d}(\text{OPL})}{\text{d}s}\right|}_{P}$ is zero, where s indicates any variable that characterizes the paths. Therefore, nearby paths are similar in phase and so constructively interfere. Consider a path for which $\frac{\text{d}(\text{OPL})}{\text{d}s}$ is not zero: moving to a slightly different path, the OPL can change appreciably, perhaps higher in one direction, lower in another, etc., and so we would not expect constructive interference.
You may be thinking: the minimal OPL path is not the only one for which we can guarantee constructive interference. What about the maximal OPL path? This too provides constructive interference. And so the proper formulation of Fermat’s principle is that light travels along paths of extremal optical path length. Typically, these are minimal OPL paths; but, in certain geometries, they can be maximal OPL paths as well.
Let us look more carefully at Snell’s law. What if n _{1} > n _{2}, and θ_{1} is large, so that (n _{1}/n _{2})sin θ_{1} > 1? What θ_{2} can satisfy Snell’s law: n _{1} sin θ_{1} = n _{2} sin θ_{2}? None. This means that there can be no wave transmitted to medium 2; the light from medium 1 is totally reflected at the interface, a condition known as total internal reflection.
Fiber optics, which underpin much of modern communication, work because of total internal reflection. Consider a glass fiber (n = 1.5) surrounded by air (n = 1.0). We want the light to travel along the fiber and not leak out into the air. This is automatically enforced by total internal reflection due to the higher index of refraction of the glass and the large incident angles of light traveling along the fiber (as long as the fiber is not severely bent). In a fiber optic cable, light can propagate for kilometers with losses of a fraction of a percent!
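The onset of total internal reflection follows directly from Snell's law: it begins at the incidence angle for which (n_1/n_2) sin θ_1 = 1. The sketch below computes this critical angle for the glass-in-air fiber described above. The function names are illustrative, and returning None to signal "no transmitted ray" is a design choice of this example.

```python
import math

def critical_angle(n1, n2):
    """Smallest incidence angle (radians) that gives total internal reflection.

    Exists only when light goes from higher to lower index (n1 > n2);
    returns None otherwise.
    """
    if n1 <= n2:
        return None
    return math.asin(n2 / n1)

def refraction_angle(theta1, n1, n2):
    """Transmitted angle from Snell's law, or None if no real solution exists."""
    s = (n1 / n2) * math.sin(theta1)
    if s > 1.0:
        return None                     # total internal reflection
    return math.asin(s)

theta_c = critical_angle(1.5, 1.0)      # glass (n = 1.5) surrounded by air (n = 1.0)
print(f"critical angle = {math.degrees(theta_c):.1f} degrees")

for degrees in (30, 60):
    theta2 = refraction_angle(math.radians(degrees), 1.5, 1.0)
    label = "totally reflected" if theta2 is None else f"{math.degrees(theta2):.1f} degrees"
    print(f"incidence {degrees} degrees -> {label}")
```

For these indices the critical angle is about 41.8°, so rays striking the fiber wall at larger angles from the normal, as guided rays do, cannot escape into the air.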
A careful treatment of electromagnetic fields would show that the light intensity in medium 2 is not exactly zero. Rather, an “evanescent wave” decays to zero over a short distance, comparable to λ, in medium 2. This attribute finds applications in biophysical imaging. In total internal reflection fluorescence microscopy, excitation of fluorescent probes by the evanescent wave allows discrimination of molecules near the interface despite the presence of large concentrations of probes in the bulk (Axelrod et al. 1984, Axelrod 2001, Groves et al. 2009).
We often wish to collect and reshape electromagnetic wavefronts to create images of objects. Lenses are powerful tools for achieving these goals and are obviously very useful, forming the essential imaging elements of telescopes, microscopes, cameras, your eyes, and many other devices. The “ideal” shape of a lens surface is generally some non-spherical conic section (hyperbola, parabola, etc.), but, in practice, spherical lenses are typically used, since they are much easier to make than aspheric (non-spherical) lenses. Typically, one uses spherical lenses and then corrects for their aberrations (nonideal behavior), e.g., by using combinations of lenses. We will briefly explore lenses.
Consider a point source emitting spherical waves from point S, in a medium of index of refraction n _{1} (see Figure 3.13). Can we construct a spherical interface of radius R that focuses the emitted light to point P, regardless of where it hits the interface? What should R be? Point P is embedded in a medium of index of refraction n _{2}; we are considering the shape of the interface between media 1 and 2. Consider n _{2} > n _{1}, so that the rays from S will be refracted “inwards.”
Figure 3.13 A spherical interface. C, S, A, and P refer to particular points—the center of the spherical interface, the object point, the point at which the ray drawn hits the interface, and the image point, respectively. Italicized letters refer to distances. Greek letters refer to angles—note that α = ∠ASC and β = ∠CPA.
In Figure 3.13, Point C is the center of the sphere of radius R. The distance between the “object” point, S, and the interface is s _{o}, and the distance between the “image” point, P, and the interface is s _{i}. The angles that the incident and refracted rays make with respect to the normal to the interface are θ_{1} and θ_{2}. As usual, θ_{1} and θ_{2} are related by Snell’s law: n _{1} sin θ_{1} = n _{2} sin θ_{2}. We can relate θ_{2} to β via the law of sines: (sin β/R) = (sin θ_{2}/(s _{i}−R)). Relating θ_{1} to α is not quite as transparent; first note that ∠SAC = π − θ_{1}, so sin (∠SAC) = sin(π − θ_{1}) = sin θ_{1}, and then apply the law of sines to ΔSAC to get (sin α/R) = (sin θ_{1}/(R + s _{o})). Inserting all this into Snell’s law:
${n}_{1}(R+{s}_{\text{o}})\sin\alpha={n}_{2}({s}_{\text{i}}-R)\sin\beta.$
More geometry: $\sin\alpha=\frac{y}{{l}_{\text{o}}}=\frac{y}{\sqrt{{s}_{\text{o}}^{2}+{y}^{2}}}$ and $\sin\beta=\frac{y}{{l}_{\text{i}}}=\frac{y}{\sqrt{{s}_{\text{i}}^{2}+{y}^{2}}}$, from which
${n}_{1}(R+{s}_{\text{o}}){s}_{\text{i}}\sqrt{1+{\left(\frac{y}{{s}_{\text{i}}}\right)}^{2}}={n}_{2}({s}_{\text{i}}-R){s}_{\text{o}}\sqrt{1+{\left(\frac{y}{{s}_{\text{o}}}\right)}^{2}}.$
We have derived a relation that must hold for focusing at P to occur. In other words, we know what R we need—the R that satisfies the above expression. Unfortunately, it depends on y, the position at which our ray hits the interface! Therefore, different rays will not focus to the same image spot.
What we have shown is that a truly spherical interface will not serve as an ideal lens. There is a way out of this, however, which is to limit ourselves to the paraxial regime, meaning that we consider only light that is nearly parallel with the optical axis, SP. In other words, we consider small α and β. Therefore, y/s_{o} and y/s_{i} are small, allowing us to neglect them in the above equation: n _{1}(R + s _{ o }) s_{i} ≈ n _{2} (s _{i} − R) s _{o}, from which (n _{1}/s _{o}) + (n _{2}/s _{i}) = (n _{2} − n _{1})/R. A simple, useful relation! (By the way, we could also have derived this directly from Fermat’s principle, by determining the R for which SAP is an extremal path for any A.)
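To see how quickly the paraxial approximation breaks down, one can solve the y-dependent relation for s_i at several ray heights and compare with the paraxial result. The sketch below does this by bisection. The numbers (air to glass, R = 50 mm, s_o = 200 mm) are hypothetical, the function names are illustrative, and the solver assumes s_i > 0 with a root between R and a large upper bound, which holds for these values.

```python
import math

def paraxial_image_distance(s_o, R, n1, n2):
    """Image distance from n1/s_o + n2/s_i = (n2 - n1)/R."""
    return n2 / ((n2 - n1) / R - n1 / s_o)

def exact_image_distance(s_o, R, n1, n2, y):
    """Solve the y-dependent focusing relation for s_i by bisection.

    For s_i > 0 the relation is equivalent to g(s_i) = 0 with
    g = n1 (R + s_o) sqrt(s_i^2 + y^2) - n2 (s_i - R) sqrt(s_o^2 + y^2).
    Assumes g changes sign between s_i = R and s_i = 1e6.
    """
    def g(s_i):
        return (n1 * (R + s_o) * math.hypot(s_i, y)
                - n2 * (s_i - R) * math.hypot(s_o, y))

    lo, hi = R, 1.0e6                   # g(lo) > 0 and g(hi) < 0 for the values below
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: air (n1 = 1) to glass (n2 = 1.5), lengths in mm.
n1, n2, R, s_o = 1.0, 1.5, 50.0, 200.0
print("paraxial s_i:", paraxial_image_distance(s_o, R, n1, n2))
for y in (0.0, 5.0, 10.0, 20.0):
    print(f"y = {y:4.1f} mm -> s_i = {exact_image_distance(s_o, R, n1, n2, y):.2f} mm")
```

For these values the paraxial image distance is 300 mm. Rays that strike the interface farther from the axis come to a focus closer to the interface, which is the y dependence noted above.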
Should we be bothered by limiting ourselves to the paraxial case? Yes and no. In practice one does try to design optical systems such that beams are close to the center of spherical lens elements or, equivalently, to have one’s image and object distances be large compared to the size of the lens. If one does this, the above relation works very well. In practice, one works in the paraxial regime and applies additional corrections if necessary. We will continue limiting ourselves to the paraxial regime.
If R, n _{1}, and n _{2} are fixed, decreasing s _{o} means that s _{i} increases (and vice versa), from the above boxed relation. Let us increase s _{i} until s _{i} → ∞; in other words, parallel rays emerge from the interface; what is s _{o}? From above: (n _{1}/s _{o}) + (n _{2}/∞) = (n _{2}−n _{1})/R, therefore s _{o} = (n _{1}/(n _{2}−n _{1}))R; an object at this distance focuses “to infinity.” We will call this distance the object focal length, f _{o} ≡ (n _{1}/(n _{2}−n _{1}))R. The spherical waves from the point source turn into plane waves.
The same holds if we do not consider a “semi-infinite” medium on the right, but rather a finite lens with a spherical surface at the left and a flat surface at the right—a planoconvex lens (see Figure 3.14). Note that since the right edge is flat, all rays are normal to it, and there is no “bending” of the rays due to refraction.
We can of course consider the opposite situation, in which plane waves (parallel rays from s _{o} = ∞) are focused to an image at some s _{i}. This particular s _{i} is denoted f _{i}, the image focal length. From the relation above, f _{i} = (n _{2}/(n _{2}−n _{1}))R.
Figure 3.14 A planoconvex lens. Light emanating from a source located at the object focal length is focused to an image distance of infinity (i.e., the rays become parallel).
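Solving the lens equation above for s _{i}, and using the definition of f _{o}, we have
${s}_{\text{i}}=\frac{{n}_{2}}{{n}_{1}}\frac{{f}_{\text{o}}{s}_{\text{o}}}{{s}_{\text{o}}-{f}_{\text{o}}}.$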
If s _{o} > f _{o}, then s _{i} > 0, and point P is to the right of the interface. The rays from S converge at P. To an observer at the right, it looks as if light is emanating from point P. We have what is called a real image at P (see Figure 3.15a). If, for example, we put a power meter at P, we detect a high degree of power due to the focused light.
If s _{o} < f _{o}, then s _{i} < 0, and point P is to the left of the interface. The rays do not actually hit point P, but they appear to an observer at the right as if they are emanating from P (see Figure 3.15b). We have what is called a virtual image at P. If, for example, we put a power meter at P, we do not detect a high-intensity focused spot, since there is no “spot” there.
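The distinction between real and virtual images can be illustrated with a few numbers. The sketch below evaluates the paraxial relation n_1/s_o + n_2/s_i = (n_2 − n_1)/R for an object outside and inside the object focal length. The values (air to glass, R = 50 mm) are hypothetical and the function names are illustrative.

```python
import math

def object_focal_length(R, n1, n2):
    """f_o = n1 R / (n2 - n1), the object distance that sends the image to infinity."""
    return n1 * R / (n2 - n1)

def image_distance(s_o, R, n1, n2):
    """Paraxial s_i for a single spherical interface; math.inf if rays emerge parallel."""
    rhs = (n2 - n1) / R - n1 / s_o
    if rhs == 0.0:
        return math.inf
    return n2 / rhs

def classify(s_i):
    """Label the image using the sign convention in the text."""
    if math.isinf(s_i):
        return "at infinity"
    return "real" if s_i > 0 else "virtual"

# Hypothetical example: air (n1 = 1) to glass (n2 = 1.5), R = 50 mm.
n1, n2, R = 1.0, 1.5, 50.0
print("f_o =", object_focal_length(R, n1, n2), "mm")
for s_o in (200.0, 50.0):
    s_i = image_distance(s_o, R, n1, n2)
    print(f"s_o = {s_o:.0f} mm -> s_i = {s_i:.1f} mm ({classify(s_i)})")
```

With f_o = 100 mm, the object at 200 mm gives s_i = 300 mm (real), while the object at 50 mm gives s_i = −150 mm (virtual).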
The same analysis works for concave lenses, but we treat R as negative (R < 0). Since
$\frac{{n}_{1}}{{s}_{\text{o}}}+\frac{{n}_{2}}{{s}_{\text{i}}}=\frac{{n}_{2}-{n}_{1}}{R},$
if n _{2} > n _{1} then s _{i} < 0, and we have a virtual image.
Let us glue one lens of radius of curvature R _{1} onto another of R _{2} (see Figure 3.16). We will consider thin lenses and so neglect the lens thickness d (i.e., we are assuming d is smaller than other lengths involved).
Figure 3.15 (a) Real and (b) virtual images. (a) Light emanates from P. (b) Light looks to an observer like it is emanating from point P located to the left of the interface.
Figure 3.16 Focusing light with a thin lens (imagine d is small).
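The object and image lengths for “lens 1” (the left half of the lens) are related by
$\frac{{n}_{1}}{{s}_{\text{o1}}}+\frac{{n}_{2}}{{s}_{\text{i1}}}=\frac{{n}_{2}-{n}_{1}}{{R}_{1}}.$
The image of lens 1 provides the “object” for lens 2. Therefore, s _{o2} = −s _{i1} + d ≈ −s _{i1}, where the negative sign arises because, as defined above, a positive image length and a positive object length lie in opposite directions. Considering lens 2:
$\frac{{n}_{2}}{{s}_{\text{o2}}}+\frac{{n}_{1}}{{s}_{\text{i2}}}=\frac{{n}_{1}-{n}_{2}}{{R}_{2}},\quad \text{i.e.,}\quad -\frac{{n}_{2}}{{s}_{\text{i1}}}+\frac{{n}_{1}}{{s}_{\text{i2}}}=\frac{{n}_{1}-{n}_{2}}{{R}_{2}},$
where we keep track of which index of refraction is which.
We need to adopt a consistent set of sign conventions for the radii. As noted above, a convex “left” lens has R > 0, and a concave “left” lens has R < 0. For the right side lens, these are switched. Returning to our thin lens, adding the two expressions above:
$\frac{{n}_{1}}{{s}_{\text{o1}}}+\frac{{n}_{1}}{{s}_{\text{i2}}}=({n}_{2}-{n}_{1})\left(\frac{1}{{R}_{1}}-\frac{1}{{R}_{2}}\right).$
For a thin lens in air, n _{1} ≈ 1; n _{2} = n _{lens}, giving us the thin lens equation, or Lensmaker’s formula (writing s _{o} for s _{o1} and s _{i} for s _{i2}):
$\frac{1}{{s}_{\text{o}}}+\frac{1}{{s}_{\text{i}}}=({n}_{\text{lens}}-1)\left(\frac{1}{{R}_{1}}-\frac{1}{{R}_{2}}\right).$
The focal length, f, is given either by s _{o} or s _{i} → ∞ (it does not matter which):
$\frac{1}{f}=({n}_{\text{lens}}-1)\left(\frac{1}{{R}_{1}}-\frac{1}{{R}_{2}}\right).$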
We can then write the thin lens equation as (1/s _{o}) + (1/s _{i}) = 1/f, also known as the Gaussian Lens Formula. This is one of the most important relations for the design of optical systems.
For example: Consider parallel rays incident on the flat side of a glass (n = 1.5) planoconvex lens with a radius of curvature of 50 mm. Where will these rays be focused to? Answer: R _{1} = ∞, R _{2} = −50 mm. 1/f = (1.5 − 1) (1/50 mm), so f = 100 mm, s _{i} = 100 mm. The rays will focus to a point 100 mm beyond the curved side of the lens.
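The worked example can be reproduced in a few lines. The sketch below assumes the standard thin-lens form 1/f = (n_lens − 1)(1/R_1 − 1/R_2) for a lens in air, with the sign convention stated above and a flat surface represented by an infinite radius. The function names are illustrative.

```python
import math

def lensmaker_focal_length(n_lens, R1, R2):
    """Thin lens in air: 1/f = (n_lens - 1) (1/R1 - 1/R2).

    Pass math.inf for a flat surface, since 1/inf evaluates to 0.
    """
    inv_f = (n_lens - 1.0) * (1.0 / R1 - 1.0 / R2)
    return math.inf if inv_f == 0.0 else 1.0 / inv_f

def thin_lens_image_distance(s_o, f):
    """Gaussian lens formula 1/s_o + 1/s_i = 1/f, solved for s_i."""
    inv_s_i = 1.0 / f - 1.0 / s_o
    return math.inf if inv_s_i == 0.0 else 1.0 / inv_s_i

# Planoconvex glass lens from the example: flat side first, R2 = -50 mm.
f = lensmaker_focal_length(1.5, math.inf, -50.0)
s_i = thin_lens_image_distance(math.inf, f)     # parallel incoming rays
print(f"f = {f:.1f} mm, s_i = {s_i:.1f} mm")
```

This reproduces f = 100 mm and s_i = 100 mm. Reversing the lens (R_1 = 50 mm, R_2 = ∞) gives the same focal length in the thin-lens limit.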
Lenses magnify objects. The magnification can be >1 or <1. See Figure 3.17 depicting a thin lens, which is magnifying an extended object (i.e., not a point source)—in this case, a pear.
Figure 3.17 Magnification of an image by a lens. Rays emanating from point S1 on the optical axis are focused to point P1 (not drawn). Rays from S2 are focused to P2.
Points F_{o} and F _{i} are each a distance f, the focal length, from the lens. Consider light emanating from the top of the pear. The ray that goes through F _{o} will emerge from the lens parallel to the axis (think about why this is). The ray that leaves the pear parallel to the axis will go through F _{i}. The ray that goes through the center of the lens will be undeflected in the thin lens limit (Hecht 2002).
The magnification, M _{T}, is defined to be the height of the image relative to the height of the object, i.e., M _{T} ≡ y _{i}/y _{o}. Triangle S_{1}S_{2}O is similar to triangle P_{1}P_{2}O, so y _{o}/s _{o} = −y _{i}/s _{i}, so M _{T} = −s _{i}/s _{o}; the negative sign shows that the image is inverted.
Triangle AOF_{i} is similar to triangle P_{1}P_{2}F_{i}, so y _{o}/f = |y _{i}|/(s _{i}−f).
Triangle BOF_{o} is similar to triangle S_{1}S_{2}F_{o}, so y _{o}/(s _{o} − f) = |y _{i}|/f. Combining these, (f/(s _{o} − f)) = ((s _{i} − f)/f). Note that s _{i} − f = x _{i} and, similarly, s _{o} − f = x _{o} (see figure). Using the last similar triangle relation again, (f/(s _{o} − f)) = (−y _{i}/y _{o}). And so, M _{T} = (y _{i}/y _{o}) = −(x _{i}/f). We could also have written: M _{T} = −(f/x _{o}).
As the object distance x _{o} is lowered, the magnitude of the magnification increases. The reader may think about what happens if x _{o} < 0, i.e., the object is closer than the focal point. Drawing rays, convince yourself that the lens cannot form a real image of the object.
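The equivalence of the expressions for M_T, and the growth of the magnification as the object approaches the focal point, can be checked numerically. The sketch below assumes a thin lens with a hypothetical focal length of 100 mm and takes x_o = s_o − f as the object distance measured from the focal point. The function names are illustrative.

```python
def thin_lens_image_distance(s_o, f):
    """Gaussian lens formula 1/s_o + 1/s_i = 1/f, solved for s_i (requires s_o != f)."""
    return 1.0 / (1.0 / f - 1.0 / s_o)

def transverse_magnification(s_o, f):
    """M_T = -s_i / s_o for a thin lens."""
    return -thin_lens_image_distance(s_o, f) / s_o

f = 100.0                                   # hypothetical focal length in mm
for s_o in (400.0, 200.0, 150.0, 120.0):
    x_o = s_o - f                           # object distance measured from F_o
    m = transverse_magnification(s_o, f)
    print(f"s_o = {s_o:5.0f} mm  M_T = {m:6.2f}  -f/x_o = {-f / x_o:6.2f}")

# Object inside the focal length: s_i comes out negative, i.e., a virtual image.
print("s_o = 50 mm -> s_i =", thin_lens_image_distance(50.0, f), "mm")
```

The two columns agree, and |M_T| grows as x_o shrinks. For s_o = 50 mm the image distance is negative, which in the convention used earlier corresponds to a virtual image.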
Another important aspect of lenses that follows from the diagrams above, and for which it is useful to develop an intuition: Parallel rays that are also parallel to the optical axis are focused to the focal point on the optical axis. Parallel rays that are tilted with respect to the optical axis are also focused to a point at distance f, but off the optical axis; such points define the focal plane. The reader may wish to draw such rays.
When considering single-slit diffraction in Section 3.3, we realized that the angular resolution of a device is given by θ_{min} ≃ λ/a, where λ is the wavelength of light and a is the diameter of the imaging aperture. Two objects must have an angular separation of at least θ_{min} if they are to be resolved as separate objects. Using lenses to magnify objects, this angular resolution criterion still holds. Moreover, the fact that the object distance cannot be closer than the focal length turns our resolution relation into a distance criterion. We will briefly sketch this:
Consider two objects separated in position by Δy at a distance s from a lens (see Figure 3.18). For the two to be resolvable, we need θ > λ/a, where θ ≃ Δy/s. Therefore, we need Δy > sλ/a. Since s>f, f = (n _{1}/(n _{2} − n _{1}))R, and R > a, we can write s > (n _{1}/(n _{2} − n _{1}))a. Combining the two inequalities: Δy > (n _{1}/(n _{2} − n _{1}))λ. The numerical factor n _{1}/(n _{2} − n _{1}) ≈ 1 in our rough treatment. Therefore, our minimum resolvable spatial separation is Δy _{min} ≈ λ. We cannot resolve objects smaller than (approximately) the wavelength of light.
More precise statements of optical resolvability can be constructed. Typically, one invokes the Abbe criterion that the minimal Δy = λ/(2nsinθ), where n is the index of refraction of the medium and θ is the maximum angle over which light is collected. The value of sin θ is bounded by 1, so at best Δy = λ/(2n). For λ = 400 nm (blue) light in water (n = 1.3), the theoretical resolution limit is about 150 nm. (In practice, any aberrations or imperfections further reduce the resolution.) Hence, the wavelengths of visible light set a limit of roughly a few hundred nanometers as the minimal size of resolvable structures—smaller, for example, than cells but far larger than the characteristic sizes of proteins or small molecules.
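The Abbe estimate quoted above is a one-line calculation. The sketch below evaluates Δy = λ/(2n sin θ) for the best case (sin θ = 1) in water and, for comparison, for a hypothetical collection half-angle of 60°. The function name is illustrative.

```python
import math

def abbe_limit(wavelength, n, theta_max):
    """Minimum resolvable separation, wavelength / (2 n sin(theta_max))."""
    return wavelength / (2.0 * n * math.sin(theta_max))

# Blue light (400 nm) in water (n = 1.3), best case sin(theta) = 1.
best_case = abbe_limit(400.0, 1.3, math.pi / 2)
print(f"best case: {best_case:.0f} nm")

# Hypothetical collection half-angle of 60 degrees, for comparison.
print(f"60 degrees: {abbe_limit(400.0, 1.3, math.radians(60.0)):.0f} nm")
```

The best case evaluates to roughly 154 nm, consistent with the figure of about 150 nm given above. A smaller collection angle gives a larger minimum separation.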
It is important to keep in mind that the issues of resolution discussed above govern the discrimination of two (or more) objects. If one knows that only a single point source contributes to an image, giving an intensity profile like that of Figure 3.11, for example, the center of this profile can be determined to arbitrarily high precision (in practice, a few nanometers typically). A few of the many applications of this principle are illustrated in (Crocker and Grier 1996, Weihs et al. 2006, Crocker and Hoffman 2007, Roichman et al. 2008, Kong and Parthasarathy 2009).
Figure 3.18 Schematic, for considering spatial resolution.
Despite the fundamental nature of the diffraction-limited resolution, the past decade or so has seen the birth of several very clever techniques for surmounting it, using interferometry, nonlinear optical processes, or single-molecule imaging (e.g., Betzig et al. 2006, Bates et al. 2007, Hell 2007, Abbott 2009) to yield optical information at scales an order of magnitude smaller than what was traditionally thought possible.
The law of reflection (θ_{r} = θ_{i}, where r and i refer to reflected and incident rays; Figure 3.19a) and Snell’s law (n _{i} sin θ_{i} = n _{t} sin θ_{t}, where t refers to the transmitted ray) give the directions of reflected and transmitted rays at boundaries. What are the amplitudes of the electromagnetic waves? In other words, how much light is reflected and transmitted? Similar questions arise when considering other sorts of waves hitting boundaries—for example, waves on strings, incident at an interface between two media with different propagation speeds. In all these situations, transmission and reflection are analyzed by considering the boundary conditions imposed by the junction.
To consider the general case of a plane electromagnetic wave hitting a surface at some angle θ_{i} (with respect to the normal), we will have to separately consider the components with electric field perpendicular and parallel to the plane of incidence. (The incident, reflected, and transmitted rays all lie in and define the plane of incidence, “POI,” which also includes the normal to the surface.)
Recall from Section 3.2.4.2 some properties of electromagnetic waves:
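The boundary conditions that govern electric and magnetic fields at the interface between media include: (i) the component of $\overrightarrow{E}$ tangential to the interface is continuous across the interface, and (ii) the tangential component of $\overrightarrow{B}/\mu$ is continuous across the interface, where μ is the magnetic permeability of each medium.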
Let us consider the two cases.
In the first case, the electric field is perpendicular to the plane of incidence. Note that a circle with a dot in it indicates a vector that points out of the page towards you (see Figure 3.19b). The electric field vectors are completely tangential to the interface. The magnetic field vectors are not. Applying boundary condition (i) to the amplitudes (E _{0}) of the electric fields,
${E}_{0\text{i}}+{E}_{0\text{r}}={E}_{0\text{t}}.$
Applying boundary condition (ii) to the amplitudes (B _{0}) of the magnetic fields,
$-\frac{{B}_{0\text{i}}}{{\mu}_{\text{i}}}\cos{\theta}_{\text{i}}+\frac{{B}_{0\text{r}}}{{\mu}_{\text{i}}}\cos{\theta}_{\text{r}}=-\frac{{B}_{0\text{t}}}{{\mu}_{\text{t}}}\cos{\theta}_{\text{t}}$
(see Figure 3.19 to understand the signs).
Using B _{0} = E _{0}/v (from above), v _{i} = v _{r} (since they are in the same medium), θ_{i} = θ_{r} (law of reflection), and v _{i} = c/n _{i}, we can write the above relation as
$\frac{{n}_{\text{i}}}{{\mu}_{\text{i}}}({E}_{0\text{i}}-{E}_{0\text{r}})\cos{\theta}_{\text{i}}=\frac{{n}_{\text{t}}}{{\mu}_{\text{t}}}{E}_{0\text{t}}\cos{\theta}_{\text{t}}.$
Combining this with the boundary condition (i) equation above, substituting to eliminate E_{0t}, we can solve for the ratio of the reflected wave amplitude to the incident wave amplitude:
$\frac{{E}_{0\text{r}}}{{E}_{0\text{i}}}=\frac{({n}_{\text{i}}/{\mu}_{\text{i}})\cos{\theta}_{\text{i}}-({n}_{\text{t}}/{\mu}_{\text{t}})\cos{\theta}_{\text{t}}}{({n}_{\text{i}}/{\mu}_{\text{i}})\cos{\theta}_{\text{i}}+({n}_{\text{t}}/{\mu}_{\text{t}})\cos{\theta}_{\text{t}}}.$
Figure 3.19 Reflection and refraction at an interface. The incident wave (wavevector $\overrightarrow{k}_{\text{i}}$) is reflected (wavevector $\overrightarrow{k}_{\text{r}}$) and transmitted (wavevector $\overrightarrow{k}_{\text{t}}$). Both the angles of the reflected and transmitted waves and their amplitudes are determined by the dielectric properties of the materials that comprise the interface. (a) Electric and magnetic field vectors for light polarized with $\overrightarrow{E}$ perpendicular to the plane of incidence. (b) Electric and magnetic field vectors for light polarized with $\overrightarrow{E}$ parallel to the plane of incidence.
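Similarly solving instead for the ratio of the transmitted wave amplitude to the incident wave amplitude,
$\frac{{E}_{0\text{t}}}{{E}_{0\text{i}}}=\frac{2({n}_{\text{i}}/{\mu}_{\text{i}})\cos{\theta}_{\text{i}}}{({n}_{\text{i}}/{\mu}_{\text{i}})\cos{\theta}_{\text{i}}+({n}_{\text{t}}/{\mu}_{\text{t}})\cos{\theta}_{\text{t}}}.$
Typically, one deals with nonmagnetic materials: μ ≈ μ_{0}, the permeability of free space. The above equations simplify, yielding two of the four Fresnel equations, for the amplitude reflection coefficient, r _{⊥}, and the amplitude transmission coefficient, t _{⊥}:
${r}_{\perp}=\frac{{n}_{\text{i}}\cos{\theta}_{\text{i}}-{n}_{\text{t}}\cos{\theta}_{\text{t}}}{{n}_{\text{i}}\cos{\theta}_{\text{i}}+{n}_{\text{t}}\cos{\theta}_{\text{t}}},\quad {t}_{\perp}=\frac{2{n}_{\text{i}}\cos{\theta}_{\text{i}}}{{n}_{\text{i}}\cos{\theta}_{\text{i}}+{n}_{\text{t}}\cos{\theta}_{\text{t}}}.$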
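In the second case, the electric field is parallel to the plane of incidence. Applying the boundary conditions to this geometry leads to (see Figure 3.19c):
$\frac{{E}_{0\text{r}}}{{E}_{0\text{i}}}=\frac{({n}_{\text{t}}/{\mu}_{\text{t}})\cos{\theta}_{\text{i}}-({n}_{\text{i}}/{\mu}_{\text{i}})\cos{\theta}_{\text{t}}}{({n}_{\text{i}}/{\mu}_{\text{i}})\cos{\theta}_{\text{t}}+({n}_{\text{t}}/{\mu}_{\text{t}})\cos{\theta}_{\text{i}}}$
and
$\frac{{E}_{0\text{t}}}{{E}_{0\text{i}}}=\frac{2({n}_{\text{i}}/{\mu}_{\text{i}})\cos{\theta}_{\text{i}}}{({n}_{\text{i}}/{\mu}_{\text{i}})\cos{\theta}_{\text{t}}+({n}_{\text{t}}/{\mu}_{\text{t}})\cos{\theta}_{\text{i}}}.$
For typical nonmagnetic media, we get the other two Fresnel equations:
${r}_{||}=\frac{{n}_{\text{t}}\cos{\theta}_{\text{i}}-{n}_{\text{i}}\cos{\theta}_{\text{t}}}{{n}_{\text{i}}\cos{\theta}_{\text{t}}+{n}_{\text{t}}\cos{\theta}_{\text{i}}},\quad {t}_{||}=\frac{2{n}_{\text{i}}\cos{\theta}_{\text{i}}}{{n}_{\text{i}}\cos{\theta}_{\text{t}}+{n}_{\text{t}}\cos{\theta}_{\text{i}}}.$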
Let us plot r _{⊥} and r _{||} as a function of θ_{i} for light incident from air (n _{i} = 1) to water (n _{t} = 1.33); see Figure 3.20. We notice something very interesting: a particular θ_{i} for which the reflection coefficient is zero for light with its electric field parallel to the plane of incidence. There is no such angle for the perpendicular polarization.
Figure 3.20 Fresnel coefficients r _{⊥} and r _{||} for light incident from air to water, as a function of incidence angle.
The reader can work through the algebra and show that if n _{t}>n _{i}, r _{||} = 0 at one particular incident angle, θ_{i}. This angle is called Brewster’s angle, θ_{p}, and is given by tan θ_{p} = n _{t}/n _{i}. All the parallel-polarized light is transmitted. What about the perpendicular polarization? One can show that there is no angle that gives r_{⊥} = 0. Therefore, shining randomly polarized light incident at the Brewster angle, the reflected light is completely polarized with its electric field perpendicular to the plane of incidence. This property, together with a desk lamp, a sink, and a large bowl, once saved your author from misfortune, as he found himself with a much-needed linear polarizer whose transmission axis was unlabeled. The reader can test his or her mystery-solving skills by figuring out how he determined the polarizer’s axis and thereby saved the day.
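The vanishing of r_|| at Brewster's angle is easy to confirm numerically. The sketch below evaluates the amplitude reflection coefficients for nonmagnetic media, air to water. It assumes the common textbook sign convention in which r_|| = (n_t cos θ_i − n_i cos θ_t)/(n_i cos θ_t + n_t cos θ_i); other conventions flip the sign of r_|| but not the location of its zero. The function names are illustrative, and the code assumes n_i < n_t so that a transmitted angle always exists.

```python
import math

def transmitted_angle(theta_i, n_i, n_t):
    """Snell's law solved for the transmitted angle (assumes n_i < n_t)."""
    return math.asin(n_i * math.sin(theta_i) / n_t)

def r_perp(theta_i, n_i, n_t):
    """Amplitude reflection coefficient, E perpendicular to the plane of incidence."""
    theta_t = transmitted_angle(theta_i, n_i, n_t)
    a = n_i * math.cos(theta_i)
    b = n_t * math.cos(theta_t)
    return (a - b) / (a + b)

def r_par(theta_i, n_i, n_t):
    """Amplitude reflection coefficient, E parallel to the plane of incidence.

    Sign convention is an assumption of this sketch (see the lead-in).
    """
    theta_t = transmitted_angle(theta_i, n_i, n_t)
    a = n_t * math.cos(theta_i)
    b = n_i * math.cos(theta_t)
    return (a - b) / (a + b)

def brewster_angle(n_i, n_t):
    """tan(theta_p) = n_t / n_i."""
    return math.atan(n_t / n_i)

n_i, n_t = 1.0, 1.33                        # air to water
theta_p = brewster_angle(n_i, n_t)
print(f"Brewster angle = {math.degrees(theta_p):.2f} degrees")
print("r_par  at Brewster angle:", r_par(theta_p, n_i, n_t))
print("r_perp at Brewster angle:", r_perp(theta_p, n_i, n_t))
```

For air to water the Brewster angle is about 53.06°. There r_|| is zero to rounding error while r_⊥ is about −0.28, so the reflected light is polarized perpendicular to the plane of incidence.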
The polarization-dependence of reflection at interfaces also underpins Brewster angle microscopy, in which an interface is imaged with parallel-polarized light incident at the Brewster angle of the two media. The presence of interfacial molecules distinct from those of the two media—for example, lipids organized at an air-water interface—alters the local index of refraction, leading to a nonzero reflection coefficient. The intensity of the reflected light therefore provides a sensitive measure of interfacial molecular organization.
In the preceding pages, we have explored the basic elements of optics. All of these topics can be explored much further, uncovering still more depth and beauty than we have been able to sketch, and also illuminating applications of great importance to science and technology. As will be evident throughout this book, these two aspects of optics—its formal elegance and its practical utility—are intertwined. Demands from fields as diverse as biological imaging, astronomy, and telecommunications drive the search for deeper insights into the behavior of light. Conversely, explorations of the intricacies of electromagnetic wave propagation have yielded, and will undoubtedly continue to yield, remarkable new tools.